Deep Learning: RNNs

5th Grade

6 Qs

Similar activities

Logic Gates #1 (KG - 12th Grade, 10 Qs)
Computer Hardware - Logic Gates (5th - 12th Grade, 10 Qs)
Truth Tables (5th - 12th Grade, 10 Qs)
codes (KG - University, 8 Qs)
Digital Logic - simple (1st - 12th Grade, 10 Qs)
Small Basic (Turtle) (KG - University, 10 Qs)
Understanding NLP and Transformers (5th Grade, 9 Qs)
Computer Hardware - Truth Tables (5th - 12th Grade, 6 Qs)

Deep Learning: RNNs

Assessment

Quiz

Computers

5th Grade

Medium

Created by Josiah Wang

6 questions

1.

MULTIPLE SELECT QUESTION

1 min • 1 pt

Which of the following might help prevent vanishing gradients in recurrent neural networks:

Using a gated recurrent network such as an LSTM

Using ReLU activations

Using more layers in your RNN

None of the above

Answer explanation

Gradient vanishing in vanilla recurrent neural networks is caused by back-propagation through time (BPTT): when modelling long sequences, the repeated application of the chain rule multiplies together many factors involving small weight values, driving the gradients towards zero. It has been shown experimentally that using ReLU activations can reduce this effect. One can also use GRU or LSTM units, whose gates and cell pathways allow the model to control how gradients flow back.
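
As a rough illustration, here is a minimal sketch (assuming PyTorch; the sequence lengths and sizes are arbitrary) that backpropagates from the last time step of a vanilla RNN and measures how much gradient reaches the very first input; the norm typically shrinks by orders of magnitude as the sequence grows.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=32, hidden_size=32)  # vanilla tanh RNN

for seq_len in (10, 50, 100):
    x = torch.randn(seq_len, 1, 32, requires_grad=True)
    out, _ = rnn(x)               # out: (seq_len, batch, hidden)
    out[-1].sum().backward()      # backprop from the last time step only
    grad_at_start = x.grad[0].norm().item()
    print(f"seq_len={seq_len:3d}: grad norm at t=0 = {grad_at_start:.2e}")
```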

2.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following is true when implementing an RNN with language data:

All sentences in the training data must be padded so that they are the same length

We only need to use padding within a mini-batch so that sentences in the same mini-batch have the same length

Answer explanation

We can adapt the depth of the BPTT algorithm whenever a new mini-batch is sampled from the dataset, so it is not necessary to pad all the sentences in the training data to a common length. However, the inputs within the same mini-batch must have the same size, and therefore padding is required in this case (within the mini-batch).
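
For instance, padding and packing within a single mini-batch might look like this minimal sketch (assuming PyTorch; the sentence lengths and embedding sizes are illustrative):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Three toy "sentences" of token embeddings with different lengths.
sentences = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(7, 8)]
lengths = torch.tensor([s.size(0) for s in sentences])

padded = pad_sequence(sentences, batch_first=True)  # shape (3, 7, 8)
packed = pack_padded_sequence(padded, lengths,
                              batch_first=True, enforce_sorted=False)

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
output, h_n = rnn(packed)  # the padded steps never enter the computation
```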

3.

MULTIPLE SELECT QUESTION

1 min • 1 pt

Which of the following are true about GRUs/LSTMs/vanilla RNNs:

LSTMs are likely to perform better on hard tasks with long-distance relations compared to vanilla RNNs

A GRU has more parameters than an LSTM

A GRU is likely to converge quicker than an LSTM

An RNN is likely to eventually outperform a GRU

Answer explanation

Contrary to GRUs or vanilla RNNs, LSTMs maintain two inner states: the cell state and the hidden state. The former aims to capture long-term relations, while the latter captures the short-term relations present in the sequence.

An LSTM has three gates (input, forget and output) whereas a GRU has only two (update and reset); the GRU also merges the cell state and hidden state into a single vector. A GRU unit therefore has fewer parameters than an LSTM with the same hidden size.

Given this, it is reasonable to assume that in general a GRU will converge faster than an LSTM, since it has fewer parameters to optimise.

A vanilla RNN is unlikely to outperform a GRU, given the vanishing-gradient problem, which is addressed to some extent in GRUs.
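
A quick way to verify the parameter counts is to compare the layers directly; a minimal sketch (assuming PyTorch, with arbitrary sizes) follows. The LSTM computes four gate/candidate blocks per step versus three for the GRU, which is exactly the 4:3 ratio the counts show.

```python
import torch.nn as nn

def n_params(module):
    return sum(p.numel() for p in module.parameters())

lstm = nn.LSTM(input_size=64, hidden_size=128)
gru = nn.GRU(input_size=64, hidden_size=128)
print("LSTM:", n_params(lstm))  # 99328 = 4 * 128 * (64 + 128 + 2)
print("GRU :", n_params(gru))   # 74496 = 3 * 128 * (64 + 128 + 2)
```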

4.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

True or False: Vanishing gradients are more likely to be an issue with longer sentences

True

False

Answer explanation

A longer sequence results in more applications of the chain rule during BPTT, and therefore more opportunities for the gradients to vanish.
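
A back-of-the-envelope sketch of the effect, treating each BPTT step as multiplying the gradient by a constant per-step factor (the 0.9 below is an assumed value for illustration):

```python
factor = 0.9  # assumed per-step derivative magnitude
for T in (10, 50, 100):
    print(f"T = {T:3d}: gradient scale ~ {factor ** T:.2e}")
# T =  10: ~3.49e-01;  T =  50: ~5.15e-03;  T = 100: ~2.66e-05
```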

5.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

True or False: a BiLSTM has the same number of parameters as an LSTM

False

True

Answer explanation

A bidirectional LSTM (BiLSTM) trains two LSTM networks: one models the sequence exactly as a traditional LSTM would, and the other models the sequence in reverse. Therefore, a BiLSTM has twice as many parameters as an LSTM.
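
The doubling is easy to check directly; a minimal sketch (assuming PyTorch, with arbitrary sizes):

```python
import torch.nn as nn

def n_params(module):
    return sum(p.numel() for p in module.parameters())

uni = nn.LSTM(input_size=64, hidden_size=128)
bi = nn.LSTM(input_size=64, hidden_size=128, bidirectional=True)
print(n_params(bi) == 2 * n_params(uni))  # True: forward + backward LSTM
```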

6.

MULTIPLE SELECT QUESTION

1 min • 1 pt

Which of the following are gates used in an LSTM:

Forget gate

Input gate

Output gate

Memory gate

Answer explanation

An LSTM uses three gates: the forget gate, the input gate and the output gate. There is no "memory gate"; long-term memory is instead carried by the cell state, which the gates read from and write to.
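
For reference, a minimal from-scratch sketch of a single LSTM step (the function and parameter names here are illustrative) makes the three gates explicit:

```python
import torch

def lstm_step(x, h, c, W, U, b):
    """x: input, h: hidden state, c: cell state.
    W: (4H, I), U: (4H, H), b: (4H,), blocks ordered [input|forget|cell|output]."""
    H = h.size(-1)
    z = x @ W.T + h @ U.T + b
    i = torch.sigmoid(z[..., 0:H])          # input gate
    f = torch.sigmoid(z[..., H:2 * H])      # forget gate
    g = torch.tanh(z[..., 2 * H:3 * H])     # candidate cell content (not a gate)
    o = torch.sigmoid(z[..., 3 * H:4 * H])  # output gate
    c_new = f * c + i * g                   # cell state carries the memory
    h_new = o * torch.tanh(c_new)
    return h_new, c_new
```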
