
Week 4 Quiz 1 - Key concepts on Deep Neural Networks
Authored by Mau-Luen Tham
Science
University
6 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the "cache" used for in our implementation of forward propagation and backward propagation?
It is used to cache the intermediate values of the cost function during training.
We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
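As an illustration of the cache idea, here is a minimal NumPy sketch of one linear layer. The function names `linear_forward` and `linear_backward`, and the layout of the cache tuple, are assumptions made for this example and are not taken from the quiz itself.

```python
import numpy as np

def linear_forward(A_prev, W, b):
    """Compute Z = W @ A_prev + b and save the inputs for the backward pass."""
    Z = W @ A_prev + b
    cache = (A_prev, W, b)  # values the matching backward step will need
    return Z, cache

def linear_backward(dZ, cache):
    """Use the values cached during the forward pass to compute gradients."""
    A_prev, W, b = cache
    m = A_prev.shape[1]                         # number of examples (columns)
    dW = (dZ @ A_prev.T) / m                    # gradient w.r.t. W
    db = np.sum(dZ, axis=1, keepdims=True) / m  # gradient w.r.t. b
    dA_prev = W.T @ dZ                          # gradient passed to the previous layer
    return dA_prev, dW, db

# Toy usage: 3 input units, 2 output units, 5 examples.
rng = np.random.default_rng(0)
A_prev = rng.standard_normal((3, 5))
W = rng.standard_normal((2, 3))
b = np.zeros((2, 1))

Z, cache = linear_forward(A_prev, W, b)
dA_prev, dW, db = linear_backward(np.ones_like(Z), cache)
```

In this sketch the backward function never recomputes `A_prev` or `W`; it only reads them from the cache produced by the forward call for the same layer.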
2.
MULTIPLE SELECT QUESTION
30 sec • 1 pt
Among the following, which ones are "hyperparameters"? (Check all that apply.) Note: only the correct options are listed here.
size of the hidden layers n[l]
learning rate α
number of iterations
number of layers L in the neural network
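To make the distinction concrete, the sketch below keeps the listed hyperparameters in one place and derives the learned parameters (W and b) from them. The dictionary keys, the specific values, and the helper `initialize_parameters` are hypothetical choices for this example.

```python
import numpy as np

# Hyperparameters: chosen by the practitioner before training (illustrative values).
hyperparameters = {
    "layer_dims": [4, 5, 3, 1],  # input size, then the size of each layer n[l]
    "learning_rate": 0.01,       # alpha
    "num_iterations": 1000,
}

def initialize_parameters(layer_dims, seed=0):
    """Create the parameters W[l], b[l] that training will update."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * 0.01
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

params = initialize_parameters(hyperparameters["layer_dims"])
num_layers = len(hyperparameters["layer_dims"]) - 1  # L, excluding the input layer
```

The shapes of every W and b follow from the hyperparameters, while their values are what gradient descent learns.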
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Which of the following statements is true?
The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L.
True
False
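For reference, this is one common way forward propagation through several layers is written in NumPy. It is a sketch under simplifying assumptions: `params` is a hypothetical list of `(W, b)` pairs, and ReLU is applied at every layer, including the last, purely to keep the example short.

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def forward_all_layers(X, params):
    """Run X through every layer; params is a list of (W, b) pairs, one per layer."""
    A = X
    for W, b in params:   # iterate over layers l = 1, ..., L
        Z = W @ A + b     # vectorized over all m examples (columns of A) at once
        A = relu(Z)
    return A

# Toy usage: layer sizes 4 -> 3 -> 2, with 6 examples.
rng = np.random.default_rng(1)
layer_dims = [4, 3, 2]
params = [
    (rng.standard_normal((layer_dims[l], layer_dims[l - 1])), np.zeros((layer_dims[l], 1)))
    for l in range(1, len(layer_dims))
]
X = rng.standard_normal((4, 6))
AL = forward_all_layers(X, params)
```

Here the matrix product removes the loop over training examples, while each layer's input is the previous layer's output.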
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
During forward propagation, the forward function for layer l needs to know which activation function that layer uses (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know the activation function for layer l, since the gradient depends on it.
True
False
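The sketch below shows how an activation's forward and backward computations might be paired in NumPy. The function names and the string-based `activation` argument are assumptions for illustration, and only sigmoid and ReLU are covered.

```python
import numpy as np

def activation_forward(Z, activation):
    """Apply the named activation; return A together with Z for later use."""
    if activation == "sigmoid":
        A = 1.0 / (1.0 + np.exp(-Z))
    elif activation == "relu":
        A = np.maximum(0, Z)
    else:
        raise ValueError(f"unsupported activation: {activation}")
    return A, Z

def activation_backward(dA, Z, activation):
    """Compute dZ = dA * g'(Z), where g' depends on the activation used."""
    if activation == "sigmoid":
        s = 1.0 / (1.0 + np.exp(-Z))
        return dA * s * (1.0 - s)   # g'(z) = s(z) * (1 - s(z))
    elif activation == "relu":
        return dA * (Z > 0)         # g'(z) = 1 where z > 0, else 0
    else:
        raise ValueError(f"unsupported activation: {activation}")

Z = np.array([[-1.0, 0.0, 2.0]])
dA = np.ones_like(Z)
dZ_relu = activation_backward(dA, Z, "relu")
dZ_sigmoid = activation_backward(dA, Z, "sigmoid")
```

The same `dA` yields different `dZ` values for the two activations because the derivative formulas differ.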
6.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
There are certain functions with the following properties: (i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network.
True
False
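A standard example often used to illustrate this kind of statement is the parity (XOR) of n bits. The sketch below counts gates in a balanced tree of 2-input XOR gates and compares that with the number of AND terms in the canonical two-level OR-of-ANDs formula for parity. The function names are hypothetical, and the two-level count describes that particular formula, not every possible shallow circuit.

```python
def xor_tree(bits):
    """Parity via a balanced tree of 2-input XOR gates.

    Returns (parity, gates_used, depth).
    """
    level = list(bits)
    gates = 0
    depth = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(level[i] ^ level[i + 1])  # one XOR gate per pair
            gates += 1
        if len(level) % 2 == 1:
            nxt.append(level[-1])                # odd element passes through
        level = nxt
        depth += 1
    return level[0], gates, depth

def shallow_term_count(n):
    """AND terms in the canonical OR-of-ANDs formula for n-bit parity:
    one term per input pattern with an odd number of ones."""
    return 2 ** (n - 1)

parity, gates, depth = xor_tree([1, 0, 1, 1, 0, 1, 0, 0])
terms = shallow_term_count(8)
```

For 8 input bits the tree uses 7 gates across 3 levels, while the two-level formula has 128 terms; the gap grows exponentially with n.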