
ML B2 CH8

Assessment • Quiz • Computers • University • Medium

Created by Jhonston Benjumea

10 questions

1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the main limitation of the traditional seq2seq model?
It cannot translate short sentences
It uses too many output layers
It encodes input into a fixed-length vector regardless of sentence length
It requires bidirectional input only

Answer explanation

Traditional seq2seq models compress all input into a fixed-length vector, which limits performance on longer inputs.
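
To make the bottleneck concrete, here is a minimal NumPy sketch (variable names are illustrative, not from the quiz): no matter how long the input is, a plain seq2seq encoder passes the decoder only its final hidden state, whose size is fixed by the hidden dimension.

import numpy as np

H = 4                          # hidden size (illustrative)
for T in (3, 50):              # short vs. long input sequence
    hs = np.random.randn(T, H)   # stand-in for the encoder's hidden states
    context = hs[-1]             # plain seq2seq: only the last state is handed on
    print(T, context.shape)      # (4,) either way -- the fixed-length bottleneck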

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does the attention mechanism allow the decoder to use?
Only the last hidden state
Only encoder output embeddings
All encoder hidden states with different importance
Static word vectors

Answer explanation

Attention allows the decoder to focus on all encoder hidden states, assigning different weights to each.
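
A rough NumPy sketch of the idea (names and numbers are made up): instead of a single vector, the decoder is given every encoder hidden state, each scaled by its own attention weight.

import numpy as np

hs = np.random.randn(5, 4)                     # all encoder hidden states, one per input word
a  = np.array([0.05, 0.1, 0.6, 0.2, 0.05])     # attention weights: one "importance" per state
weighted = a[:, None] * hs                     # every state contributes, scaled by its weight
print(weighted.shape)                          # (5, 4): no encoder state is thrown away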

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

How is the context vector in attention calculated?
Average of all vectors
Maximum value among vectors
Weighted sum of encoder outputs
Concatenation of last and first vectors

Answer explanation

The context vector is a weighted sum of all encoder output vectors based on attention weights.
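
In other words, c = sum over t of a_t * h_t. A short illustrative sketch (made-up weights):

import numpy as np

hs = np.random.randn(5, 4)                     # encoder outputs h_1 ... h_5
a  = np.array([0.1, 0.1, 0.5, 0.2, 0.1])       # attention weights (sum to 1)
c  = np.sum(a[:, None] * hs, axis=0)           # weighted sum -> context vector
print(c.shape)                                 # (4,): same size as a single hidden state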

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Which mathematical operation is used to compute attention scores?
Matrix inverse
Subtraction
Dot product
Hadamard product

Answer explanation

Attention scores are calculated as the dot product between the decoder hidden state and each encoder hidden state.
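
A minimal sketch of the scoring step (names are illustrative; other score functions such as additive attention also exist, but the dot product is the one used here):

import numpy as np

hs     = np.random.randn(5, 4)   # encoder hidden states
h_dec  = np.random.randn(4)      # current decoder hidden state
scores = hs @ h_dec              # dot product of the decoder state with each encoder state
print(scores.shape)              # (5,): one raw score per input position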

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the role of the softmax function in attention?
To generate output probabilities
To normalize attention scores into weights
To embed characters
To reduce vocabulary size

Answer explanation

Softmax normalizes raw attention scores into a probability distribution over encoder outputs.
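
A small sketch of that normalization step (illustrative values):

import numpy as np

scores = np.array([2.0, 0.5, -1.0, 0.1, 1.2])   # raw attention scores
e = np.exp(scores - scores.max())               # shift by the max for numerical stability
a = e / e.sum()                                  # softmax -> attention weights
print(a, a.sum())                                # non-negative weights that sum to 1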

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does a Time Attention layer do?
Calculates self-attention only
Aggregates attention across all time steps
Forgets context information
Adds noise to encoder outputs

Answer explanation

A Time Attention layer applies the attention mechanism at each decoder time step and aggregates the resulting context vectors across all time steps.
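
A rough sketch of that aggregation, assuming "Time Attention" means one attention computation per decoder time step (as the answer options suggest); all names are illustrative:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

hs_enc = np.random.randn(5, 4)     # encoder hidden states (T_enc, H)
hs_dec = np.random.randn(3, 4)     # decoder hidden states (T_dec, H)

contexts = []
for h in hs_dec:                   # one attention computation per decoder time step
    a = softmax(hs_enc @ h)        # weights over the encoder states
    contexts.append(a @ hs_enc)    # context vector for this step
contexts = np.stack(contexts)      # (T_dec, H): contexts aggregated over all time steps
print(contexts.shape)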

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is a bidirectional RNN used for in attention-based models?
To skip decoding steps
To allow input data to flow only backward
To consider context from both directions of the input
To reduce computation time

Answer explanation

Bidirectional RNNs allow the model to gather information from both past and future inputs.
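
A schematic NumPy sketch of the bidirectional idea (a minimal tanh RNN with random weights, purely illustrative): one pass reads the input left to right, another reads it right to left, and each position gets both states concatenated.

import numpy as np

def rnn_states(xs, Wx, Wh):
    """Run a minimal tanh RNN over xs and return all hidden states."""
    h, out = np.zeros(Wh.shape[0]), []
    for x in xs:
        h = np.tanh(x @ Wx + h @ Wh)
        out.append(h)
    return np.array(out)

T, D, H = 5, 3, 4
xs = np.random.randn(T, D)                        # input sequence (illustrative)
Wx_f, Wh_f = np.random.randn(D, H), np.random.randn(H, H)
Wx_b, Wh_b = np.random.randn(D, H), np.random.randn(H, H)

h_fwd = rnn_states(xs, Wx_f, Wh_f)                # left-to-right pass: past context
h_bwd = rnn_states(xs[::-1], Wx_b, Wh_b)[::-1]    # right-to-left pass, re-aligned: future context
hs = np.concatenate([h_fwd, h_bwd], axis=1)       # each position sees both directions
print(hs.shape)                                   # (5, 8)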
