
ML B2 CH4

Authored by Jhonston Benjumea

Computers

University


10 questions


1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is a major bottleneck of the original CBOW model?

The size of the input vector
One-hot encoding of the context
Large matrix calculations with softmax over huge vocabulary
Too few context words

Answer explanation

The original CBOW model uses softmax across a large vocabulary, which is computationally expensive.
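
A minimal NumPy sketch of where that cost comes from (the vocabulary size V and hidden size H below are illustrative assumptions, not values from the quiz): the softmax must produce and normalize a score for every word in the vocabulary.

```python
import numpy as np

# Illustrative sizes (assumed): 100k-word vocabulary, 100-dim hidden layer.
V, H = 100_000, 100
rng = np.random.default_rng(0)

h = rng.normal(size=H)            # hidden vector (averaged context embeddings)
W_out = rng.normal(size=(H, V))   # output-side weight matrix

# Softmax needs a score for EVERY vocabulary word: O(H * V) multiply-adds
# per training example, plus a normalization that touches all V entries.
scores = h @ W_out
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```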

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the function of the embedding layer in word2vec?

Compress the entire matrix into a scalar
Skip one-hot encoding and extract a word's vector directly
Generate audio features
Apply dropout to context words

Answer explanation

An embedding layer removes the need for one-hot vectors by extracting a word's vector directly from the weight matrix by its row index.
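
A toy illustration (the vocabulary size, embedding size, and word id are all invented): multiplying a one-hot vector by the weight matrix merely selects one row, so an embedding layer indexes that row directly and skips the wasted arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 7, 3                     # toy vocabulary and embedding sizes (assumed)
W_in = rng.normal(size=(V, H))  # input-side weight matrix
word_id = 2

# Naive route: build a one-hot vector and multiply (mostly multiplying by zero).
one_hot = np.zeros(V)
one_hot[word_id] = 1.0
via_matmul = one_hot @ W_in     # O(V * H) work

# Embedding layer: index the row directly -- same result in O(H) work.
via_lookup = W_in[word_id]

assert np.allclose(via_matmul, via_lookup)
```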

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the main goal of negative sampling in word2vec?

To sample only frequent words
To train the model faster using binary classification
To filter out correct labels
To convert softmax into sigmoid

Answer explanation

Negative sampling simplifies training by converting the multi-class problem into several binary decisions.
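
A rough sketch of that reformulation, with assumed sizes and uniformly drawn stand-in negatives (real word2vec draws negatives from a frequency-based distribution; see question 7): one positive pair and k negative pairs are each scored with a sigmoid, replacing the V-way softmax.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
V, H, k = 10_000, 100, 5          # assumed sizes; k negatives per positive

W_in = rng.normal(scale=0.01, size=(V, H))   # input embeddings
W_out = rng.normal(scale=0.01, size=(V, H))  # output embeddings

context_id, target_id = 42, 7     # invented ids for one training pair
neg_ids = rng.integers(0, V, size=k)         # stand-in negative samples

h = W_in[context_id]

# Score 1 positive and k negatives instead of all V words:
pos_prob = sigmoid(h @ W_out[target_id])     # should be pushed toward 1
neg_probs = sigmoid(W_out[neg_ids] @ h)      # should be pushed toward 0

loss = -np.log(pos_prob) - np.log(1.0 - neg_probs).sum()
```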

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

In negative sampling, what kind of output do we expect for negative examples?

Close to 1
Exactly 1
Close to 0
Negative values

Answer explanation

Negative examples should result in output values close to 0, indicating they are not the correct context.
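
A tiny numeric illustration with made-up scores: a trained model pushes the dot product for a genuine (target, context) pair up and for a sampled negative down, so the sigmoid outputs land near 1 and near 0 respectively.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical post-training dot products (invented values):
positive_score = 5.0
negative_score = -5.0

print(sigmoid(positive_score))   # ~0.993 -> close to 1: "is a real context"
print(sigmoid(negative_score))   # ~0.007 -> close to 0: "is not a real context"
```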

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Why is the sigmoid function used in binary classification for word2vec?

It simplifies matrix multiplication
It outputs discrete values only
It provides probabilities between 0 and 1
It reduces the training data size

Answer explanation

The sigmoid function outputs values between 0 and 1, which makes its outputs directly interpretable as binary classification probabilities.
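
A quick demonstration of that squashing behavior (the sample inputs are arbitrary):

```python
import numpy as np

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"sigmoid({x:+5.1f}) = {sigmoid(x):.4f}")
# Large negative inputs approach 0, large positive inputs approach 1,
# and sigmoid(0) = 0.5 -- exactly the range needed for a probability.
```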

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does the cross-entropy error measure in binary classification?

How long training takes
The difference between output probability and the correct label
The number of samples per class
The angle between word vectors

Answer explanation

Cross-entropy measures the loss as the discrepancy between the predicted probability and the actual label.
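
In formula form, the binary cross-entropy is L = -(t log y + (1 - t) log(1 - y)), where y is the predicted probability and t the correct label. A small sketch with invented sample values:

```python
import numpy as np

def binary_cross_entropy(y, t, eps=1e-7):
    # y: predicted probability (sigmoid output); t: correct label (0 or 1).
    # eps guards against log(0).
    y = np.clip(y, eps, 1.0 - eps)
    return -(t * np.log(y) + (1.0 - t) * np.log(1.0 - y))

print(binary_cross_entropy(0.9, 1))  # ~0.105: prediction close to the label
print(binary_cross_entropy(0.9, 0))  # ~2.303: prediction far from the label
```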

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What technique helps select which negative examples to use?

Random dropout
Context padding
Probability-based sampling
Gradient descent

Answer explanation

Negative examples are sampled based on their frequency using probability-based techniques.
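
A sketch of that sampling on an invented toy corpus; raising the counts to the 0.75 power is the damping used in the original word2vec work, which keeps rare words from being ignored entirely.

```python
import numpy as np

corpus = "you say goodbye and i say hello".split()
vocab, counts = np.unique(corpus, return_counts=True)

p = counts.astype(float) ** 0.75   # damp the dominance of frequent words
p /= p.sum()                       # renormalize into a probability distribution

rng = np.random.default_rng(0)
negatives = rng.choice(vocab, size=5, p=p)  # draw 5 negative samples
print(negatives)
```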
