ML Chapter 06

University

15 Qs

Similar activities

HCI - UCD • University • 20 Qs
Ilustrator (mid) • University • 16 Qs
CSC 2663 - Database administrative Functions • University • 13 Qs
Layered Network Models • University • 10 Qs
ML Course Activity-II • University • 10 Qs
Latex • University • 11 Qs
NEO LMS Training • University • 10 Qs
c.i.s.c.o • University • 20 Qs

ML Chapter 06

Assessment • Quiz • Computers • University
Practice Problem • Medium
Created by Jhonston Benjumea • Used 1+ times • FREE Resource

15 questions

1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What does SGD stand for in neural network training?

Soft Gradient Descent
Stochastic Gradient Descent
Strong Graph Derivative
Semi-Gain Depth

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the main idea behind Stochastic Gradient Descent (SGD)?

Using the full dataset for every update
Adding randomness to initialization
Updating weights using small random batches
Freezing weights during training
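The idea in the correct option can be made concrete with a minimal NumPy sketch (the linear model, data, and hyperparameters here are illustrative, not from the quiz): each update uses the gradient computed on a small random batch rather than the full dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # full dataset (100 samples)
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                         # noiseless targets for the demo

w = np.zeros(3)
lr = 0.1
for _ in range(200):
    idx = rng.choice(len(X), size=8, replace=False)  # small random batch
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)       # gradient on the batch only
    w -= lr * grad

print(np.round(w, 2))  # recovers weights close to [1, -2, 0.5]
```

Each step is cheap (8 samples instead of 100), and the random batches still steer the weights toward the full-data minimum on average.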

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What problem does the Momentum method solve in SGD?

Overfitting
Vanishing gradient
Oscillations in gradient updates
Data imbalance
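The damping effect named in the correct option can be sketched as follows (the loss surface and hyperparameters are illustrative): a velocity term accumulates past gradients, so steps across a steep, narrow valley partly cancel instead of oscillating.

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.1, beta=0.9):
    v = beta * v - lr * grad      # velocity: decaying sum of past gradients
    return w + v, v

# Ill-conditioned quadratic f(w) = 0.5 * (10*w0**2 + w1**2): plain gradient
# descent zig-zags along the steep w0 axis; momentum smooths that out.
w = np.array([1.0, 1.0])
v = np.zeros(2)
for _ in range(300):
    grad = np.array([10 * w[0], w[1]])
    w, v = momentum_step(w, v, grad)
print(np.round(w, 4))  # both coordinates settle near the minimum at 0
```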

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

How does AdaGrad adjust the learning rate?

Keeps it constant
Increases it exponentially
Adapts it for each parameter based on past gradients
Resets it every epoch
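The per-parameter adaptation in the correct option can be sketched in a few lines (the toy objective and learning rate are illustrative): each parameter's step is divided by the square root of its own accumulated squared gradients.

```python
import numpy as np

def adagrad_step(w, h, grad, lr=1.0, eps=1e-8):
    h = h + grad ** 2                        # per-parameter history of g^2
    w = w - lr * grad / (np.sqrt(h) + eps)   # large past gradients -> smaller steps
    return w, h

# f(w) = w0^2 + w1^2, starting with very different scales per coordinate.
w = np.array([10.0, 0.1])
h = np.zeros(2)
for _ in range(800):
    grad = 2 * w
    w, h = adagrad_step(w, h, grad)
print(np.round(w, 3))
```

Note the same global `lr` works for both coordinates because the accumulated history rescales each one individually.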

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the main feature of the Adam optimizer?

Ignores momentum
Uses only recent gradients
Combines Momentum and AdaGrad
Requires no tuning
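The combination in the correct option can be sketched as follows (hyperparameters are the commonly cited defaults; the toy objective is illustrative): Adam keeps a momentum-like moving average of gradients (first moment) and an AdaGrad/RMSProp-style moving average of squared gradients (second moment), with bias correction for both.

```python
import numpy as np

def adam_step(w, m, v, grad, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad           # momentum-like first moment
    v = b2 * v + (1 - b2) * grad ** 2      # AdaGrad-like second moment
    m_hat = m / (1 - b1 ** t)              # bias correction (t starts at 1)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([5.0, -3.0])
m = np.zeros(2)
v = np.zeros(2)
for t in range(1, 2001):
    grad = 2 * w                           # gradient of f(w) = w0^2 + w1^2
    w, m, v = adam_step(w, m, v, grad, t)
print(np.round(w, 3))
```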

6.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Why is initializing weights with a standard deviation of 0.01 sometimes problematic?

It slows down learning
It may cause vanishing gradients
It improves generalization
It speeds up convergence too much
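The vanishing effect in the correct option can be observed directly; in this sketch (layer width, depth, and seed are illustrative) data is pushed through a few tanh layers initialized with standard deviation 0.01, and the activation spread collapses layer by layer, which starves backpropagation of gradient signal.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 50))
a = x
stds = []
for _ in range(5):
    W = rng.normal(scale=0.01, size=(50, 50))  # std-0.01 initialization
    a = np.tanh(a @ W)
    stds.append(float(a.std()))
print(np.round(stds, 6))  # activation std shrinks toward zero each layer
```

Roughly, each layer multiplies the activation scale by about 0.01 * sqrt(50) ≈ 0.07, so a few layers are enough to flatten the signal.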

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

What is the Xavier initialization designed for?

ReLU activations
Linear regression
Layers with sigmoid/tanh activations
Binary classification
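Repeating the previous experiment with Xavier scaling shows the contrast; this sketch uses the common 1/sqrt(n_in) variant of Xavier (Glorot) initialization (layer sizes and seed again illustrative), which keeps the activation variance roughly stable through tanh layers instead of letting it collapse.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in = 50
x = rng.normal(size=(100, n_in))
a = x
for _ in range(5):
    # Xavier scale: std = 1/sqrt(n_in) keeps pre-activation variance ~ constant
    W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_in, n_in))
    a = np.tanh(a @ W)
print(np.round(a.std(), 3))  # stays on the order of the input's spread
```

The full Glorot formula also accounts for the output width, sqrt(2 / (n_in + n_out)); with square layers the two coincide up to a constant.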

Access all questions and much more by creating a free account
