mi1_13_RL-Q

Authored by MI Team

Science

University

Used 37+ times

10 questions

1.

MULTIPLE CHOICE QUESTION

10 sec • Ungraded

Did you attempt the last exercise sheet?

Yes, I finally did

No )-:

2.

MULTIPLE CHOICE QUESTION

20 sec • 1 pt

Media Image

Which Q-value is highest for the tile in the center (the smurf is the agent)?

Q(x, →)

Q(x, ←)

Q(x, ↓)

Q(x, ↑)

3.

MULTIPLE CHOICE QUESTION

20 sec • 1 pt

An MDP implies...

I know all possible states beforehand

model-based learning

model-free evaluation

policy iteration

4.

MULTIPLE CHOICE QUESTION

20 sec • 1 pt

Which provides the most direct way for extracting the optimal policy π*?

V* = max_π Vπ

Q* = max_π Qπ

neither

what does the * mean?
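
As background for the question above: given Q*, a greedy policy can be read off directly as π*(s) = argmax_a Q*(s, a), whereas V* alone additionally requires the transition model and rewards for a one-step lookahead. The following is a minimal sketch of the Q-based extraction only. The table values, the action ordering, and the name greedy_policy are made up for illustration and do not come from the quiz.

```python
import numpy as np

# Hypothetical action ordering and Q*-table (3 states x 4 actions); values are invented.
ACTIONS = ["right", "left", "down", "up"]
q_star = np.array([
    [1.0, 0.2, 0.5, 0.1],
    [0.3, 0.9, 0.4, 0.0],
    [0.2, 0.1, 0.0, 0.8],
])

def greedy_policy(q):
    """For every state (row), return the index of an action with the largest Q-value."""
    return np.argmax(q, axis=1)

policy = greedy_policy(q_star)
policy_names = [ACTIONS[a] for a in policy]
print(policy_names)
```

No model of the environment appears anywhere in this sketch; the lookup over the Q-table is the entire extraction step.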

5.

MULTIPLE CHOICE QUESTION

20 sec • 1 pt

Media Image

SARSA stands for

some small constant

state action reward state action

such a small reward so awful

StAte Reward StAte
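
For reference, the SARSA update uses the quintuple (s, a, r, s', a') and moves Q(s, a) toward r + γ Q(s', a'). Below is a minimal sketch of a single update step. The dictionary layout, the function name sarsa_update, and the values of alpha and gamma are assumptions chosen for illustration.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA update to the nested dict q[state][action]."""
    # The target uses the action actually chosen in the next state (a_next),
    # not the maximum over actions as Q-learning would.
    td_target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (td_target - q[s][a])
    return q

# Hypothetical two-state, two-action table.
q = {
    0: {"left": 0.0, "right": 0.0},
    1: {"left": 0.0, "right": 1.0},
}
sarsa_update(q, s=0, a="right", r=0.5, s_next=1, a_next="right")
print(q[0]["right"])
```

Replacing q[s_next][a_next] with the maximum over the next state's actions would turn this into the off-policy Q-learning update instead.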

6.

MULTIPLE CHOICE QUESTION

10 sec • 1 pt

The optimal policy π* is unique (T/F)

True

False

7.

MULTIPLE CHOICE QUESTION

10 sec • 1 pt

With Q-values, the optimal policy π* becomes unique (T/F)

True

False
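
As a small illustration related to the two questions above: Q* is unique for a given MDP, but when two actions tie in some state, more than one greedy policy attains it. The sketch below counts the deterministic greedy policies for an invented Q-table. The values, the tolerance, and the helper name optimal_actions are assumptions.

```python
import numpy as np

# Hypothetical Q*-table (2 states x 2 actions); values are invented.
q_star = np.array([
    [1.0, 1.0],  # state 0: both actions tie
    [0.5, 2.0],  # state 1: action 1 is strictly better
])

def optimal_actions(q, tol=1e-9):
    """Return, per state, the list of action indices whose Q-value is maximal (up to tol)."""
    best = q.max(axis=1, keepdims=True)
    return [np.flatnonzero(np.abs(row - b) <= tol).tolist() for row, b in zip(q, best)]

choices = optimal_actions(q_star)
# Each state contributes independently, so the counts multiply.
num_greedy_policies = int(np.prod([len(c) for c in choices]))
print(choices, num_greedy_policies)
```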
