
mi1_13_RL-Q
Authored by MI Team
Science
University
Used 47+ times

10 questions
1.
MULTIPLE CHOICE QUESTION
10 sec • Ungraded
Did you attempt the last exercise sheet?
Yes, I finally did
No )-:
2.
MULTIPLE CHOICE QUESTION
20 sec • 1 pt
Which Q-value is highest for the tile in the center (the smurf is the agent)?
Q(x, →)
Q(x, ←)
Q(x, ↓)
Q(x, ↑)
3.
MULTIPLE CHOICE QUESTION
20 sec • 1 pt
An MDP implies...
I know all possible states beforehand
model-based learning
model-free evaluation
policy iteration
4.
MULTIPLE CHOICE QUESTION
20 sec • 1 pt
Which provides the most direct way for extracting the optimal policy π*?
V* = max_π Vπ
Q* = max_π Qπ
neither
what does the * mean?
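Note on question 4: a minimal sketch, assuming a tabular setting, of how a greedy policy can be read off each quantity. The function names, array shapes, and the default discount `gamma=0.9` are illustrative assumptions and not part of the quiz. With Q-values, one argmax over actions per state is enough; with state values, a transition model `p` and reward table `r` are also needed for a one-step lookahead.

```python
import numpy as np

def greedy_from_q(q):
    """q: array of shape (n_states, n_actions). Returns one greedy action per state."""
    return np.argmax(q, axis=1)

def greedy_from_v(v, p, r, gamma=0.9):
    """Greedy policy from state values via one-step lookahead.

    v: (n_states,) state values
    p: (n_states, n_actions, n_states) transition probabilities (assumed known)
    r: (n_states, n_actions) expected immediate rewards (assumed known)
    """
    # Back up v through the model to get action values, then act greedily.
    q = r + gamma * (p @ v)
    return np.argmax(q, axis=1)
```

The second function only works when the model (`p`, `r`) is available, which is the practical difference between the two options.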
5.
MULTIPLE CHOICE QUESTION
20 sec • 1 pt
SARSA stands for
some small constant
state action reward state action
such a small reward so awful
StAte Reward StAte
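Note on question 5: a minimal sketch of the tabular SARSA update, which uses the tuple (s, a, r, s', a'). The function name, the nested-list Q-table, and the default step size and discount are assumptions chosen for illustration.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one on-policy TD update to q[s][a] in place and return q.

    q is indexable as q[state][action]; a_next is the action the agent
    actually takes in s_next under its current policy.
    """
    td_target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (td_target - q[s][a])
    return q
```

Because the target uses the action actually selected in the next state, rather than a max over actions, the update is on-policy.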
6.
MULTIPLE CHOICE QUESTION
10 sec • 1 pt
The optimal policy π* is unique (T/F)
True
False
7.
MULTIPLE CHOICE QUESTION
10 sec • 1 pt
With Q-values, the optimal policy π* becomes unique (T/F)
True
False
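Note on questions 6 and 7: a small sketch, assuming a hypothetical tabular Q* stored as a NumPy array, that lists every greedy action per state. When two actions tie for the maximum in some state, either choice yields an optimal policy. The function name and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np

def optimal_actions(q, tol=1e-9):
    """Return, for each state, the list of actions whose Q-value is maximal.

    q: array of shape (n_states, n_actions).
    """
    best = q.max(axis=1, keepdims=True)
    return [np.flatnonzero(row >= b - tol).tolist() for row, b in zip(q, best)]
```

For example, a state whose two actions have equal Q-values returns both indices, so more than one deterministic greedy policy exists for that table.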