Exam Questions

Assessment

Flashcard

Mathematics

KG

Hard

Created by

Wyatt Beals

16 questions

1.

FLASHCARD QUESTION

Front

Back

SVMs don’t aim for the sharpest slope of the separating hyperplane; they aim for the maximum margin, meaning the boundary that is as far as possible from the closest points (the support vectors). A steep slope might fit the training data too tightly and keep training points very close to the margin, limiting generalizability. (Either of these answers is acceptable for full credit.)
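The margin idea above can be checked numerically. This is a minimal sketch with made-up 2-D data and two hand-picked hyperplanes (the weights and intercepts are illustrative, not from the flashcard): a steep boundary that hugs the data has a smaller margin than a balanced one.

```python
import numpy as np

# Toy linearly separable data (illustrative values).
X = np.array([[0., 0.], [1., 1.], [4., 4.], [5., 3.]])
y = np.array([-1, -1, 1, 1])

def margin(w, b):
    """Smallest distance from any point to the hyperplane w.x + b = 0
    (valid when the hyperplane separates the classes correctly)."""
    return np.min(y * (X @ w + b)) / np.linalg.norm(w)

# A steep boundary vs. a max-margin-style boundary:
steep = margin(np.array([10., 1.]), -25.)    # passes close to the data
balanced = margin(np.array([1., 1.]), -5.)   # equidistant from both classes
print(steep, balanced)
```

The balanced hyperplane achieves the larger margin, which is exactly the quantity an SVM maximizes.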

2.

FLASHCARD QUESTION

Front

In my lab, we study the use of gestures in collaborative problem-solving tasks. We gathered 4 hours of audio-visual data, consisting of 10 groups of 3 people each collaboratively solving a problem involving physical objects, and developed a random forest method to detect when any participant in the data is performing a gesture of interest (such as pointing, pinching, or grabbing). There are two possible ways we could evaluate this gesture classifier: (a) pool the samples, randomly shuffle them, split them into 10 folds, and perform a rotating stratified 90:10 10-fold cross-validation; or (b) perform a rotating stratified 10-fold cross-validation using each group in turn as the test group.

Which of these is a better way to evaluate if I want to understand the robustness of my classifier to unseen data, and why?

Back

(b) is better, because our method needs to work on entire groups that it has not seen during training. Pooling the samples as in (a) means that in each fold, the model is likely to see at least some of the same gestures by the same people during training as it does during cross-validation.
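Option (b) can be sketched as a leave-one-group-out split, where every sample from a group lands on the same side of each split. The group labels below are illustrative stand-ins for the 10 real groups; scikit-learn's GroupKFold implements the same idea.

```python
import numpy as np

# sample -> group id (illustrative; the study had 10 groups of 3 people)
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])

def group_folds(groups):
    """Yield (train_idx, test_idx) with each group used once as the test set."""
    for g in np.unique(groups):
        yield np.where(groups != g)[0], np.where(groups == g)[0]

for train_idx, test_idx in group_folds(groups):
    # No group ever appears on both sides, unlike pooled shuffling in (a).
    assert not set(groups[train_idx]) & set(groups[test_idx])
```

Pooled shuffling, by contrast, almost guarantees that gestures from every person appear in both the training and test portions of each fold.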

3.

FLASHCARD QUESTION

Front

Back

In step 6, the test data is being normalized using the test means. This means that test points are no longer being represented using values that are meaningful relative to the training distribution.

4.

FLASHCARD QUESTION

Front

Intuitively, why are individual decision trees brittle and sensitive to individual feature values? What do random forests do that alleviates this limitation? What mechanism can be used in a random forest to come to a final decision?

Back

Individual decision trees split based on hard thresholds, so small changes in feature values can result in a completely different tree. Random forests build many different trees by subsampling different features out of the data. Random forests come to a final decision by majority voting for classification problems and averaging for regression problems.
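The two aggregation mechanisms in the answer can be sketched in a few lines; the per-tree predictions below are illustrative stand-ins for the outputs of real trees.

```python
from collections import Counter

tree_predictions = ["cat", "dog", "cat", "cat", "dog"]  # illustrative votes

def majority_vote(votes):
    """Classification: the label predicted by the most trees wins."""
    return Counter(votes).most_common(1)[0][0]

def average(preds):
    """Regression: the forest's prediction is the mean of the trees'."""
    return sum(preds) / len(preds)

print(majority_vote(tree_predictions))   # cat
print(average([2.0, 2.5, 3.0]))          # 2.5
```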

5.

FLASHCARD QUESTION

Front

Why can this function not be used as an activation function in a multilayer perceptron neural network (hint: think about how we have to incorporate activation functions when calculating gradient descent)?

Back

This function cannot theoretically be used within gradient descent because it is not differentiable (at x = 0), and cannot practically be used because its derivative at all points where x ≠ 0 is equal to 0.
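This can be verified numerically, assuming the function in question is the Heaviside step function: away from the jump at x = 0, every estimated derivative is exactly 0, so backpropagation would receive no gradient signal.

```python
def step(x):
    """Heaviside step function (assumed to be the function in the question)."""
    return 1.0 if x > 0 else 0.0

def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-2.0, -0.5, 0.5, 2.0):
    # Everywhere except the jump, the gradient is exactly 0 -> no learning.
    print(numerical_derivative(step, x))
```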

6.

FLASHCARD QUESTION

Front

What function h(x) can be used as an activation in a neural network, but has similar properties as the step function (e.g., bounds on h(x)) as x → ±∞?

Back

The sigmoid function.
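A quick check of the claimed properties: the sigmoid is smooth and differentiable everywhere, yet shares the step function's bounds, approaching 0 as x → −∞ and 1 as x → +∞.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: smooth, with the same asymptotic bounds as the step."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(-50))  # very close to 0
print(sigmoid(0))    # exactly 0.5
print(sigmoid(50))   # very close to 1
```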

7.

FLASHCARD QUESTION

Front

What common activation function that we discussed in class has a derivative that is the step function?

Back

ReLU.
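A minimal sketch of why: ReLU(x) = max(0, x), so its derivative is 0 for x < 0 and 1 for x > 0 (undefined at x = 0), which is exactly the step function.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU away from x = 0 -- i.e., the step function."""
    return 1.0 if x > 0 else 0.0

for x in (-3.0, -0.1, 0.1, 3.0):
    print(relu_grad(x))  # 0 on the left of the origin, 1 on the right
```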
