
Exam Questions
Authored by Wayground Content

16 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Intuitively, why are individual decision trees brittle and sensitive to individual feature values? What do random forests do that alleviates this limitation? What mechanism can be used in a random forest to come to a final decision?
Individual decision trees are robust and not sensitive to feature values, while random forests use a single tree for decision making.
Random forests use majority voting for classification and averaging for regression, which helps in making more stable decisions.
Individual decision trees are complex and require many features, while random forests use only one feature to make decisions.
Random forests rely on a single decision tree to make predictions, which is less sensitive to feature changes.
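A minimal NumPy sketch (with made-up per-tree predictions) of the majority-voting mechanism described in the correct answer; for regression, the vote would be replaced by a mean over the trees' outputs:

```python
import numpy as np

def forest_predict(tree_preds):
    """Combine per-tree class predictions (one row per tree) by majority vote."""
    # np.bincount counts the votes for each class label in a column;
    # argmax picks the most common label for that sample
    return np.array([np.bincount(col).argmax() for col in tree_preds.T])

# Three hypothetical trees voting on four samples
tree_preds = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
])
print(forest_predict(tree_preds))  # -> [0 1 1 0]
```

Note how the first and third samples get a stable prediction even though one tree disagrees; a single brittle tree would have flipped its answer.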
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What common activation function that we discussed in class has a derivative that is the step function?
ReLU
Sigmoid
Tanh
Softmax
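For reference, ReLU(x) = max(0, x), so its derivative is 0 for x < 0 and 1 for x > 0, i.e. the step function. A small NumPy sketch:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) elementwise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: the step function (0 for x < 0, 1 for x > 0)."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.5, 3.0])
print(relu(x))       # -> [0.  0.  0.5 3. ]
print(relu_grad(x))  # -> [0. 0. 1. 1.]
```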
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What are the hyperparameters of the perceptron algorithm? From your experience, which of those hyperparameters have an effect on its accuracy? Explain!
The number of epochs and learning rate
The number of hidden layers and activation function
The batch size and dropout rate
The optimizer and regularization strength
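A minimal sketch of the classic perceptron on toy data, making the two hyperparameters from the correct answer explicit as the `lr` and `epochs` arguments (the data and defaults here are illustrative assumptions, not from the course):

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=10):
    """Classic perceptron for labels in {-1, +1}; lr and epochs are the hyperparameters."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # misclassified (or on the boundary)
                w += lr * yi * xi       # push the hyperplane toward the correct side
                b += lr * yi
    return w, b

# Linearly separable toy data
X = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]])
y = np.array([-1, 1, 1, 1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))  # -> [-1.  1.  1.  1.]
```

On separable data the number of epochs matters most (too few and the loop stops before converging); for the vanilla update the learning rate only rescales w and b.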
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What is the issue in performing PCA separately on the training set and the test set? In code this might look like the pseudo-code shown below, where we have chosen to map the data into a ten-dimensional space:

X_train_pca = PCA(n_components=10).fit_transform(X_train)
X_test_pca = PCA(n_components=10).fit_transform(X_test)

# The fit_transform method of a PCA object computes the
# principal components and transforms the given feature matrix
# into the space of the principal components.
# Assume that X_train is the matrix containing the training set
# and X_test is the matrix containing the test set.

What is the correct way of doing this?
Training PCA separately on the training and test set creates incompatible feature representations.
Training PCA on the entire dataset is OK since it does not make use of the labels, so it creates no leak of label information.
PCA should only be applied to the test set to avoid data leakage.
PCA can be performed on the training set only, and the test set should be ignored.
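A NumPy-only sketch of the correct procedure (sklearn's `PCA` follows the same `fit`-then-`transform` pattern): fit on the training set only, then project both sets with the same mean and components, so train and test live in one consistent space. The toy matrices are assumptions standing in for `X_train` and `X_test`:

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on the TRAINING data only: store its mean and top principal axes."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal components
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project any data into the space learned from the training set."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 20))
X_test = rng.normal(size=(30, 20))

# Correct: one fit on X_train, then transform BOTH sets with that same fit
mean, components = pca_fit(X_train, n_components=10)
X_train_pca = pca_transform(X_train, mean, components)
X_test_pca = pca_transform(X_test, mean, components)
print(X_train_pca.shape, X_test_pca.shape)  # -> (100, 10) (30, 10)
```

Fitting a second PCA on `X_test` would yield different axes (even with different signs), so the two transformed matrices would not be comparable.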
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
What do SVMs aim for in terms of the separating hyperplane?
The steepest slope of the hyperplane
The maximum margin from the closest points
The minimum distance from all training points
The average slope of the hyperplane
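A small sketch of the quantity an SVM maximizes: the distance from the separating hyperplane w·x + b = 0 to the closest training point. The hyperplane and points below are made-up illustrations, not the output of an actual SVM solver:

```python
import numpy as np

# Hypothetical separating hyperplane and a few training points
w = np.array([1.0, 1.0])
b = -3.0
X = np.array([[1.0, 1.0], [4.0, 4.0], [0.0, 2.0], [5.0, 2.0]])

# Geometric distance of each point to the hyperplane: |w·x + b| / ||w||
dists = np.abs(X @ w + b) / np.linalg.norm(w)
print(dists.min())  # the margin is the distance to the CLOSEST point
```

An SVM chooses w and b to make this minimum distance as large as possible; the closest points become the support vectors.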
6.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
My lab is collaborating with a biologist who studies the microbiome of both living and dead animals and humans. They sampled the microbiomes of dead mice and humans, and we helped them develop a random forest-based approach to predict the time since death based on the microbiome composition. This has potential forensic applications since standard forensic techniques work for a limited range of time.

To generate the data, each body was sampled at regular intervals and its microbiome composition determined by appropriate experimental protocols.

In order to evaluate the ability of a classifier to predict time-since-death, they performed leave-one-cadaver-out cross-validation, where a classifier was trained on all measurements performed on all but one cadaver. The classifier was evaluated on all the measurements performed on the left-out cadaver. This is iterated until obtaining predictions on all cadavers.

Explain the value of this evaluation procedure over a procedure that pools all the samples from all the cadavers, i.e. mixes them all up and then performs cross-validation over the pooled samples.
Leave-one-cadaver-out is better because it allows the system to work on cadavers that it has not seen during training.
Pooling samples provides a larger dataset, which improves the classifier's performance.
Leave-one-cadaver-out is less computationally intensive than pooling samples.
Pooling samples allows for a more generalized model that can predict across different cadavers.
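A minimal sketch of the grouped splitting scheme from the correct answer (sklearn's `LeaveOneGroupOut` implements the same idea): all samples sharing a cadaver ID are held out together, so no cadaver appears in both train and test. The group IDs are an illustrative assumption:

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (train_idx, test_idx) pairs, holding out one group (cadaver) at a time."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        test = np.where(groups == g)[0]   # every sample from the held-out cadaver
        train = np.where(groups != g)[0]  # all samples from the other cadavers
        yield train, test

# Hypothetical cadaver IDs: samples from the same body share an ID
groups = [0, 0, 1, 1, 1, 2]
for train, test in leave_one_group_out(groups):
    print("train:", train, "test:", test)
```

Pooled cross-validation would instead put samples from the same cadaver on both sides of the split, letting the classifier exploit cadaver-specific microbiome signatures and overstating how well it generalizes to a new body.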
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Why can this function not be used as an activation function in a multilayer perceptron neural network (hint: think about how we have to incorporate activation functions when calculating gradient descent)?
It is not differentiable at x = 0, and its derivative at all points where x ≠ 0 is equal to 0
It is too complex to compute during backpropagation
It produces outputs that are not bounded between 0 and 1
It requires too much computational power to evaluate
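Assuming the function in question is a step function (as the answer choices suggest), a short numerical check of why it breaks gradient descent: away from the jump, its derivative is exactly zero, so backpropagation through such an activation passes no gradient signal to earlier layers:

```python
import numpy as np

def step(x):
    """Heaviside-style step function: 0 for x <= 0, 1 for x > 0."""
    return (x > 0).astype(float)

# Central-difference derivative away from the jump at x = 0
x = np.array([-1.0, 0.5, 2.0])
h = 1e-6
grad = (step(x + h) - step(x - h)) / (2 * h)
print(grad)  # -> [0. 0. 0.] — zero gradient everywhere except at the jump
```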