Structuring Machine Learning Projects
Quiz • Computers • University • Practice Problem • Hard
Trump Florence
Used 2+ times
23 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
This example is adapted from a real production application, but with details disguised to protect confidentiality.
You are a famous researcher in the City of Peacetopia. The people of Peacetopia have a common characteristic: they are afraid of birds. To save them, you have to build an algorithm that will detect any bird flying over Peacetopia and alert the population.
The City Council gives you a dataset of 10,000,000 images of the sky above Peacetopia, taken from the city’s security cameras. They are labeled:
y = 0: There is no bird on the image
y = 1: There is a bird on the image
Your goal is to build an algorithm able to classify new images taken by security cameras from Peacetopia.
There are a lot of decisions to make:
What is the evaluation metric?
How do you structure your data into train/dev/test sets?
Metric of success
The City Council tells you that they want an algorithm that:
Has high accuracy.
Runs quickly and takes only a short time to classify a new image.
Can fit in a small amount of memory, so that it can run in a small processor that the city will attach to many different security cameras.
You meet with them and ask for just one evaluation metric. True/False?
True
False
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
The city revises its criteria to:
"We need an algorithm that can let us know a bird is flying over Peacetopia as accurately as possible."
"We want the trained model to take no more than 10 sec to classify a new image."
"We want the model to fit in 10MB of memory."
Given models with different accuracies, runtimes, and memory sizes, how would you choose one?
Create one metric by combining the three metrics and choose the best performing model
Find the subset of models that meet the runtime and memory criteria. Then, choose the highest accuracy
Take the model with the smallest runtime because that will provide the most overhead to increase accuracy
Accuracy is an optimizing metric, therefore the most accurate model is the best choice
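One way to make the city's revised criteria concrete is to treat the runtime and memory limits as hard thresholds and compare accuracy only among the models that pass them. The sketch below is a minimal illustration of that rule, not a prescribed method: the `Candidate` class, the `choose_model` helper, and all candidate numbers are hypothetical and invented for this example. Only the 10 sec and 10MB limits come from the question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    accuracy: float     # fraction of dev-set images classified correctly
    runtime_sec: float  # time to classify one new image
    memory_mb: float    # memory needed to hold the model


def choose_model(
    candidates: List[Candidate],
    max_runtime_sec: float = 10.0,  # limit stated by the city
    max_memory_mb: float = 10.0,    # limit stated by the city
) -> Optional[Candidate]:
    """Keep models that satisfy both thresholds, then take the most accurate."""
    feasible = [
        c for c in candidates
        if c.runtime_sec <= max_runtime_sec and c.memory_mb <= max_memory_mb
    ]
    if not feasible:
        return None  # no model meets the constraints
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidates; the numbers are made up for illustration only.
candidates = [
    Candidate("A", accuracy=0.970, runtime_sec=1.0, memory_mb=3.0),
    Candidate("B", accuracy=0.990, runtime_sec=13.0, memory_mb=9.0),  # too slow
    Candidate("C", accuracy=0.980, runtime_sec=9.0, memory_mb=9.0),
    Candidate("D", accuracy=0.995, runtime_sec=3.0, memory_mb=16.0),  # too large
]

best = choose_model(candidates)
print(best.name if best else "no feasible model")
```

Under this rule, a model that is faster or smaller than the threshold earns no extra credit; once a model is within the limits, only accuracy is compared.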
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Based on the city’s requests, which of the following would you say is true?
Accuracy is an optimizing metric; running time and memory size are satisfying metrics
Accuracy, running time and memory size are all optimizing metrics because you want to do well on all three
Accuracy, running time and memory size are all satisfying metrics because you have to do sufficiently well on all three for your system to be acceptable
Accuracy is a satisfying metric; running time and memory size are optimizing metrics
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Structuring your data
Before implementing your algorithm, you need to split your data into train/dev/test sets. Which of these do you think is the best choice?
Train: 3,333,334; Dev: 3,333,334; Test: 3,333,334
Train: 6,000,000; Dev: 1,000,000; Test: 3,000,000
Train: 6,000,000; Dev: 3,000,000; Test: 1,000,000
Train: 9,500,000; Dev: 250,000; Test: 250,000
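When comparing the options, it can help to look at each split as fractions of the whole dataset and at how many images end up held out for evaluation. The following is a small sketch for that comparison; the helper name `split_fractions` is hypothetical, and the counts are copied from the four options as written above.

```python
def split_fractions(train: int, dev: int, test: int) -> tuple:
    """Return the (train, dev, test) shares of the total, rounded to 4 places."""
    total = train + dev + test
    return tuple(round(n / total, 4) for n in (train, dev, test))


# The four options from the question, as (train, dev, test) image counts.
options = [
    (3_333_334, 3_333_334, 3_333_334),
    (6_000_000, 1_000_000, 3_000_000),
    (6_000_000, 3_000_000, 1_000_000),
    (9_500_000, 250_000, 250_000),
]

for train, dev, test in options:
    fractions = split_fractions(train, dev, test)
    print(fractions, "held-out images:", dev + test)
```

With 10,000,000 images, even a few percent of the data still leaves hundreds of thousands of examples in each of the dev and test sets.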
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the “citizens’ data”. Apparently the citizens of Peacetopia are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm.
Notice that adding this additional data to the training set will make the distribution of the training set different from the distributions of the dev and test sets.
Is the following statement true or false?
"You should not add the citizens' data to the training set, because if the training distribution is different from the dev and test sets, then this will not allow the model to perform well on the test set."
True
False
Answer explanation
Sometimes we'll need to train the model on the data that is available, and its distribution may not be the same as the data that will occur in production. Also, adding training data that differs from the dev set may still help the model improve performance on the dev set. What matters is that the dev and test sets have the same distribution.
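The idea in the explanation above can be sketched as follows: the dev and test sets are drawn only from the security-camera images, while the citizens' images are added to the training set alone. This is a toy illustration under stated assumptions, not the quiz author's procedure. The `build_splits` helper is hypothetical, and the tiny lists of `(source, index)` tuples stand in for real images.

```python
import random


def build_splits(camera_images, citizen_images, dev_size, test_size, seed=0):
    """Dev/test come only from camera images; citizens' data goes to train only."""
    rng = random.Random(seed)
    shuffled = list(camera_images)
    rng.shuffle(shuffled)

    dev = shuffled[:dev_size]
    test = shuffled[dev_size:dev_size + test_size]

    # Remaining camera images plus all citizens' images form the training set.
    train = shuffled[dev_size + test_size:] + list(citizen_images)
    rng.shuffle(train)
    return train, dev, test


# Stand-in data: each item is (source, index) instead of an actual image.
camera = [("camera", i) for i in range(1000)]
citizens = [("citizen", i) for i in range(100)]

train, dev, test = build_splits(camera, citizens, dev_size=25, test_size=25)
print(len(train), len(dev), len(test))
```

Keeping the evaluation sets free of citizens' images means the dev and test sets continue to reflect the security-camera distribution the system will be used on.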
6.
MULTIPLE SELECT QUESTION
45 sec • 1 pt
One member of the City Council knows a little about machine learning, and thinks you should add the 1,000,000 citizens’ data images to the test set. You object because:
This would cause the dev and test set distributions to become different. This is a bad idea because you're not aiming where you want to hit
The test set no longer reflects the distribution of data (security cameras) you most care about
The 1,000,000 citizens' data images do not have the same x-->y mapping as the rest of the data
A bigger test set will slow down the speed of iterating because of the computational expense of evaluating models on the test set
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
If your goal is to have “human-level performance” be a proxy (or estimate) for Bayes error, how would you define “human-level performance”?
The performance of the head of the City Council
The best performance of a specialist (ornithologist) or possibly a group of specialists
The performance of the average citizen of Peacetopia
The performance of their volunteer amateur ornithologists
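Bayes error is the lowest error any classifier could achieve, so every human error rate is at or above it, and the lowest human error rate gives the tightest estimate. The sketch below illustrates that reasoning only. The `bayes_error_proxy` helper and all error rates are hypothetical; the quiz does not give measured values for these groups.

```python
def bayes_error_proxy(human_error_rates: dict) -> float:
    """Use the lowest observed human error as the estimate of Bayes error."""
    return min(human_error_rates.values())


# Hypothetical error rates, invented for illustration only.
human_error_rates = {
    "average citizen": 0.012,
    "volunteer amateur ornithologists": 0.005,
    "single ornithologist": 0.003,
    "group of ornithologists": 0.001,
}

proxy = bayes_error_proxy(human_error_rates)
best_group = min(human_error_rates, key=human_error_rates.get)
print(best_group, proxy)
```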
