Structuring Machine Learning Projects
Quiz • Computers • University • Practice Problem • Hard
Trump Florence
23 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
This example is adapted from a real production application, but with details disguised to protect confidentiality.
You are a famous researcher in the City of Peacetopia. The people of Peacetopia have a common characteristic: they are afraid of birds. To save them, you have to build an algorithm that will detect any bird flying over Peacetopia and alert the population.
The City Council gives you a dataset of 10,000,000 images of the sky above Peacetopia, taken from the city’s security cameras. They are labeled:
y = 0: There is no bird in the image
y = 1: There is a bird in the image
Your goal is to build an algorithm able to classify new images taken by security cameras from Peacetopia.
There are a lot of decisions to make:
What is the evaluation metric?
How do you structure your data into train/dev/test sets?
Metric of success
The City Council tells you that they want an algorithm that:
Has high accuracy.
Runs quickly and takes only a short time to classify a new image.
Can fit in a small amount of memory, so that it can run in a small processor that the city will attach to many different security cameras.
You meet with them and ask for just one evaluation metric. True/False?
True
False
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
The city revises its criteria to:
"We need an algorithm that can let us know a bird is flying over Peacetopia as accurately as possible."
"We want the trained model to take no more than 10 sec to classify a new image."
"We want the model to fit in 10MB of memory."
Given models with different accuracies, runtimes, and memory sizes, how would you choose one?
Create one metric by combining the three metrics and choose the best performing model
Find the subset of models that meet the runtime and memory criteria. Then, choose the highest accuracy
Take the model with the smallest runtime because that will provide the most overhead to increase accuracy
Accuracy is an optimizing metric, therefore the most accurate model is the best choice
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Based on the city’s requests, which of the following would you say is true?
Accuracy is an optimizing metric; running time and memory size are satisficing metrics
Accuracy, running time and memory size are all optimizing metrics because you want to do well on all three
Accuracy, running time and memory size are all satisficing metrics because you have to do sufficiently well on all three for your system to be acceptable
Accuracy is a satisficing metric; running time and memory size are optimizing metrics
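The distinction between a metric you maximize and metrics that only need to clear a threshold can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the `Candidate` class, the `pick_model` helper, and the three candidate models with their numbers are all hypothetical. Only the 10 sec and 10MB limits come from the city's revised criteria.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float     # the metric to maximize
    runtime_sec: float  # thresholded: must not exceed the runtime limit
    memory_mb: float    # thresholded: must not exceed the memory limit


def pick_model(candidates, max_runtime_sec=10.0, max_memory_mb=10.0):
    """Return the most accurate candidate that meets both limits, or None."""
    feasible = [
        c for c in candidates
        if c.runtime_sec <= max_runtime_sec and c.memory_mb <= max_memory_mb
    ]
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Hypothetical candidates, invented purely for illustration.
candidates = [
    Candidate("A", accuracy=0.97, runtime_sec=1.0, memory_mb=3.0),
    Candidate("B", accuracy=0.99, runtime_sec=13.0, memory_mb=9.0),  # too slow
    Candidate("C", accuracy=0.98, runtime_sec=9.0, memory_mb=9.0),
]

best = pick_model(candidates)
print(best.name if best else "no model meets the limits")
```

In this sketch, model B has the highest accuracy but is filtered out because it exceeds the runtime limit, so the comparison on accuracy happens only among the models that satisfy both thresholds.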
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Structuring your data
Before implementing your algorithm, you need to split your data into train/dev/test sets. Which of these do you think is the best choice?
Train: 3,333,334
Dev: 3,333,334
Test: 3,333,334
Train: 6,000,000
Dev: 1,000,000
Test: 3,000,000
Train: 6,000,000
Dev: 3,000,000
Test: 1,000,000
Train: 9,500,000
Dev: 250,000
Test: 250,000
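To compare the four options, it can help to convert the counts into percentages. The short Python sketch below does only that arithmetic; the `fractions` helper and the option labels are names introduced here for illustration. With a dataset of this size, even a few percent for dev and test still leaves hundreds of thousands of images in each, which is one reason large-data projects often assign most of the data to training.

```python
# The four candidate splits from the question, as (train, dev, test) counts.
splits = {
    "option 1": (3_333_334, 3_333_334, 3_333_334),
    "option 2": (6_000_000, 1_000_000, 3_000_000),
    "option 3": (6_000_000, 3_000_000, 1_000_000),
    "option 4": (9_500_000, 250_000, 250_000),
}


def fractions(counts):
    """Return each count as a percentage of the total, rounded to 0.1."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


for name, counts in splits.items():
    train_pct, dev_pct, test_pct = fractions(counts)
    print(f"{name}: train {train_pct}%, dev {dev_pct}%, test {test_pct}%")
```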
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the “citizens’ data”. Apparently the citizens of Peacetopia are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm.
Notice that adding this additional data to the training set will make the distribution of the training set different from the distributions of the dev and test sets.
Is the following statement true or false?
"You should not add the citizens' data to the training set, because if the training distribution is different from the dev and test sets, then this will not allow the model to perform well on the test set."
True
False
Answer explanation
Sometimes we'll need to train the model on the data that is available, and its distribution may not be the same as the data that will occur in production. Also, adding training data that differs from the dev set may still help the model improve performance on the dev set. What matters is that the dev and test sets have the same distribution.
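One way to picture the explanation above is a split routine that draws the dev and test sets only from the security-camera images and adds the citizens' images to the training set alone. The following Python sketch assumes this setup; `build_splits` is a hypothetical helper, and the tiny tagged tuples stand in for real images.

```python
import random


def build_splits(camera_items, citizen_items, n_dev, n_test, seed=0):
    """Dev and test come only from camera data; citizen data goes to train."""
    rng = random.Random(seed)
    shuffled = list(camera_items)
    rng.shuffle(shuffled)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]

    # Remaining camera items plus all citizen items form the training set.
    train = shuffled[n_dev + n_test:] + list(citizen_items)
    rng.shuffle(train)
    return train, dev, test


# Small stand-in datasets; each item is tagged with its source.
camera = [("camera", i) for i in range(1000)]
citizen = [("citizen", i) for i in range(100)]

train, dev, test = build_splits(camera, citizen, n_dev=25, n_test=25)
print(len(train), len(dev), len(test))
```

Because dev and test are sampled from the same camera pool, they share a distribution, while the training set is allowed to differ from both.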
6.
MULTIPLE SELECT QUESTION
45 sec • 1 pt
One member of the City Council knows a little about machine learning, and thinks you should add the 1,000,000 citizens’ data images to the test set. You object because:
This would cause the dev and test set distributions to become different. This is a bad idea because you're not aiming where you want to hit
The test set no longer reflects the distribution of data (security cameras) you most care about
The 1,000,000 citizens' data images do not have the same x-->y mapping as the rest of the data
A bigger test set will slow down the speed of iterating because of the computational expense of evaluating models on the test set
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
If your goal is to have “human-level performance” be a proxy (or estimate) for Bayes error, how would you define “human-level performance”?
The performance of the head of the City Council
The best performance of a specialist (ornithologist) or possibly a group of specialists
The performance of the average citizen of Peacetopia
The performance of their volunteer amateur ornithologists
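Since Bayes error is the lowest error achievable, a proxy for it is usually taken from the lowest error observed among the human groups considered. The Python sketch below illustrates that idea under stated assumptions: every error rate in it is a hypothetical number invented for illustration, and `bayes_error_proxy` and `avoidable_bias` are helper names introduced here, not part of the quiz.

```python
def bayes_error_proxy(human_errors):
    """Use the lowest observed human error as the estimate of Bayes error."""
    return min(human_errors.values())


def avoidable_bias(train_error, human_errors):
    """Gap between training error and the Bayes error proxy."""
    return train_error - bayes_error_proxy(human_errors)


# Hypothetical error rates, invented purely for illustration.
human_errors = {
    "head of the City Council": 0.050,
    "average citizen": 0.012,
    "volunteer amateur ornithologists": 0.005,
    "group of specialist ornithologists": 0.001,
}
train_error = 0.020  # hypothetical training error of the model

proxy = bayes_error_proxy(human_errors)
print(f"Bayes error proxy: {proxy:.3f}")
print(f"Avoidable bias:    {avoidable_bias(train_error, human_errors):.3f}")
```

Choosing a weaker group as the reference would raise the proxy and make the model appear closer to the achievable limit than it actually is.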
