Structuring Machine Learning Projects

Quiz • Computers • University • Hard
Trump Florence
23 questions
1.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
This example is adapted from a real production application, but with details disguised to protect confidentiality.
You are a famous researcher in the City of Peacetopia. The people of Peacetopia have a common characteristic: they are afraid of birds. To save them, you have to build an algorithm that will detect any bird flying over Peacetopia and alert the population.
The City Council gives you a dataset of 10,000,000 images of the sky above Peacetopia, taken from the city’s security cameras. They are labeled:
y = 0: There is no bird in the image
y = 1: There is a bird in the image
Your goal is to build an algorithm able to classify new images taken by security cameras from Peacetopia.
There are a lot of decisions to make:
What is the evaluation metric?
How do you structure your data into train/dev/test sets?
Metric of success
The City Council tells you that they want an algorithm that:
Has high accuracy.
Runs quickly and takes only a short time to classify a new image.
Can fit in a small amount of memory, so that it can run in a small processor that the city will attach to many different security cameras.
You meet with them and ask for just one evaluation metric. True/False?
True
False
2.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
The city revises its criteria to:
"We need an algorithm that can let us know a bird is flying over Peacetopia as accurately as possible."
"We want the trained model to take no more than 10 sec to classify a new image."
"We want the model to fit in 10MB of memory."
Given models with different accuracies, runtimes, and memory sizes, how would you choose one?
Create one metric by combining the three metrics and choose the best performing model
Find the subset of models that meet the runtime and memory criteria. Then, choose the highest accuracy
Take the model with the smallest runtime because that will provide the most overhead to increase accuracy
Accuracy is an optimizing metric, therefore the most accurate model is the best choice
3.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Based on the city’s requests, which of the following would you say is true?
Accuracy is an optimizing metric; running time and memory size are satisficing metrics
Accuracy, running time and memory size are all optimizing metrics because you want to do well on all three
Accuracy, running time and memory size are all satisficing metrics because you have to do sufficiently well on all three for your system to be acceptable
Accuracy is a satisficing metric; running time and memory size are optimizing metrics
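One way to read the city's revised criteria is as two thresholds (runtime and memory) plus one quantity to maximize (accuracy). The sketch below assumes that reading. The `Candidate` class, the `pick_model` function, and every number in it are hypothetical and serve only to illustrate the filter-then-rank pattern.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    accuracy: float      # fraction of dev-set images classified correctly
    runtime_sec: float   # time to classify one new image
    memory_mb: float     # memory needed by the trained model


def pick_model(candidates: List[Candidate],
               max_runtime_sec: float = 10.0,
               max_memory_mb: float = 10.0) -> Optional[Candidate]:
    """Keep models that meet both thresholds, then return the most accurate."""
    feasible = [c for c in candidates
                if c.runtime_sec <= max_runtime_sec
                and c.memory_mb <= max_memory_mb]
    if not feasible:
        return None  # no model meets the runtime and memory limits
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidates, invented for illustration only.
candidates = [
    Candidate("A", accuracy=0.970, runtime_sec=1.0, memory_mb=3.0),
    Candidate("B", accuracy=0.990, runtime_sec=13.0, memory_mb=9.0),   # too slow
    Candidate("C", accuracy=0.980, runtime_sec=9.0, memory_mb=9.0),
    Candidate("D", accuracy=0.995, runtime_sec=3.0, memory_mb=12.0),   # too large
]

best = pick_model(candidates)
print(best.name if best else "no feasible model")
```

In this made-up example, the two most accurate models are excluded because each misses one threshold, so the choice falls to the most accurate of the remaining models.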
4.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
Structuring your data
Before implementing your algorithm, you need to split your data into train/dev/test sets. Which of these do you think is the best choice?
Train: 3,333,334; Dev: 3,333,334; Test: 3,333,334
Train: 6,000,000; Dev: 1,000,000; Test: 3,000,000
Train: 6,000,000; Dev: 3,000,000; Test: 1,000,000
Train: 9,500,000; Dev: 250,000; Test: 250,000
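The raw counts are easier to compare as proportions. The short sketch below converts each listed option into percentages of its own total. The function name `split_percentages` is an assumption made for this illustration, and the counts are copied from the options as given.

```python
# Split sizes copied from the four options: (train, dev, test).
options = [
    (3_333_334, 3_333_334, 3_333_334),
    (6_000_000, 1_000_000, 3_000_000),
    (6_000_000, 3_000_000, 1_000_000),
    (9_500_000, 250_000, 250_000),
]


def split_percentages(train: int, dev: int, test: int) -> tuple:
    """Return each split as a percentage of the option's total, to 1 decimal."""
    total = train + dev + test
    return tuple(round(100 * n / total, 1) for n in (train, dev, test))


for train, dev, test in options:
    pct = split_percentages(train, dev, test)
    print(f"train {pct[0]}% / dev {pct[1]}% / test {pct[2]}%")
```

Note that with a dataset of this size, even a 2.5% share still amounts to 250,000 images.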
5.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the “citizens’ data”. Apparently the citizens of Peacetopia are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm.
Notice that adding this additional data to the training set will make the distribution of the training set different from the distributions of the dev and test sets.
Is the following statement true or false?
"You should not add the citizens' data to the training set, because if the training distribution is different from the dev and test sets, then this will not allow the model to perform well on the test set."
True
False
Answer explanation
Sometimes we'll need to train the model on the data that is available, and its distribution may not be the same as the data that will occur in production. Also, adding training data that differs from the dev set may still help the model improve performance on the dev set. What matters is that the dev and test sets have the same distribution.
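The arrangement described in this explanation can be sketched as follows: dev and test sets are drawn only from the security-camera data, while the citizens' data is added to the training set alone. This is a minimal illustration under stated assumptions. `build_splits` is a hypothetical helper, and the tiny tagged tuples stand in for real images.

```python
import random


def build_splits(camera_examples, citizen_examples, dev_size, test_size, seed=0):
    """Draw dev/test from camera data only; add citizens' data to train only."""
    rng = random.Random(seed)
    shuffled = list(camera_examples)
    rng.shuffle(shuffled)

    dev = shuffled[:dev_size]
    test = shuffled[dev_size:dev_size + test_size]

    # The remaining camera examples plus all citizens' examples form the training set.
    train = shuffled[dev_size + test_size:] + list(citizen_examples)
    rng.shuffle(train)
    return train, dev, test


# Toy stand-ins for images: (source, index). Sizes are invented for illustration.
camera = [("camera", i) for i in range(1000)]
citizen = [("citizen", i) for i in range(100)]

train, dev, test = build_splits(camera, citizen, dev_size=25, test_size=25)
print(len(train), len(dev), len(test))
```

Because dev and test both come from the camera data, they share one distribution, even though the training set now mixes two sources.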
6.
MULTIPLE SELECT QUESTION
45 sec • 1 pt
One member of the City Council knows a little about machine learning, and thinks you should add the 1,000,000 citizens’ data images to the test set. You object because:
This would cause the dev and test set distributions to become different. This is a bad idea because you're not aiming where you want to hit
The test set no longer reflects the distribution of data (security cameras) you most care about
The 1,000,000 citizens' data images do not have the same x-->y mapping as the rest of the data
A bigger test set will slow down the speed of iterating because of the computational expense of evaluating models on the test set
7.
MULTIPLE CHOICE QUESTION
30 sec • 1 pt
If your goal is to have “human-level performance” be a proxy (or estimate) for Bayes error, how would you define “human-level performance”?
The performance of the head of the City Council
The best performance of a specialist (ornithologist) or possibly a group of specialists
The performance of the average citizen of Peacetopia
The performance of their volunteer amateur ornithologists
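Bayes error is the lowest error any classifier could achieve, so every measured human error rate is an upper bound on it, and the lowest measured rate is the tightest of those bounds. The sketch below applies that reasoning to invented numbers. The error rates and group labels are hypothetical and are not taken from the quiz.

```python
# Hypothetical error rates for different groups of human labelers.
human_error = {
    "average citizen": 0.012,
    "volunteer amateur ornithologists": 0.005,
    "single ornithologist": 0.003,
    "team of ornithologists": 0.001,
}

# Each rate is an upper bound on Bayes error; the smallest one is the tightest.
best_source = min(human_error, key=human_error.get)
bayes_proxy = human_error[best_source]

print(f"Proxy for Bayes error: {bayes_proxy} (from {best_source})")
```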