Structuring Machine Learning Projects

Assessment

Quiz

Computers

University

Hard

Created by

Trump Florence


23 questions


1.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt


This example is adapted from a real production application, but with details disguised to protect confidentiality.

You are a famous researcher in the City of Peacetopia. The people of Peacetopia have a common characteristic: they are afraid of birds. To save them, you have to build an algorithm that will detect any bird flying over Peacetopia and alert the population.

The City Council gives you a dataset of 10,000,000 images of the sky above Peacetopia, taken from the city’s security cameras. They are labeled:

  • y = 0: There is no bird in the image

  • y = 1: There is a bird in the image

Your goal is to build an algorithm able to classify new images taken by security cameras from Peacetopia.

There are a lot of decisions to make:

  • What is the evaluation metric?

  • How do you structure your data into train/dev/test sets?

Metric of success

The City Council tells you that they want an algorithm that:

  1. Has high accuracy.

  2. Runs quickly and takes only a short time to classify a new image.

  3. Can fit in a small amount of memory, so that it can run on a small processor that the city will attach to many different security cameras.

You meet with them and ask for just one evaluation metric. True/False?

True

False

2.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

The city revises its criteria to:

  • "We need an algorithm that can let us know a bird is flying over Peacetopia as accurately as possible."

  • "We want the trained model to take no more than 10 sec to classify a new image."

  • "We want the model to fit in 10MB of memory."

Given models with different accuracies, runtimes, and memory sizes, how would you choose one?

Create one metric by combining the three metrics and choose the best performing model

Find the subset of models that meet the runtime and memory criteria. Then, choose the highest accuracy

Take the model with the smallest runtime because that will provide the most overhead to increase accuracy

Accuracy is an optimizing metric, therefore the most accurate model is the best choice

3.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Based on the city’s requests, which of the following would you say is true?

Accuracy is an optimizing metric; running time and memory size are satisficing metrics

Accuracy, running time and memory size are all optimizing metrics because you want to do well on all three

Accuracy, running time and memory size are all satisficing metrics because you have to do sufficiently well on all three for your system to be acceptable

Accuracy is a satisficing metric; running time and memory size are optimizing metrics
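The difference between a metric you maximize and metrics that only need to clear a threshold can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the 10 sec and 10MB limits come from the city's revised criteria, while the `Candidate` class, the `select_model` helper, and the three candidate models with their numbers are hypothetical and invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float     # fraction correct on the dev set (the metric to maximize)
    runtime_sec: float  # time to classify one image (threshold metric)
    memory_mb: float    # memory footprint of the model (threshold metric)


MAX_RUNTIME_SEC = 10.0  # from the city's revised criteria
MAX_MEMORY_MB = 10.0    # from the city's revised criteria


def select_model(candidates):
    """Return the most accurate candidate that meets both thresholds, or None."""
    feasible = [
        c for c in candidates
        if c.runtime_sec <= MAX_RUNTIME_SEC and c.memory_mb <= MAX_MEMORY_MB
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidates, invented for illustration only.
models = [
    Candidate("A", accuracy=0.97, runtime_sec=1.0, memory_mb=3.0),
    Candidate("B", accuracy=0.99, runtime_sec=13.0, memory_mb=9.0),
    Candidate("C", accuracy=0.98, runtime_sec=9.0, memory_mb=9.0),
]

best = select_model(models)
print(best.name)  # prints "C": B is more accurate but exceeds the runtime limit
```

Note that model A's much lower runtime earns it nothing here: once a threshold is met, only accuracy is compared.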

4.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

Structuring your data

Before implementing your algorithm, you need to split your data into train/dev/test sets. Which of these do you think is the best choice?

Train: 3,333,334; Dev: 3,333,334; Test: 3,333,334

Train: 6,000,000; Dev: 1,000,000; Test: 3,000,000

Train: 6,000,000; Dev: 3,000,000; Test: 1,000,000

Train: 9,500,000; Dev: 250,000; Test: 250,000
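To compare the options above, it can help to convert split fractions into absolute example counts for the 10,000,000-image dataset. The sketch below does only that arithmetic; the `split_sizes` helper and its parameter names are hypothetical, not part of the quiz.

```python
def split_sizes(total, train_frac, dev_frac):
    """Return (train, dev, test) example counts; test receives the remainder."""
    train = round(total * train_frac)
    dev = round(total * dev_frac)
    test = total - train - dev
    return train, dev, test


TOTAL_IMAGES = 10_000_000  # dataset size given by the City Council

# A 95% / 2.5% / 2.5% split still leaves 250,000 images each for dev and test.
print(split_sizes(TOTAL_IMAGES, train_frac=0.95, dev_frac=0.025))

# A 60% / 10% / 30% split moves 3,500,000 images out of the training set.
print(split_sizes(TOTAL_IMAGES, train_frac=0.60, dev_frac=0.10))
```

With a dataset this large, even a small percentage corresponds to hundreds of thousands of examples, which is worth keeping in mind when weighing the options.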

5.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the “citizens’ data”. Apparently the citizens of Peacetopia are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm.

Notice that adding this additional data to the training set will make the distribution of the training set different from the distributions of the dev and test sets.

Is the following statement true or false?

"You should not add the citizens' data to the training set, because if the training distribution is different from the dev and test sets, then this will not allow the model to perform well on the test set."

True

False

Answer explanation

Sometimes we need to train the model on the data that is available, even if its distribution differs from the data the model will see in production. Adding training data that differs from the dev set may still help the model improve performance on the dev set. What matters is that the dev and test sets share the same distribution.
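One way to act on this explanation is to draw the dev and test sets only from the security-camera images and send the citizens' data to the training set alone. The following is a minimal sketch under that assumption; the `build_splits` helper, the tuple-based example records, and the tiny dataset sizes are hypothetical stand-ins for the real images.

```python
import random


def build_splits(camera_items, citizen_items, dev_size, test_size, seed=0):
    """Dev and test come only from camera data; citizen data joins training only."""
    rng = random.Random(seed)
    shuffled = list(camera_items)
    rng.shuffle(shuffled)

    dev = shuffled[:dev_size]
    test = shuffled[dev_size:dev_size + test_size]

    # The remaining camera data plus all citizen data form the training set.
    train = shuffled[dev_size + test_size:] + list(citizen_items)
    rng.shuffle(train)
    return train, dev, test


# Tiny hypothetical stand-ins: each record is (source, index).
camera = [("camera", i) for i in range(1000)]
citizen = [("citizen", i) for i in range(100)]

train, dev, test = build_splits(camera, citizen, dev_size=25, test_size=25)
print(len(train), len(dev), len(test))  # prints "1050 25 25"
```

The training distribution now differs from dev and test, but dev and test still match each other and the data the system is meant to handle.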

6.

MULTIPLE SELECT QUESTION

45 sec • 1 pt

One member of the City Council knows a little about machine learning, and thinks you should add the 1,000,000 citizens’ data images to the test set. You object because:

This would cause the dev and test set distributions to become different. This is a bad idea because you're not aiming where you want to hit

The test set no longer reflects the distribution of data (security cameras) you most care about

The 1,000,000 citizens' data images do not have the same x→y mapping as the rest of the data

A bigger test set will slow down the speed of iterating because of the computational expense of evaluating models on the test set

7.

MULTIPLE CHOICE QUESTION

30 sec • 1 pt

If your goal is to have “human-level performance” be a proxy (or estimate) for Bayes error, how would you define “human-level performance”?

The performance of the head of the City Council

The best performance of a specialist (ornithologist) or possibly a group of specialists

The performance of the average citizen of Peacetopia

The performance of their volunteer amateur ornithologists
