Which of the following statements is true about relational databases?
- Relational databases work well with big data.
- Duplicate data is not allowed in a relational database.
- Each row in a relational database has a unique identifier.
- Relational databases primarily use partitions and distributed data storage.

Each row in a relational database has a unique identifier.
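
As a small illustration of the unique-identifier idea, the sketch below uses Python's built-in `sqlite3` module. The `students` table, its columns, and the sample names are assumptions made for this example and are not part of the flashcards.

```python
import sqlite3

# In-memory database; the "students" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")

# Two rows can hold the same name because each row has its own id.
conn.execute("INSERT INTO students (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO students (id, name) VALUES (2, 'Ana')")

# Reusing an existing id violates the primary key, so the insert is rejected.
try:
    conn.execute("INSERT INTO students (id, name) VALUES (1, 'Ben')")
    duplicate_id_rejected = False
except sqlite3.IntegrityError:
    duplicate_id_rejected = True

rows = conn.execute("SELECT id, name FROM students ORDER BY id").fetchall()
print(rows)
print(duplicate_id_rejected)
```

The repeated name is accepted while the repeated `id` is not, which is why the "duplicate data is not allowed" option is false and the "unique identifier" option is true.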

When would a NoSQL database be a better choice than a relational database?

For a large shopping site like Amazon

Which of the following best describes why many images are needed in a machine learning data set?
- Many images are not actually necessary for a machine learning data set
- Using many images will improve the accuracy of the model’s prediction
- Using many images will always eliminate algorithmic bias
- Using many images will make the model use less energy

Using many images will improve the accuracy of the model’s prediction

Which of the following statements about machine learning is true?
- A machine learning model requires the programmer to tell the computer which features to look for
- A machine learning model can only be made one time, and cannot be updated as new data comes in
- A machine learning model can free up workers to spend time on more complex tasks
- All of these statements are true

A machine learning model can free up workers to spend time on more complex tasks

Which of the following is the best example of underfit?
- Blueberries, raspberries, and cherries being labeled separately instead of all being labeled as “fruit”
- Frogs and dogs being mislabeled as the same thing because they both end in “ogs”
- A dress being mislabeled as a skirt
- Carrots, oranges, and basketballs all being labeled as a pumpkin

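Carrots, oranges, and basketballs all being labeled as a pumpkin. An underfit model is too general: it relies on too few features (here, roughly "round and orange") and lumps different things together. Labeling blueberries, raspberries, and cherries separately instead of as “fruit” is the opposite problem, a model that is too specific.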

Which of the following is true about why data scientists use data visualizations?
- Data visualizations are useful for data scientists while analyzing data, but not in the final storytelling
- Data visualizations are effective, but only for a small data set
- Data visualizations help people understand patterns in data
- All of these statements are true

Data visualizations help people understand patterns in data

What is a data scientist’s primary objective?

To help others better understand real-world phenomena found in data

Which of the following activities is NOT typically performed by a data scientist?
- Cleaning data
- Creating visualizations
- Maintaining computer infrastructure, such as the cloud
- Communicating with clients and storytelling

Maintaining computer infrastructure, such as the cloud

Which of the following is the main cause of machine learning algorithmic bias?
- The machine learning data set unintentionally contains bias that is already present in the community
- A company purposefully wants to discriminate against a group of people
- Computers that can think for themselves will actively try to harm humans
- Machine learning algorithms are immune to bias

The machine learning data set unintentionally contains bias that is already present in the community

CSF 7.8 Module 7 Test Review

Flashcard • Computers • 9th - 12th Grade • Practice Problem • Hard

Wayground Content

22 questions

FLASHCARD QUESTION

Front

Which of the following can be introduced accidentally during data selection?
- Bias
- Ambiguity
- Storytelling
- Correlation

Back

Bias

Answer explanation

Bias can be introduced accidentally during data selection if the selected data is not representative of the entire population or if certain factors are weighted more heavily than others. The results are then skewed and do not reflect reality.
Storytelling is the way data is presented and interpreted after it has been selected and analyzed. Personal biases can distort a story, but storytelling itself is not something that enters a data set during selection.
Correlation is a measured relationship between two variables. A poorly chosen sample can produce a spurious correlation, but in that case the underlying problem is still bias in the selection.
Ambiguity refers to unclear meaning or multiple possible interpretations of the data. It is more likely to arise from issues in data collection or measurement than from data selection.
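
To make the idea of a non-representative selection concrete, here is a minimal sketch in Python. The survey topic, the age groups, and all of the numbers are made up for illustration.

```python
from statistics import mean

# Hypothetical survey: hours of screen time per day, by age group.
population = {
    "teens": [6, 7, 8, 7],
    "adults": [3, 4, 3, 4],
    "seniors": [1, 2, 2, 1],
}

# Representative view: every group is included.
everyone = [hours for group in population.values() for hours in group]
overall_mean = mean(everyone)

# Biased selection: only teens were surveyed.
teens_only_mean = mean(population["teens"])

print(overall_mean, teens_only_mean)
```

Nobody set out to distort the result here. Surveying only one group was enough to push the average well above the value for the whole population.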

FLASHCARD QUESTION

Front

Which of the following tasks would make data storytelling less accurate?
- Be clear and concise
- Address a specific customer’s problem
- Provide only one viewpoint
- Provide background information about the problem being addressed

Back

Provide only one viewpoint

Answer explanation

Providing only one viewpoint would make data storytelling less accurate.
While being clear and concise, addressing a specific customer's problem, and providing background information about the problem being addressed can all improve the accuracy of data storytelling, presenting only one viewpoint can skew the story and limit the audience's understanding of the full picture.
It's important to present a balanced view of the data, including any conflicting or alternative viewpoints, so that the audience can make informed decisions based on the information presented.

FLASHCARD QUESTION

Front

How can you determine if a data set contains outliers by looking at a boxplot?

Back

The outliers will be marked by a circle beyond the whiskers on a boxplot.

Answer explanation

The outliers will be marked by a circle beyond the whiskers on a boxplot.
A boxplot is a graphical representation of the distribution of a dataset that provides information about the median, quartiles, and range of the data. It also indicates the presence of any outliers in the data.
Outliers are marked as individual points beyond the whiskers, which are the lines that extend from the box. The whiskers typically extend to the most extreme data points that lie within 1.5 times the interquartile range (IQR) of the box, where the IQR is the distance between the first and third quartiles. Any data points that fall beyond the whiskers are considered outliers and are marked as individual circles on the plot.
Therefore, to determine if a data set contains outliers by looking at a boxplot, you can visually inspect the plot and look for any circles beyond the whiskers.
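
The 1.5 × IQR rule described above can also be checked with a few lines of Python. This is a sketch under stated assumptions: the data values are made up, and quartiles are computed with `statistics.quantiles` using the `"inclusive"` method. Other tools may use slightly different quartile conventions and so place the whiskers slightly differently.

```python
from statistics import quantiles

# Hypothetical data set with one suspiciously large value.
data = [2, 4, 5, 6, 7, 8, 9, 30]

# First quartile, median, and third quartile.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Points beyond these fences would be drawn as circles on a boxplot.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(outliers)  # [30]
```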

FLASHCARD QUESTION

Front

Which of the following summary statistics can help determine the center of the data?
- Median
- Lowest value
- Highest value
- Correlation coefficient

Back

Median

Answer explanation

The median can help determine the center of the data.
The median is a measure of central tendency that represents the middle value in a dataset when the values are arranged in order. Half of the data points are greater than the median, and half are less than the median. Therefore, the median is a useful summary statistic for determining the center of a dataset.
The lowest and highest values are not measures of central tendency and do not provide information about the center of the data.
Correlation coefficient is a measure of the strength and direction of the linear relationship between two variables and is not related to determining the center of the data.
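
A short Python sketch of the same point, using the standard `statistics` module. The values are hypothetical and include one extreme number to show why the lowest and highest values say little about the center.

```python
from statistics import mean, median

# Hypothetical data with one extreme value.
values = [3, 5, 7, 9, 100]

center = median(values)   # middle value of the sorted list
average = mean(values)    # pulled upward by the extreme value
lowest, highest = min(values), max(values)

print(center, average, lowest, highest)
```

The median stays at the middle value, while the highest value describes only the edge of the data.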

FLASHCARD QUESTION

Front

Which of the following data errors would affect the accuracy of a predictive model?
- Duplicate entries
- Misspellings
- Missing values
- All of these options

Back

All of these options

Answer explanation

All of these options would affect the accuracy of a predictive model.
Missing values, misspellings, and duplicate entries are all common types of data errors that can have a significant impact on the accuracy of predictive models.
Missing values can introduce bias and reduce the precision of the model, while misspellings can lead to incorrect matches and inaccurate predictions. Duplicate entries can also cause problems, as they may be counted multiple times and skew the results.
To build an accurate predictive model, it is important to ensure that the data is clean, complete, and error-free. This includes checking for missing values, correcting misspellings, and removing duplicate entries before training the model.

FLASHCARD QUESTION

Front

Which of the following is true?
- Data cleaning is the very last thing a data scientist does with data
- Data cleaning is not necessary for most data sets
- Data cleaning has to be done by hand
- Data cleaning takes a lot of time but can be made faster by using a programming language

Back

Data cleaning takes a lot of time but can be made faster by using a programming language

Answer explanation

Data cleaning takes a lot of time but can be made faster by using a programming language.
Data cleaning, a key part of data preprocessing, is an essential step in the data analysis process that involves identifying and correcting errors, inconsistencies, and missing values in a dataset. It is typically done using a programming language because it is time-consuming and repetitive to do manually.
While some aspects of data cleaning, such as identifying errors or missing values, may require manual intervention, most data cleaning tasks can be automated using programming languages and libraries such as Python and R.
Data cleaning is a crucial step in preparing data for analysis and modeling, and it should be done before any other analysis is performed.
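
As a rough sketch of how a programming language can speed up cleaning, the Python example below handles the three error types mentioned in this set: duplicate entries, misspellings, and missing values. The records, the `corrections` lookup table, and the `clean` function are all hypothetical. Dropping rows with missing values is only one possible choice; filling them in is another.

```python
# Hypothetical raw records: (city, temperature). None marks a missing value.
raw = [
    ("Miami", 30),
    ("Miami", 30),    # duplicate entry
    ("Maimi", 31),    # misspelling
    ("Tampa", None),  # missing value
    ("Tampa", 26),
]

# Known misspellings mapped to the correct spelling (an assumed lookup table).
corrections = {"Maimi": "Miami"}


def clean(records):
    """Return records with spellings fixed, missing rows and duplicates dropped."""
    seen = set()
    cleaned = []
    for city, temp in records:
        city = corrections.get(city, city)  # fix known misspellings
        if temp is None:                    # drop rows with missing values
            continue
        if (city, temp) in seen:            # drop exact duplicates
            continue
        seen.add((city, temp))
        cleaned.append((city, temp))
    return cleaned


cleaned = clean(raw)
print(cleaned)
```

Once written, the same function can be rerun on thousands of rows, which is where the time savings over cleaning by hand come from.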
