Chapter 5 [5.3]

Assessment

Presentation

Mathematics

University

Medium

CCSS
HSS.CP.B.6, HSS.CP.A.3, HSS.CP.A.5

Standards-aligned

Created by

Amanda Phillips

18 Slides • 7 Questions

1

Chapter 5

[5.3]

STAT 109 MSU SPRING 2022​

2

5.3 Associations in Categorical Variables

In some circumstances, we wish to know the probability of an event among differing groups in a sample or population. For example, we might compare the probability that a high school graduate makes over $100k with the probability that a college graduate makes over $100k.

Studying probabilities in this way can help us investigate whether two variables are associated or independent. In this example, does a person's level of education change their probability of achieving a six-figure salary?

To calculate these probabilities, we impose conditions on the event.​

3

Conditional Probabilities

- The condition we impose on a probability calculation is an event we assume has occurred, or a characteristic we assume holds. The calculation proceeds as if this condition is met.

- The probability P(A|B) denotes the probability that event A occurs given that event B has occurred (or given that condition B is met)​.

4

The only way to ensure that a condition is met is to exclude from the sample all outcomes for which the condition is not met. This fundamentally changes the formula we use to perform the calculation.​
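The exclusion step can be sketched in a few lines of Python. The sample below is hypothetical (it is not the set of shapes shown in the slides), and the helper name `conditional_probability` is an assumption made for illustration. The idea is to keep only the outcomes that meet the condition and then count within that smaller group, so that P(A|B) = (number of outcomes in both A and B) / (number of outcomes in B).

```python
from fractions import Fraction

# Hypothetical sample of (shape, color) outcomes; not the image from the slides.
sample = [
    ("circle", "red"), ("circle", "blue"), ("square", "red"),
    ("square", "red"), ("triangle", "blue"), ("circle", "red"),
]

def conditional_probability(outcomes, event, condition):
    """Return P(event | condition) for equally likely outcomes."""
    # Exclude every outcome for which the condition is not met.
    kept = [o for o in outcomes if condition(o)]
    if not kept:
        raise ValueError("the condition is never met in this sample")
    # Count the event only among the outcomes that remain.
    favorable = [o for o in kept if event(o)]
    return Fraction(len(favorable), len(kept))

def is_circle(outcome):
    return outcome[0] == "circle"

def is_red(outcome):
    return outcome[1] == "red"

p_circle_given_red = conditional_probability(sample, is_circle, is_red)
p_red_given_circle = conditional_probability(sample, is_red, is_circle)
print(p_circle_given_red)  # 1/2
print(p_red_given_circle)  # 2/3
```

Swapping the event and the condition changes which outcomes are excluded, so P(circle | red) and P(red | circle) generally differ.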

5

Multiple Choice

Question image

Suppose the set of shapes is mixed up and you select one at random. What is the probability that you select a circle, given that it is red?

P(circle | red) = ?

1

4/12

2

2/12

3

2/7

4

2/5

6

Multiple Choice

Question image

Suppose the set of shapes is mixed up and you select one at random. What is the probability that you select a red shape, given that it is a circle?

P(red | circle) = ?

1

5/12

2

4/12

3

2/5

4

2/4

7

In those cases where we do not know what the sample of data looks like, we can use the following formula to calculate conditional probabilities.​

P(A|B) = P(A AND B) / P(B)

You would use this formula in problems where the probabilities are given to you and you cannot calculate them from the data yourself.
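As a small illustration of working from given probabilities, the sketch below assumes the standard definition P(A|B) = P(A AND B) / P(B). The numbers are invented for the example, and the function name `conditional_from_probabilities` is hypothetical.

```python
from fractions import Fraction

def conditional_from_probabilities(p_a_and_b, p_b):
    """Return P(A | B) = P(A AND B) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    return p_a_and_b / p_b

# Invented values for illustration: P(A AND B) = 3/20 and P(B) = 2/5.
p_a_given_b = conditional_from_probabilities(Fraction(3, 20), Fraction(2, 5))
print(p_a_given_b)  # 3/8
```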

9

Independent and Associated Variables

Consider two categorical variables and a relevant characteristic of each. If the probability that one characteristic appears in the general population differs significantly from the probability that it appears among those having the second characteristic, then the two categorical variables are associated (though not necessarily by cause and effect).

For example, if women are less likely to be left-handed than people in general, then a person's dominant hand is associated with their gender.

10

Consider a normal deck of playing cards (52 cards, half red and half black, with 4 suits...). Are the categorical variables color and suit associated or independent?

P(red) = 26/52 = 1/2

P(diamond) = 13/52 = 1/4

P(diamond | red) = 13/26 = 1/2

P(red | diamond) = 13/13 = 1

These categorical variables are associated, since P(diamond | red) ≠ P(diamond) and P(red | diamond) ≠ P(red).
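The card calculation can be reproduced by building the deck and counting. This is a minimal sketch; the tuple layout and the helper name `probability` are choices made for this example only.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colors = {"hearts": "red", "diamonds": "red",
               "clubs": "black", "spades": "black"}

# Each card is a (rank, suit, color) tuple; 13 ranks x 4 suits = 52 cards.
deck = [(rank, suit, color)
        for rank in ranks
        for suit, color in suit_colors.items()]

def probability(cards, event):
    """Fraction of equally likely cards for which the event holds."""
    return Fraction(sum(1 for card in cards if event(card)), len(cards))

def is_diamond(card):
    return card[1] == "diamonds"

# Conditioning on "red" means keeping only the red cards.
red_cards = [card for card in deck if card[2] == "red"]

p_diamond = probability(deck, is_diamond)
p_diamond_given_red = probability(red_cards, is_diamond)
print(p_diamond, p_diamond_given_red)  # 1/4 1/2

verdict = "associated" if p_diamond != p_diamond_given_red else "independent"
print(verdict)  # associated
```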

11

Multiple Choice

Are the categorical variables "royalty" (whether a card is a face card or not) and color associated or independent?

Hint: What are P(face card) and P(face card | red)?

1

Associated

2

Independent

12

Multiple Choice

Imagine you randomly select one card from a normal deck. Are the events "drawing a 9" and "drawing a red card" independent or associated?

Hint: What are P(9) and P(9 | red)?

1

Associated

2

Independent

13

Multiple Choice

According to the 2010 U.S. Census, the rate of incarceration among non-Latinx white people is 0.0045 and the rate of incarceration among black people is 0.023.

What does this tell you about the categorical variables "race" and "incarceration"?

1

Race and incarceration are associated variables because black people are more likely to commit crimes than white people.

2

Race and incarceration are associated variables because black people are disproportionately affected by an inequitable justice system.

3

Race and incarceration are associated variables because the likelihood that a randomly selected black person has been incarcerated is significantly different than the likelihood that a randomly selected white person has been incarcerated.

4

Race and incarceration are independent variables. Your race has nothing to do with whether or not you commit crimes or whether or not you are arrested.

14

Events in Sequence

Sometimes a probability experiment consists of two or more probability experiments performed in sequence. For example, flipping a coin twice, or flipping a coin and then rolling a die.

​- The sample space now includes all combinations of an outcome from the first sample space and an outcome of the second sample space (and so on if there are more events in the sequence).

- Probability calculations can still be done using the original probability formula.

- We can use AND and OR probability formulas as well.​
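For the coin-then-die example, the combined sample space can be listed with `itertools.product` from the Python standard library. The event chosen below (heads followed by an even roll) is only an illustration and does not come from the slides.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Every pairing of a coin outcome with a die outcome.
sample_space = list(product(coin, die))
print(len(sample_space))  # 12

# Original probability formula: favorable outcomes / total outcomes.
favorable = [(c, d) for c, d in sample_space if c == "H" and d % 2 == 0]
p_heads_then_even = Fraction(len(favorable), len(sample_space))
print(p_heads_then_even)  # 1/4
```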

15

What does the sample space look like?

You can use tables to organize the outcomes in a probability experiment involving two phases.


16

What does the sample space look like?

You can use trees to represent the outcomes in a probability experiment involving more than two phases, like flipping a coin three times.

First flip → second flip → third flip = outcome
H → H → H = HHH
H → H → T = HHT
H → T → H = HTH
H → T → T = HTT
T → H → H = THH
T → H → T = THT
T → T → H = TTH
T → T → T = TTT

17

Independent and Dependent Events

When performing a probability experiment involving several actions in sequence, we consider whether each action may have an effect on the next, and whether the resultant events can be considered independent.

Consider one such probability experiment where event A occurs and then event B occurs.

- The events are independent if P(B) = P(B|A).

​- The events are dependent if P(B) ≠ P(B|A).

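The comparison of P(B) with P(B|A) can be checked by brute force on small sample spaces. The sketch below uses two illustrative experiments that are assumptions of this example rather than part of the slides: two coin flips, and two cards drawn without replacement (with cards 0 to 3 standing in for the aces). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

def prob(outcomes, event):
    """Fraction of equally likely outcomes for which the event holds."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_independent(outcomes, a, b):
    """Return True when P(B) equals P(B | A)."""
    given_a = [o for o in outcomes if a(o)]  # keep only outcomes where A occurred
    return prob(outcomes, b) == prob(given_a, b)

# Experiment 1: flip a coin twice.
flips = list(product("HT", repeat=2))
coins_independent = is_independent(
    flips,
    lambda o: o[0] == "H",   # A: first flip is heads
    lambda o: o[1] == "H",   # B: second flip is heads
)

# Experiment 2: draw two cards without replacement (ordered pairs of distinct cards).
draws = list(permutations(range(52), 2))
cards_independent = is_independent(
    draws,
    lambda o: o[0] < 4,      # A: first card is an ace
    lambda o: o[1] < 4,      # B: second card is an ace
)

print(coins_independent)  # True
print(cards_independent)  # False
```

In the card experiment P(B) = 4/52 while P(B|A) = 3/51, which is why the check reports dependence.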
18

Multiple Choice

Consider a probability experiment in which you roll a die twice. Are the events "rolling a one" and then "rolling a two" independent or dependent events?

1

Independent

2

Dependent

19

Multiple Choice

Consider a probability experiment in which you draw a marble from a bag of red and green marbles and then draw another marble without replacing the first. Are the events "red marble" and then "green marble" independent or dependent events?

1

Independent

2

Dependent

20

Probabilities of Events in Sequence: AND

For events in sequence, "AND" does not necessarily refer to a single event having two characteristics (like a person who is a man and over 40). Rather, it means that one specified event occurs and then another specified event occurs. We still use "AND" in the notation.

For example, suppose you draw two marbles from a bag and wish to calculate the probability that the first marble is blue AND THEN the second marble is red. We would write this as P(blue and red).

​- P(A AND B) = P(A)⋅P(B|A)

- We use the multiplicative rule for "AND," but calculate the probability that the second event occurs under the assumption that the ​first event did occur.​

22

Consider a bag containing 4 red marbles and 6 blue ones. If you draw two marbles from the bag without replacement, what is the probability that you select a blue marble twice in a row?
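One way to work this example is to apply P(A AND B) = P(A)⋅P(B|A) directly and then confirm the result by listing every ordered pair of draws. The variable names are chosen for this sketch only.

```python
from fractions import Fraction
from itertools import permutations

red, blue = 4, 6
total = red + blue

# P(first blue) uses the full bag; P(second blue | first blue) uses
# a bag with one blue marble already removed.
p_first_blue = Fraction(blue, total)                         # 6/10
p_second_blue_given_first = Fraction(blue - 1, total - 1)    # 5/9
p_blue_then_blue = p_first_blue * p_second_blue_given_first
print(p_blue_then_blue)  # 1/3

# Check by enumeration: ordered pairs of two distinct marbles.
bag = ["R"] * red + ["B"] * blue
pairs = list(permutations(range(total), 2))
both_blue = [(i, j) for i, j in pairs if bag[i] == "B" and bag[j] == "B"]
p_by_counting = Fraction(len(both_blue), len(pairs))
print(p_by_counting)  # 1/3
```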

23

Probabilities of Events in Sequence: OR

You most likely won't see the word "OR" used for events in sequence. Instead, you'll see "at least one". We have to think about what a question is asking to see that it may be an "OR" probability to which the additive rule applies.

For example, if you flip a coin twice, what is the probability that at least one flip shows heads? Here we are looking for the probability that the first flip shows heads OR the second flip shows heads OR both show heads. We still use the rule:

- P(A OR B) = P(A)+P(B)-P(A AND B)

- This is messy with dependent events, so it's best to use the sample space and the basic probability formula for those cases.

- There are additional formulas for more complicated calculations, which we'll see in Ch. 6.​

24

Suppose you flip a coin twice. What is the probability that at least one flip results in heads?

We need P(heads first or heads second)​ = P(heads first) + P(heads second) - P(heads first and heads second) = 1/2 + 1/2 - 1/4 = 3/4.

We can also use the sample space to see that 3 out of 4 outcomes have at least one heads.​
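Both approaches can be compared in a short sketch: the additive rule on one side and a direct count over the four outcomes on the other. The helper name `prob` is an assumption of this example.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips: HH, HT, TH, TT.
flips = list(product("HT", repeat=2))

def prob(event):
    """Fraction of the four equally likely outcomes satisfying the event."""
    return Fraction(sum(1 for o in flips if event(o)), len(flips))

p_first = prob(lambda o: o[0] == "H")
p_second = prob(lambda o: o[1] == "H")
p_both = prob(lambda o: o[0] == "H" and o[1] == "H")

# Additive rule: P(A OR B) = P(A) + P(B) - P(A AND B).
p_additive = p_first + p_second - p_both

# Direct count of outcomes with at least one heads.
p_counted = prob(lambda o: "H" in o)

print(p_additive, p_counted)  # 3/4 3/4
```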

25

Suppose you flip a coin seven times. What is the probability that at least one flip results in heads?

In this case, the sample space is so large (2⋅2⋅2⋅2⋅2⋅2⋅2 = 128 outcomes) that it would be inefficient to determine the probability that way. Using the additive formula also becomes very messy, trust me. Sometimes we use some logic instead.

We can calculate the probability that NONE of the coin flips results in heads using the "AND" rule: P(T, T, T, T, T, T, T) = (1/2)⋅(1/2)⋅(1/2)⋅(1/2)⋅(1/2)⋅(1/2)⋅(1/2) = 1/128.

Since "at least one heads" is the mathematical opposite, or complement, of "all tails," the probability that we flip at least one heads is 1 - 1/128 = 127/128.
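The complement shortcut is easy to check numerically. The sketch below assumes a fair coin, as in the example; the function name `p_at_least_one_heads` is hypothetical, and the enumeration at the end is included only to confirm the shortcut.

```python
from fractions import Fraction
from itertools import product

def p_at_least_one_heads(n_flips):
    """Return 1 - P(all tails) for n_flips fair coin flips."""
    p_all_tails = Fraction(1, 2) ** n_flips   # "AND" rule for independent flips
    return 1 - p_all_tails

p_seven = p_at_least_one_heads(7)
print(p_seven)  # 127/128

# Confirm by listing all 128 outcomes and counting those with a heads.
outcomes = list(product("HT", repeat=7))
p_counted = Fraction(sum(1 for o in outcomes if "H" in o), len(outcomes))
print(p_counted)  # 127/128
```

The same function gives 3/4 for two flips, matching the earlier two-flip example.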
