Chapter 10 [10.1, 10.2, 10.3]

Mathematics

University

Hard

Created by

Amanda Phillips

15 Slides • 0 Questions

1

Chapter 10

[10.1, 10.2, 10.3]

By Amanda Phillips

2

10.1 The Basic Ingredients for Testing with Categorical Variables

Consider this: How would you determine whether a regular six-sided game die is fair? What would you look for if you suspected that the die was not fair?

3

Testing a Probability Distribution

A six-sided die has 6 different outcomes, which are supposed to be equally likely. If the die is not fair, then some values will come up more often than expected and others less often, so the outcomes are not equally likely.

Of course, we know from experience that you won't always roll exactly the same number of 1s and 2s and 3s and so on. There will be some variation. The question, then, is whether the empirical probability distribution of the die is significantly different from the theoretical probability distribution. We know that hypothesis tests can help us to answer this!
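To get a feel for that natural variation, a short simulation can help. The sketch below rolls a simulated fair die 60 times and tallies the faces. The roll count, the seed, and the function name `tally_rolls` are illustrative choices, not part of the lesson.

```python
import random
from collections import Counter

def tally_rolls(n_rolls, seed=None):
    """Roll a simulated fair six-sided die n_rolls times and count each face."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    # Report every face, including any that never came up.
    return {face: counts.get(face, 0) for face in range(1, 7)}

n_rolls = 60
observed = tally_rolls(n_rolls, seed=1)
expected = n_rolls / 6  # a fair die should show each face about 10 times in 60 rolls

for face, count in observed.items():
    print(f"face {face}: observed {count}, expected {expected:.1f}")
```

Even with a perfectly fair simulated die, the observed counts will rarely all equal 10, which is exactly why a formal test is needed.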

4

Chi-Square Tests

A Chi-Square Test is a hypothesis test that determines whether an observed distribution of counts is significantly different from some expected or claimed distribution. The test uses a chi-square test statistic, which (when the null hypothesis is true and the sample is large enough) approximately follows a chi-square distribution.

Chi-Square tests also help us to determine whether two categorical variables are associated with one another, especially when at least one of those variables is nonbinary. For example, perhaps we wish to know whether economic status (lower, middle, or upper class) is associated with political alignment (left, left-leaning, right-leaning, or right). Notice that we couldn't simply run a proportion test, as there are 3 × 4 = 12 different proportions to consider!

5

Chi-Square Tests

The null hypothesis of the test claims that the categorical variables studied are not associated with one another. The alternative hypothesis claims that they are associated.

The test is based on comparisons between what is expected (theoretically) and what is actually observed. We organize this information in a contingency table (also called a two-way table).

NOTE: The two-way tables we studied at length when we first covered probability were contingency tables.

A contingency table summarizes the observed results of a sample, which we then compare to the expected results.

6

Contingency Tables and Expectations

Suppose you wanted to know if someone's zodiac sign and sociality are associated. We consider 4 categories of zodiac signs (water, air, earth, and fire) against 2 social categories (introverted and extroverted).

Let's do an experiment!​
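As a rough sketch of how such an experiment turns into a contingency table, the code below tallies raw (zodiac element, sociality) responses into a two-way table of observed counts. The responses are made up for illustration, and `build_table` is a hypothetical helper name.

```python
from collections import Counter

# Made-up (element, sociality) responses, for illustration only.
responses = [
    ("water", "introverted"), ("fire", "extroverted"), ("air", "extroverted"),
    ("earth", "introverted"), ("fire", "extroverted"), ("water", "extroverted"),
    ("air", "introverted"), ("earth", "extroverted"), ("fire", "introverted"),
    ("water", "introverted"),
]

ROWS = ["water", "air", "earth", "fire"]
COLS = ["introverted", "extroverted"]

def build_table(pairs):
    """Count how many responses fall in each (row, column) cell."""
    counts = Counter(pairs)
    return {row: {col: counts.get((row, col), 0) for col in COLS} for row in ROWS}

table = build_table(responses)
for row in ROWS:
    print(row, table[row])
```

Each person contributes to exactly one cell, so the cell counts add up to the sample size.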

7

Observed results are collected from the class. Expected counts (which we will display in parentheses) are based on what should be true if the categorical variables of zodiac sign and sociality are not associated.

According to Psychology Today, 50-74% of the population is extroverted! Let's take the midpoint and assume that 62% of the population is extroverted and that the other 38% is introverted. If zodiac sign and sociality are not associated, then these proportions should apply to each of our signs as well. We use this to determine the expected counts: each sign's total multiplied by 0.62 or 0.38.
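The arithmetic behind those expected counts is short enough to sketch. Assuming the 62%/38% split above and some made-up totals for each sign group (the actual class counts are not reproduced here), each expected count is the group total times the assumed proportion.

```python
# Assumed population proportions from the text above.
P_EXTROVERTED = 0.62
P_INTROVERTED = 0.38

# Hypothetical number of students in each sign group (not the real class data).
group_totals = {"water": 10, "air": 5, "earth": 20, "fire": 15}

def expected_counts(totals):
    """Expected introverted/extroverted counts if sign and sociality are unrelated."""
    return {
        sign: {
            "introverted": n * P_INTROVERTED,
            "extroverted": n * P_EXTROVERTED,
        }
        for sign, n in totals.items()
    }

for sign, row in expected_counts(group_totals).items():
    print(f"{sign}: introverted {row['introverted']:.2f}, extroverted {row['extroverted']:.2f}")
```

Expected counts do not need to be whole numbers; they describe a long-run average, not a single sample.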

8

Chi-Square Tests
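The chi-square test statistic is conventionally written as follows, where O is an observed count, E is the matching expected count, and the sum runs over every cell of the table. The symbols are the standard textbook ones and are assumed here.

$$
\chi^2 = \sum \frac{(O - E)^2}{E}
$$

Every term is nonnegative, and the statistic grows as the observed counts move further from the expected counts.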

9

Chi-Square Tests

Once we have a test statistic, we use a Chi-Square distribution to find our p-value. We use the distribution in the same way that we use a normal distribution, finding the probability of obtaining a test statistic as extreme as or more extreme than the one obtained, but you may notice something odd about the Chi-Square distribution!

10

Chi-Square Tests

The Chi-Square distribution has no left tail. This is because a chi-square test statistic can't be negative. So this is always a right-tailed test!

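We also have degrees of freedom, similar to those used in the t-distribution. For a test on a single categorical variable, the degrees of freedom is one less than the number of categories. For a test on a contingency table whose expected counts come from the table's own row and column totals, it is (number of rows - 1) × (number of columns - 1).

In our experiment, the table has 4 rows and 2 columns (8 cells), so that formula gives (4 - 1) × (2 - 1) = 3 degrees of freedom.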

11

Chi-Square Tests

Once we have a p-value, we compare it to our significance level and determine whether to reject the null hypothesis or fail to do so.

If the p-value is less than the significance level, reject the null hypothesis: We have sufficient evidence to conclude that the variables are associated.

If the p-value is not less than the significance level, fail to reject the null hypothesis: We have insufficient evidence to conclude that the variables are associated.​
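The decision rule above can be sketched in a few lines. The helper `chi2_right_tail` is a hypothetical name for a pure-Python right-tail probability, computed from a series for the incomplete gamma function. The test statistic, degrees of freedom, and 0.05 significance level are placeholders, not results from the class experiment.

```python
import math

def chi2_right_tail(x, df):
    """P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0  # the statistic is never negative, so all the area lies to the right
    a, z = df / 2.0, x / 2.0
    # Series expansion of the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def decide(statistic, df, alpha=0.05):
    """Return the p-value and the conclusion for a right-tailed chi-square test."""
    p_value = chi2_right_tail(statistic, df)
    if p_value < alpha:
        return p_value, "reject the null hypothesis"
    return p_value, "fail to reject the null hypothesis"

# Placeholder inputs, for illustration only.
p, conclusion = decide(statistic=9.2, df=3)
print(f"p-value = {p:.4f}: {conclusion}")
```

In practice a statistics package or a chi-square table would supply the right-tail probability; the hand-written helper is only there to keep the sketch self-contained.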

12

10.3 Chi-Square Tests for Associations Between Categorical Variables

The Chi-Square tests we've just discussed fall into one of two groups, depending on how the data were collected.

A Test of Independence uses a single random sample from which responses for both categorical variables are collected. (This is what we did with our zodiac experiment.)

A Test of Homogeneity uses distinct, independent samples (one sample for each category of the grouping variable) from which a response for only the other variable is collected. (This would mean I collected 4 random samples, one each of water signs, air signs, earth signs, and fire signs, and asked each participant whether they identified as introverted or extroverted.)

13

10.2 The Chi-Square Test for Goodness of Fit

In the case of the fair or unfair six-sided die, we do not have two categorical variables to study. We have only one categorical variable, the result of a die roll, with six categories (1-6). We no longer use a contingency table, but we are still able to compare observed values against expected values.

In this case, the chi-square test measures how well an observed frequency distribution (or probability distribution) fits an expected distribution.

14

Goodness of Fit Test

The null hypothesis of a goodness of fit test is simply that a categorical variable fits a particular distribution (for example, a uniform one).

The alternative hypothesis is simply that the variable does not follow that distribution.​

For the die:

H0: The die is fair (die rolls are uniformly distributed)

HA: The die is not fair (die rolls are not uniformly distributed)
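For the die, the goodness of fit calculation might look like the sketch below. The roll counts are invented for illustration. The 11.07 cutoff is the standard 5% critical value for a chi-square distribution with 5 degrees of freedom (6 faces minus 1); comparing the statistic to that cutoff is used here in place of a p-value and is equivalent to asking whether the p-value is below 0.05.

```python
# Invented counts for 60 rolls, faces 1 through 6 (not real data).
observed = [8, 12, 9, 11, 6, 14]

n_rolls = sum(observed)
expected = [n_rolls / 6] * 6  # under H0 every face is equally likely

# Chi-square statistic: sum of (observed - expected)^2 / expected over the faces.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_5_PERCENT_DF5 = 11.07  # table value for df = 5, alpha = 0.05

print(f"chi-square = {statistic:.2f} with df = {df}")
if statistic > CRITICAL_5_PERCENT_DF5:
    print("Reject H0: the rolls are not consistent with a fair die.")
else:
    print("Fail to reject H0: insufficient evidence that the die is unfair.")
```

Swapping in real roll counts is the only change needed to test an actual die, provided every expected count is reasonably large.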

15

Goodness of Fit Test
