
Contingency Tables
Joseph Anderson
Chapter 10
[10.1, 10.2, 10.3]
By Amanda Phillips
10.1 The Basic Ingredients for Testing with Categorical Variables
Consider this: How would you determine if a regular six-sided game die was fair? What would you look for if you suspected that the die was not fair?
Testing a Probability Distribution
On that die, there are 6 different outcomes, which are supposed to be equally likely. If the die is not fair, then some values will come up more often than expected, and others will come up less often than expected. Then, the outcomes are not equally likely.
Of course, we know from experience that you won't always roll exactly the same number of 1s and 2s and 3s and so on. There will be some variation. The question, then, is whether the empirical probability distribution of the die is significantly different from the theoretical probability distribution. We know that hypothesis tests can help us to answer this!
Chi-Square Tests
A Chi-Square Test is a hypothesis test that determines whether an observed distribution is significantly different from some expected or claimed distribution. The test uses a chi-square test statistic, which approximately follows a chi-square distribution when the null hypothesis is true.
Chi-Square tests help us to determine if two categorical variables are associated with one another, especially where at least one of those variables is nonbinary. For example, perhaps we wish to know whether economic status (low, middle, or upper class) is associated with political alignment (left, left-leaning, right-leaning, or right). Notice that we couldn't simply run a proportion test, as there are 3 × 4 = 12 different proportions to consider!
The null hypothesis of the test claims that the categorical variables studied are not associated with one another. The alternative hypothesis claims that they are associated.
The test is based on comparisons between what is expected (theoretically) and what is actually observed. We organize this information in a contingency table (also called a two-way table).
NOTE: The two-way tables we studied at length when we first covered probability were contingency tables.
A contingency table is used to summarize the observed results of a sample, which we will compare to the expected results of a sample.
Contingency Tables and Expectations
Suppose you wanted to know if someone's zodiac sign and sociality are associated. We consider 4 categories of zodiac signs (water, air, earth, and fire) against 2 social categories (introverted and extroverted).
Let's do an experiment!
Observed results are collected from the class. Expected counts (which we will display in parentheses) are based on what should be true if the categorical variables of zodiac sign and sociality are not associated.
According to "Psychology Today," 50-74% of the population is extroverted! Let's play it safe and assume that 62% of the population is extroverted (the midpoint of that range) and that the other 38% is introverted. If zodiac sign and sociality are not associated, then these proportions should apply to each of our signs as well. We use this to determine the expected counts: for each sign, multiply the number of people with that sign by 0.62 for extroverted and by 0.38 for introverted.
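Here is a minimal sketch of that bookkeeping in Python. The observed counts are hypothetical (they are not the class's data), and the function names are made up for illustration. It applies the assumed 38% / 62% split to each sign's total, then adds up (observed - expected)² / expected over all eight cells, which is the usual chi-square test statistic.

```python
# Hypothetical observed counts per sign: (introverted, extroverted).
# These numbers are invented for illustration; they are NOT real class data.
observed = {
    "water": (5, 5),
    "air": (3, 7),
    "earth": (4, 6),
    "fire": (2, 8),
}

# Assumed population proportions from the slide (38% / 62%).
P_INTROVERTED, P_EXTROVERTED = 0.38, 0.62


def expected_counts(observed, p_intro=P_INTROVERTED, p_extro=P_EXTROVERTED):
    """Expected (introverted, extroverted) counts for each sign if sign
    and sociality are not associated."""
    expected = {}
    for sign, counts in observed.items():
        n = sum(counts)  # number of people with this sign
        expected[sign] = (n * p_intro, n * p_extro)
    return expected


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    total = 0.0
    for sign in observed:
        for o, e in zip(observed[sign], expected[sign]):
            total += (o - e) ** 2 / e
    return total


expected = expected_counts(observed)
statistic = chi_square_statistic(observed, expected)

for sign in observed:
    print(sign, "observed:", observed[sign], "expected:", expected[sign])
print("chi-square statistic:", round(statistic, 3))
```

The further the observed counts drift from the expected ones, the larger the statistic gets.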
Chi-Square Tests
Once we have a test statistic, we use a Chi-Square distribution to find our p-value. We use the distribution in the same way that we use a normal distribution, finding the probability of obtaining a test statistic as extreme or more extreme than the one obtained, but you may notice something odd about the Chi-Square distribution!
The Chi-Square distribution has no left tail. This is because a chi-square test statistic can't be negative. So this is always a right-tailed test!
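We also have degrees of freedom, similar to those used in the t-distribution. For a goodness-of-fit test on one categorical variable, the degrees of freedom is one less than the number of categories. For a two-way table whose expected counts come from the row and column totals, it is (number of rows - 1) × (number of columns - 1).
In our experiment, there are 4 × 2 = 8 cells, so a standard test of independence has (4 - 1) × (2 - 1) = 3 degrees of freedom.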
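To make the right-tailed p-value concrete, here is a rough sketch that uses only Python's standard library. The function name `chi2_right_tail` is made up for illustration, and it only handles whole-number degrees of freedom. The statistic 2.275 and the value df = 3 in the usage line are assumptions of this sketch (3 is the (rows - 1) × (columns - 1) count for a 4 × 2 table), not results from the class experiment.

```python
import math


def chi2_right_tail(statistic, df):
    """Right-tail probability P(X >= statistic) for a chi-square
    distribution with a positive whole number of degrees of freedom."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    if statistic <= 0:
        return 1.0  # there is no left tail, so everything lies to the right

    half = statistic / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        tail, k = math.exp(-half), 2
    else:
        tail, k = math.erfc(math.sqrt(half)), 1
    # Going from k to k + 2 degrees of freedom adds one more term.
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


# Illustrative values only (hypothetical statistic, assumed df).
p_value = chi2_right_tail(2.275, 3)
print(f"p-value = {p_value:.3f}")
```

In practice a statistics package or a chi-square table does this step for you; the point here is only that the p-value is the area to the right of the test statistic.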
Once we have a p-value, we compare it to our significance level and determine whether to reject the null hypothesis or fail to do so.
If the p-value is less than the significance level, reject the null hypothesis: We have sufficient evidence to conclude that the variables are associated.
If the p-value is not less than the significance level, fail to reject the null hypothesis: We have insufficient evidence to conclude that the variables are associated.
10.3 Chi-Square Tests for Associations Between Categorical Variables
The Chi-Square tests we've just discussed fall into one of two groups, depending on how the data was collected.
A Test of Independence uses a single random sample from which responses for both categorical variables are collected. (This is what we did with our zodiac experiment).
A Test of Homogeneity uses distinct, independent samples (one sample for each category of the grouping variable) from which only a single response is collected. (This would mean I collected 4 random samples, one each of water signs, air signs, earth signs, and fire signs, and asked each participant whether they identified as introverted or extroverted).
10.2 The Chi-Square Test for Goodness of Fit
In the case of the fair or unfair six-sided die, we do not have two categorical variables to study. We have only one categorical variable, the result of a die roll (1-6). We no longer use a contingency table, but we are still able to compare observed values against expected values.
In this case, the chi-square test measures how well a frequency distribution (or probability distribution) fits an expected distribution.
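Written out, the usual goodness-of-fit statistic looks like this. The symbols are notation chosen for this sketch, not taken from the slides: k is the number of categories, O_i is the observed count in category i, n is the total number of observations, and p_i is the probability the expected distribution gives to category i.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n \, p_i
```

Every term is a square divided by a positive expected count, so the statistic is never negative, and it is 0 only when every observed count matches its expected count exactly.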
Goodness of Fit Test
The null hypothesis of a goodness of fit test is simply that a categorical variable fits a particular distribution (for example, a uniform one).
The alternative hypothesis is simply that the variable does not follow that distribution.
For the die: H0: The die is fair (die rolls are uniformly distributed)
HA: The die is not fair (die rolls are not uniformly distributed)
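As a quick sketch of the die test, suppose we rolled the die 60 times. The tallies below are hypothetical, not real data. With 6 categories there are 6 - 1 = 5 degrees of freedom, and the code compares the statistic to the tabulated 5% critical value for 5 degrees of freedom (about 11.07), which is equivalent to checking whether the p-value is below 0.05.

```python
# Hypothetical tallies for faces 1 through 6 from 60 rolls; NOT real data.
observed = [8, 12, 9, 11, 6, 14]

n_rolls = sum(observed)
# Under H0 (fair die), every face is expected n_rolls / 6 times.
expected = [n_rolls / 6] * 6

# Chi-square statistic: sum of (observed - expected)^2 / expected.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: one less than the number of categories.
df = len(observed) - 1

# Tabulated critical value for df = 5 at the 5% significance level.
CRITICAL_VALUE_5_PERCENT = 11.07

reject_null = statistic > CRITICAL_VALUE_5_PERCENT

print("chi-square statistic:", round(statistic, 2), "with df =", df)
if reject_null:
    print("Reject H0: evidence that the die is not fair.")
else:
    print("Fail to reject H0: insufficient evidence that the die is not fair.")
```

With these made-up tallies the statistic is well below the critical value, so we would fail to reject the null hypothesis.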