Reliability, Validity, Utility

Assessment • Professional Development • Practice Problem • Easy

Created by Niña Montebon

141 Slides • 57 Questions

2

Multiple Choice

Which statement about reliability is TRUE?

A) A test must be highly reliable to be valid.
B) A test can be valid even if it has low reliability.
C) A test can be reliable but not valid.
D) Reliability and validity are the same concept.

4

Multiple Choice

Which of the following is NOT a definition of reliability?

A) Reliability refers to the consistency of test scores obtained by the same persons when they are re-examined with the same test on different occasions, or with different sets of equivalent items, or under varying examining conditions.

B) Reliability is the extent to which a score or measure is free from measurement error. Theoretically, reliability is the ratio of true score variance to observed score variance.

C) Reliability refers to the consistency in measurement; the extent to which measurements differ from occasion to occasion as a function of measurement error.

D) Reliability is the degree to which a test measures what it intends to measure across different conditions and populations.
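
The ratio definition in option B can be illustrated with a short simulation; a minimal Python sketch, with hypothetical distribution parameters:

```python
import numpy as np

# Illustration of reliability as true-score variance over observed-score
# variance. All distribution parameters here are hypothetical.
rng = np.random.default_rng(0)
n = 10_000

true_scores = rng.normal(loc=100, scale=15, size=n)  # latent true scores T
error = rng.normal(loc=0, scale=5, size=n)           # random measurement error E
observed = true_scores + error                       # observed scores X = T + E

# Theoretical value: 15**2 / (15**2 + 5**2) = 0.90
reliability = true_scores.var() / observed.var()
```

With a large sample the estimated ratio lands close to the theoretical 0.90; shrinking the error standard deviation pushes it toward 1.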

6

Multiple Choice

In psychological testing, what does the term "error" primarily refer to?
A) Mistakes made by test administrators
B) Random inaccuracies inherent in measurement
C) Poorly designed test items
D) Differences in individual intelligence levels

8

Multiple Choice

Which variance should be higher in psychological measurement for a test to be considered reliable?
A) Error variance
B) Environmental variance
C) True variance
D) Test-taker variance

10

Multiple Choice

Which of the following best defines measurement error?
A) A systematic flaw in a test’s design
B) The difference between an individual’s true score and observed score
C) The extent to which a test measures what it claims to measure
D) A type of error caused by examiner-related variables

11

Multiple Choice

Which factor is LEAST likely to introduce random error in psychological testing?
A) Fatigue of the test taker
B) Variations in test administration
C) A misprinted question in the test
D) Test-takers’ varying levels of motivation

13

Multiple Choice

Item sampling or content sampling is considered a source of error variance because:
A) Test-takers may perform differently depending on which items are selected for the test.
B) It is impossible to create test items that measure the same construct.
C) Errors in scoring contribute more significantly to reliability issues than item sampling.
D) Test-takers are always equally familiar with all test items.

18

Multiple Choice

A test-taker's score is composed of a true score and an error component. What equation best represents this concept?
A) X = T - E
B) X = T / E
C) X = T + E
D) X = E - T
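
The classical true-score model behind the correct equation can be checked numerically; a minimal sketch (all parameters hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.normal(50, 10, size=50_000)  # true scores
E = rng.normal(0, 4, size=50_000)    # random error, independent of T
X = T + E                            # the classical model: X = T + E

# When error is independent of the true score, observed variance splits
# into true variance plus error variance: close to 10**2 + 4**2 = 116.
observed_variance = X.var()
```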

21

Multiple Choice

Which of the following best defines the Domain Sampling Model in measurement theory?
A) The concept that a test score is based on a limited sample of items from a larger domain.
B) The idea that a test should include all possible items from a domain to be valid.
C) The assumption that all test items should be identical to maintain reliability.
D) The belief that measurement error can be completely eliminated with enough test items.

26

Multiple Choice

In the Domain Sampling Model, which factor primarily affects the reliability of a test?
A) The total number of test-takers who complete the assessment.
B) The selection of items representing the broader domain of content.
C) The specific order in which test items are presented.
D) The speed at which test-takers complete the assessment.

46

Multiple Choice

Which type of reliability measures the consistency of scores obtained by the same individuals when tested at different points in time?
A) Internal consistency reliability
B) Parallel forms reliability
C) Inter-rater reliability
D) Test-retest reliability

47

Multiple Choice

A researcher develops two versions of a psychological test to ensure the consistency of results across different test forms. Which type of reliability is being assessed?
A) Internal consistency reliability
B) Inter-rater reliability
C) Parallel forms reliability
D) Test-retest reliability

48

Multiple Choice

A psychologist splits a test into two halves and measures the correlation between them to determine reliability. What is this method called?
A) Parallel forms reliability
B) Test-retest reliability
C) Inter-rater reliability
D) Split-half reliability

49

Multiple Choice

What does internal consistency reliability primarily measure?
A) The degree to which test scores remain stable over time
B) The extent to which test items measure the same underlying construct
C) The similarity of results obtained from two equivalent forms of a test
D) The consistency of scores given by different raters

50

Multiple Choice

A student takes a personality test twice, six weeks apart, and receives significantly different scores. What does this suggest about the test?
A) It has low test-retest reliability
B) It has strong parallel forms reliability
C) It has high internal consistency
D) It has excellent inter-rater reliability

51

Multiple Choice

Which statistical method is most commonly used to estimate internal consistency reliability?
A) Pearson correlation coefficient
B) Spearman’s rank correlation
C) Cronbach’s alpha
D) Cohen’s kappa

52

Multiple Choice

A researcher wants to assess the reliability of scores given by multiple raters. Which statistic should be used?
A) Cronbach’s alpha
B) Cohen’s kappa
C) Spearman-Brown formula
D) Standard error of measurement
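
Cohen's kappa corrects raw agreement for agreement expected by chance; a minimal sketch with hypothetical ratings from two raters:

```python
from collections import Counter

# Hypothetical categorical ratings from two raters on the same 10 cases.
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # raw agreement

# Chance agreement from each rater's marginal category proportions.
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)

# Kappa: agreement beyond chance, scaled by the maximum possible improvement.
kappa = (observed - expected) / (1 - expected)
```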

53

Multiple Choice

A test developer wants to determine how much the reliability of a test would change if its length were increased. Which statistical formula should be applied?
A) Spearman-Brown formula
B) Standard error of measurement
C) Cohen’s kappa
D) Intraclass correlation coefficient
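
The Spearman-Brown prophecy formula itself is compact; a minimal sketch:

```python
def spearman_brown(r: float, length_factor: float) -> float:
    """Predicted reliability if test length is multiplied by length_factor."""
    return (length_factor * r) / (1 + (length_factor - 1) * r)

# Doubling a test with reliability .70 raises the estimate to about .82;
# halving the same test drops it to about .54.
doubled = spearman_brown(0.70, 2)
halved = spearman_brown(0.70, 0.5)
```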

54

Multiple Choice

Which statistical method is used to estimate the internal consistency of a test with dichotomous (right/wrong) items?
A) Cronbach’s alpha
B) Cohen’s kappa
C) KR-20
D) Intraclass correlation coefficient
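
KR-20 is the dichotomous-item special case of coefficient alpha; a minimal sketch with a hypothetical 0/1 answer matrix:

```python
import numpy as np

# Hypothetical right/wrong (1/0) answers: rows = examinees, columns = items.
answers = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0],
])

k = answers.shape[1]
p = answers.mean(axis=0)   # proportion correct per item
q = 1 - p                  # p * q is the variance of a 0/1 item

# Population variance of total scores, to match the population item variances.
total_var = answers.sum(axis=1).var()

# KR-20: same form as alpha, with sum(p*q) in place of summed item variances.
kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_var)
```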

55

Multiple Choice

When would the Spearman-Brown formula be more appropriate to use than the KR-20 formula?
A) When estimating how reliability would change if a test were shortened or lengthened.
B) When measuring the internal consistency of a test with dichotomous items.
C) When evaluating the inter-rater reliability of a scoring system.
D) When determining the impact of random error on test performance.

68

Multiple Choice

Which type of reliability is best assessed using test-retest procedures?
A) Internal consistency reliability
B) Inter-scorer reliability
C) Stability reliability
D) Criterion-referenced reliability

69

Multiple Choice

Which measure of reliability assesses the degree to which test items measure the same construct?
A) Test-retest reliability
B) Inter-rater reliability
C) Internal consistency reliability
D) Stability reliability

70

Multiple Choice

If a researcher wants to measure internal consistency for a test with items scored in different ways (e.g., Likert scale), which statistic is most appropriate?
A) Spearman-Brown formula
B) Coefficient Alpha (Cronbach’s Alpha)
C) Kuder-Richardson 20 (KR-20)
D) Inter-scorer reliability coefficient

71

Multiple Choice

How does the nature of a psychological trait influence reliability estimates?
A) Dynamic traits tend to yield lower reliability due to their variability over time.
B) Static traits are always measured with lower reliability than dynamic traits.
C) Traits that change frequently are more reliable in longitudinal studies.
D) Reliability estimates remain the same regardless of whether a trait is dynamic or static.

72

Multiple Choice

Why do criterion-referenced tests sometimes yield lower reliability coefficients than norm-referenced tests?
A) Criterion-referenced tests aim to classify test-takers rather than differentiate their scores widely.
B) Criterion-referenced tests use items with equal difficulty levels, leading to unstable scores.
C) Criterion-referenced tests are not designed to measure the same construct across different groups.
D) Criterion-referenced tests always contain more measurement error than norm-referenced tests.

73

Multiple Choice

When evaluating the reliability of a test, which reliability coefficient is considered acceptable for high-stakes decisions such as hiring or clinical diagnosis?
A) 0.50 or higher
B) 0.60 or higher
C) 0.70 or higher
D) 0.90 or higher

74

Multiple Choice

What is the best course of action if a test has a low reliability coefficient?
A) Increase the number of items measuring the same construct.
B) Reduce the number of items to avoid redundancy.
C) Assume the test is valid despite the low reliability.
D) Use the test only for research purposes without interpreting individual scores.

77

Open Ended

When do we say that a test is valid?

78

Open Ended

Is a valid test also reliable?

86

Multiple Choice

Which type of validity is most concerned with whether a test covers the relevant subject matter?
A) Construct validity
B) Criterion-related validity
C) Content validity
D) Face validity

87

Multiple Choice

What distinguishes construct validity from other forms of validity?
A) It focuses on whether the test looks appropriate to test-takers.
B) It evaluates the relationship between test scores and future performance.
C) It assesses whether the test fairly represents all groups taking it.
D) It involves a comprehensive analysis of how test scores fit into a theoretical framework.

88

Multiple Choice

Why is construct validity often referred to as "umbrella validity"?
A) It covers all other forms of validity by integrating multiple types of evidence.
B) It is the easiest type of validity to establish statistically.
C) It applies to tests measuring abstract psychological traits.
D) It ensures a test remains valid regardless of the population being tested.

89

Multiple Choice

When validating a test for hiring decisions, which type of validity is most relevant?
A) Face validity
B) Criterion-related validity
C) Content validity
D) Internal consistency validity

90

Multiple Choice

If a test measures a psychological trait that changes over time, what impact might this have on its validity?
A) It may have low reliability but strong validity.
B) Its validity may be limited to specific timeframes or conditions.
C) The test’s validity will not be affected as long as it measures consistently.
D) The validity of the test will automatically increase over time.

91

Multiple Choice

A test developer is responsible for which aspect of validity?
A) Providing evidence to support the test’s validity in the test manual
B) Ensuring that all test-takers achieve similar scores
C) Making sure the test is widely used before proving its validity
D) Guaranteeing the test remains valid for all populations and purposes

117

Multiple Choice

How is content validity typically assessed?
A) By comparing test scores with an external criterion
B) By computing a validity coefficient
C) By expert judgment and systematic examination of test content
D) By conducting factor analysis on the test items

118

Multiple Choice

When a new test correlates moderately with an already validated test measuring the same construct, this is an example of:

A. Discriminant validity
B. Predictive validity
C. Convergent validity
D. Reliability

119

Multiple Choice

If subscales within a test do not correlate well with the total score, this suggests issues with:

A. Evidence from distinct groups
B. Homogeneity
C. Convergent validity
D. Discriminant validity

120

Multiple Choice

A spelling test that only assesses the ability to recognize misspelled words but is used to claim that students have strong overall spelling skills lacks:
A) Construct validity
B) Predictive validity
C) Content validity
D) Incremental validity

121

Multiple Choice

Which of the following is NOT an example of a psychological construct?

A. Self-esteem
B. Job satisfaction
C. Blood pressure
D. Leadership ability

122

Multiple Choice

Which of the following best illustrates discriminant validity?

A. A self-esteem test correlates highly with an established self-worth test.
B. A new leadership ability test correlates moderately with a validated leadership test.
C. A personality test produces consistent results over time.
D. A test measuring anxiety shows no significant correlation with a test measuring extraversion.

123

Multiple Choice

Which of the following is an example of predictive validity?
A) A personality inventory aligns with expert psychiatric diagnoses
B) A math aptitude test correlates with students' final math grades months later
C) A reading comprehension test correlates highly with another reading test administered at the same time
D) A vocabulary test consistently produces similar scores over multiple administrations

124

Multiple Choice

If a test incorrectly identifies a student as highly skilled in logical reasoning when they are not, this is an example of:
A) False negative
B) False positive
C) Incremental validity
D) Base rate error

125

Multiple Choice

If a newly created anxiety test correlates too highly (e.g., r = 0.95) with an existing anxiety measure, this suggests that:

A. The new test lacks construct validity.
B. The new test is unnecessarily duplicating the existing measure.
C. The test demonstrates strong discriminant validity.
D. The test lacks predictive validity.

128

Open Ended

Can you think of some synonyms of "bias" or "biased"?

131

Open Ended

When do we say that bias exists in testing?

154

Multiple Choice

A rater who avoids giving extremely high or low ratings, instead placing all scores in the middle, is exhibiting:

A. Halo effect
B. Central tendency error
C. Severity error
D. Leniency error

155

Multiple Choice

If a judge gives a gymnast a much lower score than deserved after witnessing an exceptional performance by the previous competitor, this demonstrates:

A. Contrast effect
B. Leniency error
C. Central tendency error
D. Test bias

156

Multiple Choice

A teacher who gives a student consistently high ratings in all subjects simply because the student excels in one subject is displaying which rating error?

A. Severity error
B. Leniency error
C. Halo effect
D. Central tendency error

157

Multiple Choice

A recruiter is extremely strict and gives all job applicants low scores on an interview assessment. This is an example of:

A. Leniency error
B. Severity error
C. Central tendency error
D. Halo effect

158

Multiple Choice

How can raters minimize rating errors?
A. By considering both subjective impressions and structured criteria when making judgments
B. By focusing primarily on their past experiences rather than standardized guidelines
C. By using well-defined rating scales and participating in training to recognize and reduce bias
D. By aligning their ratings with the average scores given by other raters to maintain consistency

192

Multiple Choice

Which factor most directly influences the utility of a psychological test?
A. The test’s reliability, validity, and cost-effectiveness in decision-making
B. The length of the test and its number of items
C. The number of people taking the test annually
D. The ease with which test-takers understand the questions

193

Multiple Choice

A psychological test with high validity but low utility is likely to be:
A. Accurate but not cost-effective in real-world applications
B. Useless in measuring the intended construct
C. A test that has high face validity but low reliability
D. The most preferred assessment tool in applied settings

194

Multiple Choice

In a utility analysis, which of the following is most important in determining the cost-effectiveness of a test?
A. The time taken to administer the test
B. The test’s ability to improve decision-making outcomes
C. The ease of interpreting the test results
D. The test-takers' subjective satisfaction with the assessment

195

Multiple Choice

Which of the following is a method used to estimate the financial impact of using a test for selection decisions?
A. Content analysis
B. Brogden-Cronbach-Gleser (BCG) Model
C. Parallel forms reliability analysis
D. Thematic coding
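
The core Brogden-Cronbach-Gleser utility estimate can be sketched in a few lines; every figure below is hypothetical:

```python
# Hedged sketch of the Brogden-Cronbach-Gleser utility estimate.
# All figures below are hypothetical.
n_hired = 20            # people selected per year
tenure_years = 2        # average years selectees stay on the job
validity = 0.40         # criterion-related validity of the test (r)
sdy = 12_000            # SD of job performance expressed in dollars
mean_z = 1.0            # average standard test score of those selected
cost_per_applicant = 50
n_applicants = 100

# Utility gain = N * T * r * SDy * mean z of selectees - total testing cost
utility = (n_hired * tenure_years * validity * sdy * mean_z
           - n_applicants * cost_per_applicant)
```

The model translates a validity coefficient into an estimated dollar gain from better selection, net of testing costs.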

196

Multiple Choice

Which method of setting cut scores involves expert judgment to classify test-takers into performance categories?
A. The Angoff Method
B. The Test-Retest Method
C. The Item-Response Theory (IRT) Approach
D. The Split-Half Method
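
A minimal sketch of the Angoff procedure, with hypothetical expert ratings:

```python
# Angoff method: each expert judges, for every item, the probability that a
# minimally competent examinee answers it correctly. Ratings are hypothetical.
expert_ratings = [
    [0.8, 0.6, 0.9, 0.5, 0.7],   # expert 1, one probability per item
    [0.7, 0.6, 0.8, 0.6, 0.6],   # expert 2
    [0.9, 0.5, 0.9, 0.5, 0.8],   # expert 3
]

# Average the experts' probabilities item by item, then sum across items:
# the sum is the expected raw score of a borderline examinee = the cut score.
item_means = [sum(col) / len(expert_ratings) for col in zip(*expert_ratings)]
cut_score = sum(item_means)
```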

198

Open Ended

Psychological assessments are evaluated based on three key psychometric properties: reliability, validity, and utility. Explain how these concepts interrelate in the context of psychological testing. Provide examples of situations where a test may be reliable but not valid, valid but not useful, and useful but not highly reliable. Discuss the potential consequences of using a test that lacks one or more of these properties in real-world settings such as education, employment, or clinical diagnosis.
