Formative assessment strategies that work at scale

Every district in America has a data strategy. Almost none of them have a formative assessment strategy, and the difference matters more than most administrators realize.
Formative assessment is not a product category. It is a practice: the ongoing collection of evidence about student understanding, used to adjust teaching in real time. Summative assessment tells you where students ended up. Formative assessment tells you where they are right now, frequently enough to do something about it.
The research base is large and largely ignored in procurement decisions. Black and Wiliam’s landmark 1998 analysis found effect sizes between 0.4 and 0.7 for formative assessment, among the largest for any educational intervention studied at scale. A 2024 meta-analysis by Yao, Amos, and Brown, synthesizing 118 primary studies across K-12 settings, confirmed a consistent positive effect (Hedges’ g = 0.25). Across these studies, the variable that drives effectiveness is not the assessment instrument itself; it is the feedback loop it creates.
Black & Wiliam (1998), Phi Delta Kappan. Yao et al. (2024), Educational Research and Evaluation.
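For readers unfamiliar with the metric: Hedges’ g is a standardized mean difference, Cohen’s d with a small-sample correction. A minimal sketch of the calculation, using purely hypothetical class scores (not figures from either study):

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g: standardized mean difference with small-sample correction."""
    # Pooled standard deviation across treatment and control groups
    sp = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    d = (mean_t - mean_c) / sp            # Cohen's d
    j = 1 - 3 / (4 * (n_t + n_c) - 9)     # small-sample correction factor
    return d * j

# Hypothetical example: treatment classes average 78, control classes 75,
# both with SD 12, 30 students per group
g = hedges_g(78, 75, 12, 12, 30, 30)      # → roughly 0.25
```

An effect of g = 0.25, in other words, corresponds to the treated group scoring about a quarter of a standard deviation higher than the comparison group.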
What does the evidence show for Wayground specifically?
Two independent studies across nearly 13,000 schools found consistent positive correlations between Wayground usage and student outcomes in both Math and ELA, meeting ESSA Level III (Promising Evidence) standards.
The Texas study examined more than 6,600 public schools and 3.5 million student responses during 2023-24. Controlling for prior performance and demographics, it found a positive correlation between Wayground usage and Math and ELA outcomes at grades 3 through 8, across all usage levels. The California study covered 6,280 schools and 4.2 million responses, with positive ELA relationships across all usage levels and significance for Math at low and medium usage.
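“Controlling for prior performance and demographics” means the usage-outcome relationship is estimated net of covariates, typically via regression. A minimal sketch on simulated school-level data (all variable names and numbers here are illustrative assumptions, not the studies’ actual model or dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # simulated schools

# Illustrative covariates and a usage variable
prior = rng.normal(50, 10, n)        # prior-year performance
frl = rng.uniform(0, 1, n)           # demographic covariate (e.g. FRL share)
usage = rng.uniform(0, 1, n)         # platform usage level
# Simulated outcome with a true usage effect of 3 plus noise
outcome = 0.8 * prior - 5 * frl + 3 * usage + rng.normal(0, 5, n)

# OLS with an intercept: outcome ~ usage + prior + frl
X = np.column_stack([np.ones(n), usage, prior, frl])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
usage_coef = beta[1]  # association of usage with outcome, net of covariates
```

The coefficient on usage is the “positive correlation” the studies report: it isolates the association that remains after prior performance and demographics are accounted for.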
These are correlational studies, not randomized controlled trials. But a consistent positive signal across 13,000 schools and 7.7 million student responses is meaningful evidence, and qualifying districts can use it for funding under Title I, Title II, and Title IV.
Why aren’t benchmark assessments enough?
Benchmark assessments administered three or four times per year are summative measures with a shorter lookback window. They tell you where students were at the end of a unit, not where they are in the middle of instruction.
The analogy is aviation: a pilot who checked instruments only four times per flight would be considered unfit to fly, and none of us would board that plane. Yet we ask teachers to navigate 28 students through a year of instruction on roughly that cadence of feedback, then express surprise when some students arrive in June having drifted off course.
The timing problem compounds the frequency problem. Benchmark results typically take two to five weeks to return. By then, the unit has concluded. The data is archeological: useful for planning next year, but not actionable for the students sitting in front of you today.
Formative data is different in kind, not just degree: it is daily, embedded, and low-stakes, and it closes the feedback loop in hours, not weeks.
How does engagement affect formative data quality?
Formative assessment only works if students actually complete it. An assessment that students skip produces no data. An assessment they complete carelessly produces something worse: noise that looks like signal.
Wayground’s assigned activities achieve a 93% average completion rate across paid district classrooms. In a landscape where roughly 1 in 5 students face digital access barriers that prevent homework completion, that number reflects a fundamentally different engagement model and a more valid measurement instrument.
Completion rate: Wayground internal (L90D, paid orgs). Pew Research Center (2021).
Generative AI has accelerated the problem. A 2026 Brookings Institution study found that 65% of students themselves identified “cognitive undermining” as a primary risk of AI use in school. The disengagement problem hasn’t changed in kind. It has changed in magnitude and in how invisible it is to the teacher looking at the data.
Brookings Institution (2026).
What makes Wayground’s approach to formative assessment different?
Wayground was built around a specific hypothesis: formative assessment improves outcomes when it is frequent, when it is engaging enough that students actually try, and when the results are instantly actionable for teachers. Those three conditions form a system.
Frequency means activities run multiple times per week, enough to create a pattern rather than isolated data points. Engagement means a 93% completion rate, so the students in the room are the students in the data. Instant actionability means teachers see results in real time, with standards-level dashboards available immediately. At the district level, Common Assessment lets every teacher in a grade and subject administer the same activity and view comparative data across schools within a day.
One piece of this equation is frequently overlooked: accommodations. Formative data is only as representative as the students who can access the assessment. Wayground offers 25-plus accommodations permanently free for all U.S. educators, including extended time, text-to-speech, modified visuals, and Focus Mode. Gaps in accommodation are gaps in your data.
What should districts do next?
No single tool has closed the full loop from engaging student experience to immediate teacher-level data to district-wide visibility. Google Forms solves part of the problem. Paper exit tickets solve part of it. Benchmark windows solve a different part.
That is the gap Wayground was built to close. Read the full ESSA Level III evidence from nearly 13,000 schools, and see how districts are building a real formative assessment practice.