What Do We Know About Innovative Assessment?
A Look at Its Characteristics, Process and Purpose
This is the second in a series of posts by our 2024 summer interns, based on the assessment and accountability projects they designed with their Center mentors. Joy Zhang, a doctoral student at Brigham Young University, worked with the Center’s executive director, Scott Marion.
States, districts, and test publishers put a lot of effort into researching, designing, and implementing innovative assessments for many reasons: to make them better gauges of what students know and can do, to gain flexibility, or to reduce their cost or testing time. The federal Innovative Assessment Demonstration Authority encourages states to try more “innovative” ways of assessing their students’ achievements.
But is “innovative assessment” just a buzzword? Or are states creating standardized tests that are truly innovative? To answer that question, we must first pause and ask: what is innovation? Some people may say that innovation means just being new and different. But is that enough? In a recent blog post about innovative assessment, Scott Marion highlighted two words: “breakthrough” and “scale.” Marion borrowed those words from McKinsey & Company’s definition of innovation, drawn from the business world. In Marion’s analysis, if assessments are truly innovative, they must represent breakthroughs and be put into practice on a large scale.
But are these the only two defining traits of innovation? And are there universal traits of innovation that hold true across all fields?
After reviewing the literature and other resources (blogs, videos, and websites) from the fields of business, medicine, and construction, I identified some common traits of innovation and grouped them into three categories: characteristics, process, and purpose.
The Characteristics of Innovation
- Innovation can’t just be new; it must be a breakthrough. Recycling existing ideas or products isn’t innovative. Innovation can be a tangible product, an idea, a plan, or a process. To be considered a breakthrough, however, it must add something new and significant that produces dramatic, sustainable change.
- Innovation is human-centered. Designs that are truly innovative serve human beings. Consider the iPhone and penicillin: two breakthroughs that fundamentally changed humans’ life experience at scale. Even innovations that benefit non-humans—to protect wildlife or the oceans, for instance—still ultimately benefit human life in the long run.
- Innovation is actionable. If innovative ideas are to serve their purposes, the current social structure and technology must be able to support—or expand to support—their enactment.
The Process of Innovation
- Cost-effectiveness and risk management are important to innovation. While it challenges norms with its breakthroughs, innovation must also reduce uncertainty to manageable levels and account for potential risks. These considerations are key to making the outcome realistic and actionable.
- The innovation process is iterative. Innovation is rarely a linear process (research, development, production, and marketing). More often, it resembles a loop, in which people take feedback from the previous stage of development and incorporate it in the next cycle until the product or idea is final.
The Purpose of Innovation
- The ultimate goal of innovation is to improve human lives by meeting human needs, whether current or potential.
- Innovation aims to bring structural change; the way the smartphone changed information and communication is an example. The goal of innovation is to transform society with new insights, actions, or products.
Have assessments ever been innovative?
Yes, they have. People keep trying out new ideas and designs to meet their needs in assessment. Criterion-referenced testing (CRT) is one example: it has largely replaced norm-referenced testing (NRT) and gave rise to standards-based assessment. CRT reflected a sea change in assessment because it measures student achievement against clearly defined knowledge and skills (i.e., the criterion) rather than comparing students to one another. These test results provide educators and leaders with information they can use to evaluate and improve programs and, hopefully, improve student learning.
Computer-adaptive testing (CAT) emerged in the early 1980s after the development of item response theory. It is another example of assessment innovation because it adjusts item difficulty to the test-taker. CAT, which has many variations, makes testing more efficient because the assessment homes in on the student’s achievement level (“ability” in CAT terminology) more quickly and precisely than fixed-form tests do.
Assessments based on alternate achievement standards for students with significant cognitive disabilities are relatively new—only about 25 years old—but likely meet the criteria for innovation. When they were introduced, many people (except for advocates and some flexibly minded assessment specialists) did not think the students included in these assessment programs could learn academic content. These assessments provided clear documentation to the contrary. Further, these assessments continue to evolve to better tap into what these students know and are able to do.
These three examples satisfy the characteristics of innovation we’ve discussed here: they are breakthroughs implemented at scale, and they are human-centered, efficient, and cost-effective.
Some people may argue that performance-based assessment and Diagnostic Classification Models are innovations. I agree that they are new and different, but I’d question whether they meet all of my criteria. In particular, are they actionable and cost-effective? Have they been undertaken at scale?
The validity and usefulness of performance-based assessments have been well established. Yet, they are still not implemented on a large scale, primarily because of the time and cost required to administer and score them. Diagnostic Classification Models, meanwhile, haven’t been widely adopted, so they have not brought about structural change. We can’t call an assessment “innovative” simply because it is new and different.
Through-year and AI-driven assessment
Through-year assessments and AI-powered assessments are getting a lot of attention and being called “innovative” right now, but I’ll challenge that characterization. Let’s look at these assessments using the standards of innovation we just discussed: are they new breakthroughs? Are they actionable and cost-effective? Have they been undertaken at scale in the assessment world?
Through-year and AI-powered assessments are potential breakthroughs in the assessment world. But we haven’t yet seen enough evidence that they are actionable and cost-effective.
AI-powered assessments seem promising and powerful since they could transform the assessment world by generating large numbers of high-quality items quickly (efficiency) and increasing human rater reliability. However, AI still carries risks with potentially detrimental consequences. Can AI-powered tests lead to structural improvement? It’s too soon to say. I hope that AI will lead to innovations if we can address the many concerns that have been raised.