The dark side of education research: widespread bias

Critics have attacked Big Pharma for widespread biases in studies of new and potentially profitable drugs. Now, scholars are detecting the same kinds of bias in the education product industry, even in a federally curated collection of research that’s supposed to be of the highest quality. And that may be leaving teachers and school administrators in the dark about the full story of classroom programs and interventions they are considering buying.

An analysis of 30 years of educational research by scholars at Johns Hopkins University found that when a maker of an educational intervention conducted its own research or paid someone to do the research, the results commonly showed greater benefits for students than when the research was independent. On average, the developer research showed benefits — usually improvements in test scores — that were 70 percent greater than what independent studies found.

“I think there are some cases of fraud, but I wouldn’t say it’s fraud across the board,” said Rebecca Wolf, an assistant professor in the Center for Research and Reform in Education at Johns Hopkins University and lead author of the draft study. “Developers are proud of their products. They believe in them. They’ve worked hard in developing these products. They want a study that puts the best face forward.”

Biased research matters because current federal law encourages schools to buy products that are backed by science. In order to tap into federal school improvement funds, for example, low-achieving schools with disadvantaged children are required to select programs that have been rigorously tested and show positive effects.

The study, “Do Developer-Commissioned Evaluations Inflate Effect Sizes?” was presented at a March 2019 conference session of the Society for Research on Educational Effectiveness (SREE) in Washington, D.C. The paper is a working paper, meaning it has not yet been published in a peer-reviewed journal and may still be revised.

Wolf and three of her colleagues analyzed roughly 170 studies in reading and math dating as far back as 1984 that are part of the What Works Clearinghouse. That’s an archive of research that the U.S. Department of Education launched in 2002 to help educators decide which educational products to buy. It is by no means a complete or exhaustive collection of educational research but a group of high-quality studies curated by experts. The studies track test score gains and compare students who got the intervention with those who didn’t.

More than half, or 96, of the studies were conducted by independent researchers, while 73 of them had some sort of insider connection with creating or selling the product. Wolf labeled a study a “developer” study if the inventors, distributors, or an employee of the developer or distributor were involved in the research. Studies were considered developer studies even if the developer didn’t directly conduct the research but commissioned an outside researcher to carry out the study.

Wolf took into consideration many aspects of the studies that can lead to bigger student gains. For example, a personal tutor tends to produce larger student gains than a curriculum used by an entire classroom. Kids in younger grades tend to see bigger improvements than older kids. Smaller studies on fewer students are more likely to show a bigger bang than larger ones. But even within a host of subcategories, Wolf found that the developer studies still pointed to larger benefits than the independent studies.

Replication studies are relatively rare in education research, but both developer and independent studies were available for 18 of the reading and math interventions. When Wolf compared the independent and developer studies side by side, the developer studies tended to post 80 percent higher gains for students for the same educational product.

There are several reasons why developer studies tend to show stronger results, according to Wolf, whose full-time work is to evaluate educational programs. The first is that a company is unlikely to publish unfavorable results. Wolf speculates that developers are more likely to “brand a failed trial a ‘pilot’ and file it away.”

A second common issue is how students are kept out of experiments. Timothy Shanahan, a reading specialist and a professor emeritus at the University of Illinois at Chicago, shared an anecdote with attendees of the March 2019 SREE conference. He recalled a reading study where struggling students who didn’t complete the program were excluded from the treatment group. The comparison control group, of course, kept the low-achieving readers and their low scores, making the intervention look more successful. Wolf also found this sort of “sample selection” difference when she compared developer and independent studies side by side. One developer study excluded some students from the treatment group after they had been randomly selected for it. These details are often in the study’s fine print, but educators would have to look for them.
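The arithmetic behind Shanahan’s anecdote can be shown with a small sketch. The scores below are invented for illustration and do not come from any study in Wolf’s analysis. The two groups start with identical scores, so the hypothetical program has no real effect. Dropping the lowest scorers from the treatment group alone is enough to produce an apparent gain.

```python
from statistics import mean

# Hypothetical test scores, invented for illustration only.
# Both groups have identical scores, so the program has no true effect.
treatment = [52, 58, 61, 67, 70, 74, 79, 85]
control = [52, 58, 61, 67, 70, 74, 79, 85]


def mean_difference(treated, comparison):
    """Average score of the treated group minus the comparison group."""
    return mean(treated) - mean(comparison)


# Honest comparison: every student who was assigned is counted.
honest_gap = mean_difference(treatment, control)

# Biased comparison: the two lowest scorers are dropped from the
# treatment group only, while the control group keeps everyone.
completers = sorted(treatment)[2:]
biased_gap = mean_difference(completers, control)

print(f"Gap with all students counted: {honest_gap:.2f}")
print(f"Gap after excluding strugglers from treatment: {biased_gap:.2f}")
```

With these made-up numbers, the honest comparison shows no difference, while the comparison that leaves out the two weakest treated students shows a gap of more than four points.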

Developers often create their own yardsticks for measuring student success, devising their own assessments to go along with their programs. That might allow an education product company to measure what they’re teaching more precisely. But those same gains are often not evident in a reading or math assessment given to all students each spring.

These research choices that lead to bias seem to be an open secret in education research circles. Wolf said she asked researchers who heard her presentation if they were surprised by her conclusions. “Every single person said ‘no.’ If you’re in the work of program evaluation, you can see why these things might happen,” said Wolf.

This isn’t the first study to detect bias in education research. The problem of hiding unfavorable results from publication was documented as far back as 1995. In 2016, one of Wolf’s co-authors, Robert Slavin, wrote about the positive results that researchers get when they devise their own measures to prove that their inventions work. In that same year, another group of researchers also detected a developer bias in a smaller group of studies about math programs that are part of the What Works Clearinghouse collection. This new Hopkins study addresses some questions about that analysis and confirms the conclusion that when people study their own inventions, the results are stronger.

Solving this bias problem won’t be easy. Some advocate for pre-registration, something that the field of medicine uses, in which study authors describe the design and measures to be used ahead of time. SREE launched such a registry in 2018. That makes it harder for developers to tweak their study design on the fly when the students aren’t faring as well as they had hoped. However, schools are complex places and it’s often necessary to make adjustments to an experiment when something isn’t working with teachers or school-day schedules.

Wolf argues that educators should pay more attention to whether the research is independent. In her research for this study, developer funding wasn’t always disclosed and she often had to contact researchers to learn these details. Wolf said these conflicts of interest should be highlighted and disclosed up front.

Sunlight is a remedy, just as it is in the pharmaceutical industry.

This story about education research was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.

One reply on “The dark side of education research: widespread bias”

Richard Hinkie says:

April 22, 2022 at 2:16 pm

Caroline Preston’s article in USA Today, April 22, recounts the difficulties in learning math online. The challenge is fundamentally that online learning has not taken advantage of other effective models. I am speaking specifically of changing the role of the teacher/professor from lecturing to coaching, with the materials created at the level of quality of popular computer games. In other words, why should thousands of math teachers prepare online lectures in basic algebra when very high-quality, very engaging computer-based learning can be the source of the key concepts, and the teacher becomes the coach helping the students apply and, frankly, enjoy the computer graphics, animations, music, and other entertainment-related features? An example of this is the utility industry. They banded together and created energy university using high-level computer-based animations and programmed learning to teach the skills of field electrical and gas energy tasks. Yes, it cost a lot of money to create one high-quality computer-based learning module. But once created, millions of people can take the same course. Initially the utility industry trainers were resistant because they achieved significant satisfaction from teaching. And I love teachers, from my 94-year-old mother to the many other teachers in my family. However, to make the computer learning system truly 21st century, we must apply and maybe even approach the computer gaming industry so that they can apply their significant skills to teaching basic concepts such as math. Clearly such a model is not appropriate for every subject. But it can be for fact-based, even skill-based learning. Now the challenge is twofold. First, change the teaching model to a coaching model, which is no small task. As this letter is written, teachers are being taught in colleges and universities across the country to be teachers, not coaches.
Secondly, the traditional educational computer-based companies are not equipped to accomplish this transformation. Let’s go back to the utility model. Electricity can be dangerous. Field utility employees need to be able to turn on and off major electrical systems. They need to learn how to take apart equipment that is worth hundreds of thousands of dollars. That can be done physically in every utility across the country or, as they have decided, the preparation for the hands-on portion of learning can all be done on the computer, with animations where the employee can literally take apart with their mouse and reassemble very expensive equipment, and if they make mistakes they can be “electrocuted” in a safe, almost game-like environment. So my encouragement to The Hechinger Report is to seriously investigate my observation. Should a reporter wish to investigate the utility system, I believe that I can arrange a demonstration with the organization that currently manages that system. It is the same organization that I retired from, and I’m proud to have had a major role in creating it. I seek no financial or other reward at all from this suggestion. It is simply that America cannot afford outdated learning systems when they exist. Given that I have four grandchildren, I am deeply committed to their educational growth and to that of all children everywhere.
