Evaluating teachers: Precise but irrelevant metrics?

I’ve told this joke before:

Two hot-air balloonists get lost, and they’re floating aimlessly. They spot someone down below and call out, “Hello!”

The person on the ground replies, “Hello!”

“Where are we?” one calls down.

Up comes the reply: “You’re in a balloon!”

They continue to drift, when one of the balloonists says to the other, “Who was that?”

The other responds, “That was obviously an economist.”

“An economist? How can you tell?” the first asks.

“Because what he said was precise, but irrelevant.”

Unfair to economists? Of course! But surely in keeping with the mongoose-cobra relationship that characterizes sociologists and economists. (And some of my best friends, etc., etc.) A case in point:

Earlier this week, FiveThirtyEight, founded by data whiz Nate Silver, posted a feature on the application of value-added models to the evaluation of K-12 teachers. Quantitative editor Andrew Flowers argued that a key part of the debate is over, and that recent studies have converged on the finding that value-added measures accurately predict students’ future test scores. The article cites all of the usual suspects: Raj Chetty, John Friedman, Jonah Rockoff, Jesse Rothstein, Tom Kane, and Doug Staiger. Thoughtful and creative economists one and all, armed with an arsenal of quantitative methods and administrative data to which to apply them.

The debate has hinged on the fact that students are usually not randomly assigned to teachers, and thus one can never be sure that differences among teachers in their students’ test scores are due to the influence of the teacher, rather than to unmeasured differences in the attributes of students or of a classroom.

The key technical issue is the ability of quasi-experimental statistical models to reproduce the results that are observed in the handful of randomized experiments that provide the strongest evidence of the causal effects of high value-added scores.

Evidence is accruing that such models yield similar results, implying that value-added models can identify teachers who are indeed better at raising students’ test scores. I don’t think this precludes an unscrupulous principal from assigning challenging students to a teacher in the hope that the teacher will fail and receive a low value-added score; however, the models are not designed to illuminate specific cases, but rather to reveal trends across many teachers and classrooms.

It’s not controversial to argue that some teachers are more skilled or effective than others, and that some are better at boosting their students’ scores on standardized tests. And in a society that relies so heavily on tests of all kinds to certify and select people, it’s quite possible that exposure to one teacher versus another could have long-lasting effects on students’ lives.

But even if these points are settled, they’re largely irrelevant to the design of teacher evaluation systems. As education researcher Stephen Raudenbush of the University of Chicago (whom I will proudly claim as a sociologist and former colleague) asks in the March 2015 issue of Educational Researcher, “Does the answer to a precisely focused research question, by itself, have implications for practical action?” He goes on to argue, “Research on value added has no implications for action in isolation from other research about effective schooling because, like any research program, the narrow conditions that make value-added research convincing limit its direct applicability in practice.”

This point, inscribed throughout this special issue of Educational Researcher on teacher value-added models and educational practice, emphasizes the political and organizational challenges in designing teacher evaluation systems that yield ratings that are transparent and fair, offer information on which teachers can act to improve their practice, and are devoid of unintended consequences that might disrupt a school’s capacity to promote student learning.

It’s at the level of the school building that most of the action around teacher evaluation and its consequences occurs, and truth be told, most economists are not devoting much attention to the interior of the school or the social relations among school leaders, teachers and students. Sociologists and other education researchers may not have a common vocabulary to describe these social relations, and the technology for modeling and prediction is not as elaborate. But the research agenda is relevant, if less precise.

An arsenal is only useful if directed at the right target.

This story was produced by The Hechinger Report, a nonprofit, independent news website focused on inequality and innovation in education.

Aaron Pallas

Aaron Pallas is Professor of Sociology and Education at Teachers College, Columbia University. He has also taught at Johns Hopkins University, Michigan State University, and…
