Striking teachers in Chicago are fighting a contentious education reform that could overhaul how teachers are paid and evaluated, highlighting the difficulty of judging teachers by the performance of their students.
While the debate plays out dramatically in Illinois, new teacher evaluation systems have created conflict in other states, including Florida and Tennessee, which now use students’ standardized test scores in their evaluations of teachers. And the stakes of such evaluations are increasing in many places, with personnel decisions often hinging on the results.
A 2010 law passed in Illinois requires that all schools in the state adopt a new evaluation system by the 2016-2017 school year. In Chicago, student “growth”—or improvement—on standardized tests will count for at least 25 percent of a teacher’s evaluation, a system that Chicago Teachers Union President Karen Lewis has called “unacceptable.”
Lewis says that the new system will place undue emphasis on scores affected by student factors outside of a teacher’s control, like poverty and homelessness.
Her concerns echo those voiced by teachers around the country, who argue that student growth measures are unproven and should not be used in decisions about tenure and layoffs. Proponents say the new systems are far superior to those of the past and hold teachers accountable for how much students do—or do not—learn in their classrooms.
For decades, teacher evaluations were based on infrequent, scheduled observations. In most cases, teachers would be deemed either satisfactory or unsatisfactory, and unsatisfactory ratings were rare.
Chicago’s evaluation system, developed in the 1970s, is based on “a checklist of subjective, surface level details such as references to clothing, administrative tasks, and bulletin boards,” the Chicago Public Schools said in a press release earlier this year.
Since 2010, The Hechinger Report has been taking an in-depth look at efforts to improve teacher effectiveness. What’s the best way to identify a good teacher? Should test scores be used to hire and fire teachers? How is the role of a school principal changing? Are schools improving as a result of the new efforts?
The push to change teacher evaluations has been driven largely by nonprofit groups and politicians, and it follows research demonstrating that teacher effectiveness is the most important in-school factor affecting student performance.
The Obama administration’s $4.35 billion Race to the Top initiative, a competitive grant program that offered states money in return for reforms, included incentives for states to adopt evaluation procedures aimed at better determining teacher effectiveness.
Teacher evaluation systems have been changed in at least 33 states since 2009, and more than two dozen states now rely on both observations and student growth on test scores to judge a teacher’s effectiveness. Many states are using “value-added” models to grade teachers, which involve complex formulas that take into account factors like a student’s past test scores and attendance to predict what his or her score will be on this year’s test. Teachers in these states will be held responsible for getting their students to meet or exceed that expected score.
Chicago will use a value-added model at the elementary level and an “expected gain model” at the high school level. The expected gain model does not take other factors like attendance or poverty into account, and only measures the percentage of a teacher’s students who meet or surpass their expected growth scores, which are based on beginning-of-year tests.
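To make the expected gain idea concrete, here is a minimal illustrative sketch. The numbers and the function name are hypothetical, not Chicago’s actual formula; it shows only the general mechanic described above: each student gets an expected score from a beginning-of-year test, and the teacher’s rating is the share of students who meet or exceed it.

```python
# Hypothetical sketch of an "expected gain" calculation.
# Not the actual CPS formula -- scores and names are invented
# to illustrate the mechanic described in the article.

def expected_gain_rate(students):
    """students: list of (expected_score, actual_score) pairs.

    Returns the fraction of students whose end-of-year score
    met or exceeded their expected score.
    """
    met = sum(1 for expected, actual in students if actual >= expected)
    return met / len(students)

# A hypothetical class of five students.
roster = [
    (210, 215),  # beat expectation
    (198, 198),  # met expectation exactly
    (225, 220),  # fell short
    (190, 202),  # beat expectation
    (205, 201),  # fell short
]

print(expected_gain_rate(roster))  # 3 of 5 students -> 0.6
```

Note that, as the article says, a model like this ignores attendance, poverty, and other student-level factors entirely; a value-added model would instead adjust each expected score for those factors before comparing.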
Other districts rolling out new evaluation systems have also experienced pushback. In 2010, when officials in Washington, D.C., implemented a new evaluation system, seven percent of the teaching force was fired. In Tennessee, where student test scores count for 35 percent of a teacher’s evaluation, questions have been raised about the system’s accuracy and reliability, with some teachers seeing inconsistencies between the scores they receive on observations and their value-added ratings.