For three years, Kimberly Safran has been on the front lines of the Pittsburgh school district’s push to hold teachers more accountable for student achievement. As co-principal at Brashear High School, a large, economically and racially diverse school on the city’s South Side, she spends most of her time watching teachers in action, taking notes and pushing them to change their ways.
As the rest of the state will soon find out when it launches evaluations modeled on Pittsburgh’s system this fall, hers is not an easy task.
On a recent Wednesday in April, she sat down with veteran biology teacher Vince Vernacchio. The day before, Safran had observed him—jotting notes all the while—as he guided his students in creating a poster meant to demonstrate their knowledge of the nervous system.
She began by asking Vernacchio how he thought it went.
“I was really comfortable with yesterday’s lesson,” he responded, recounting how he had begun by asking several open-ended questions. “To me I thought those were strengths.”
“I agree with you that the question itself…was an open-ended, high-level question,” Safran said. “I’m thinking though of the discussion at the beginning of the class between you and the students. I would put that more as basic.”
She referred to a description of questioning techniques on the checklist the district uses to rate teachers: “The teacher sometimes attempts to ask some questions designed to engage students in thinking, but only a few students were involved.”
Vernacchio disagreed. The tension increased as the two dissected the rest of the lesson.
“Do they really know what participation sounds like and what it looks like?” Safran asked.
“They’re 18 years old. They know when they’re participating,” Vernacchio said.
“How do they know what’s expected of them?”
“We spell it out…I can’t dig into their brains.”
By the end, both were still smiling, but barely.
This fall, under a 2012 state law that requires tougher standards for teacher evaluations, Pennsylvania teachers will be under scrutiny like never before. Using a 104-page guide designed by teaching expert Charlotte Danielson, a former teacher turned consultant, school administrators will analyze how teachers plan lessons and present them, how they interact with and question their students, and even how they communicate with parents.
In total, the guide—which the state requires districts to adopt—outlines 24 elements of teaching with descriptions of what “unsatisfactory,” “basic,” “proficient,” and “distinguished” teaching looks like. A teacher’s score on the observations will make up 50 percent of her overall rating.
The other half of the rating will combine measures of student achievement, including growth on standardized tests, scores for the building where a teacher works, and other elements like district-level assessments or student surveys.
Teachers who receive an overall rating of unsatisfactory twice in 10 years can be fired.
Pennsylvania is among 30 states that are in the process of overhauling how teachers are evaluated. Previously, the vast majority of teachers received high ratings on evaluations, regardless of whether their students succeeded or failed academically. The new system is supposed to change that. Proponents believe linking teacher job security to student performance will cull the worst teachers and help others become better at their jobs, thus reversing years of lagging student achievement.
But in many ways, the new evaluations involve a leap of faith.
Research suggests that teachers who do well in classroom observations tend to have students who perform well on tests, but there is no definitive evidence yet suggesting that more intensive evaluations actually improve student achievement. Pennsylvania educators have major questions about whether the system will be both fair and rigorous, and how schools will balance the demands of the new evaluations with other challenges like budget cuts and new standards.
And unlike Pittsburgh, districts elsewhere in the state, including Philadelphia’s, have had much less time—and fewer resources—to get ready.
“If we were starting the…process now and had to implement in August, oh my word,” said Pittsburgh superintendent Linda Lane. “My worry, my concerns, money is certainly one of them, but time is the other one.”
Over the course of four years, Pittsburgh officials have held intensive negotiations with the local teachers union to create a system everyone agrees on. They’ve held multiple forums for teachers, concluded their software system was defective and needed to be replaced, and identified other kinks that need to be fixed.
They also won $40 million from the Bill & Melinda Gates Foundation in a national grant competition to spend on software and consultants, hiring staff, and raising teacher salaries. (The Gates Foundation is one of the many funders of The Hechinger Report.) Even so, Pittsburgh school officials wish they had more time to develop some aspects, and say that even with the extra funds, an ongoing budget crisis has hurt their progress.
This year and last, the state budgeted about $6 million per year in state and federal funds to cover software, consultant fees, and trainings for the rest of the districts in the state. The new evaluations could mean a windfall for consultants. Among the companies that have benefited is Teachscape, a partner of Charlotte Danielson’s, which received a $259,000 state contract to provide an online evaluation system; Pittsburgh spent more than two-thirds of the first half of its Gates grant on consultants, according to the Pittsburgh Tribune-Review.
About 200 Pennsylvania districts began trying out the evaluations two years ago in a state pilot project. This year, more districts, including Philadelphia, joined the trial period. In total, about a quarter of the state’s teaching force participated this year.
But other districts chose not to participate, including several suburbs around Philadelphia, and will have no time to test the evaluations before they are launched next year with high stakes for teachers and principals.
Under the law, districts will have to launch the evaluations in the fall—ready or not. State officials worry some schools may comply with little preparation or enthusiasm, making it less likely the system will actually produce improvements in teaching.
“Philadelphia has been slow to start. We have been doing massive intervention this year,” said Carolyn Dumaresq, Pennsylvania’s deputy secretary of Elementary and Secondary Education, who has overseen the evaluation reforms. “There’s a steeper ramp there than elsewhere.”
Jerry Jordan, president of the Philadelphia Federation of Teachers, predicts that many teachers will be confused because they have been given little information about the new system or how they will be evaluated.
Officials in Philadelphia say they will be using an $11 million grant from the Obama administration’s Race to the Top grant program to help launch the new evaluations. But the district faces a $304 million budget shortfall and as many as 3,500 layoffs this fall—including nearly every assistant principal—raising questions about whether school principals will be able to handle the time-intensive evaluation process.
“It’s definitely going to cut into lots of things,” Jordan said. “Here the resources are lacking.”
But Karen Kolsky, an assistant superintendent for the Philadelphia district, said the city’s schools were already using an observation system based on Charlotte Danielson’s guidelines, and that the new system is “not that drastically different.”
The major difference between the old observations and the new is that in the new model, “the teacher does most of the talking,” Kolsky said.
In smaller districts like William Penn, in Delaware County, officials are enthusiastic but say the new system has consumed a lot of time and resources. There, director of schools Jane Harbert says it took them six months to train all of their administrators—training paid for with federal funds that could run out next year—before they were ready to try out the observations in classrooms.
The new observation guides are much more complicated than previous versions, but educators say what takes time is teaching principals how to have productive conversations with teachers that lead to improvements in their teaching, not acrimony.
The state’s training “was useful, but it was not enough,” Harbert said. “I don’t know if we would have felt as comfortable if we said do that and go forth.”
Pittsburgh’s experience so far demonstrates the many challenges that districts both large and small will face.
In April, about 100 of Pittsburgh’s most informed teachers gathered in a conference room at a local middle school. For four years, the group has helped district officials shape the classroom observations, voted on every decision and scrapped ideas they didn’t like. The next step—adding student test scores and surveys into teacher ratings—had several confused and upset. Their jobs are on the line based on numbers they aren’t sure they trust.
One teacher asked for more details about a complex algorithm the state will use to measure a teacher’s effect on student test score growth known as value-added measurement. “Most of the teachers at my school don’t know how [value-added measures] are calculated. Can you explain it to me?” he asked.
“How is it fair that special education teachers are rated based on the whole district?” asked another about student surveys Pittsburgh will use to help rate teacher performance. “If we’re all about equity, it has nothing to do with equity.”
“Numerically, it’s not seeming fair,” said another.
Starting next year, the state will begin collecting student test score data and other measures of performance that will be incorporated into the ratings in three years. (The state is using multiple years of data to increase the accuracy of the ratings.)
The use of standardized tests in particular has been a bone of contention around the country: In Chicago, teachers went on strike last year to protest how much the exams counted in their evaluations. But educators warn that the biggest problem will likely be the measures used to rate the majority of teachers who do not teach in grades or levels tested by standardized exams.
In Florida, officials have simply developed new standardized tests covering all subjects and grades. In contrast, Pennsylvania is following the lead of states like Rhode Island that are using what are known as “student learning objectives,” in which teachers of subjects like art and gym set academic goals for their students, relying on local district tests, curriculum exams, or projects and tests created by the teacher.
The use of learning objectives has raised questions about fairness and rigor, however, and a 2012 report by the Center for American Progress, a liberal think tank, found them to be “mixed and messy.”
Convincing teachers to believe in the system—especially those who will be disappointed not to receive ratings as high as they think they deserve—has been one of the biggest challenges, officials in Pittsburgh say. Without teacher buy-in, they argue, the system can’t work.
Friction between evaluators and teachers isn’t unusual in Pittsburgh, even after years of getting used to the new system. “Some older teachers are very guarded,” said Safran, who was a teacher before she went through a training program and became a principal three years ago.
For his part, Vernacchio, a 22-year veteran, says he mostly appreciates the new evaluations.
“The tool itself, it has changed everybody. It’s provided a greater focus,” he said. “But I think sometimes the way it’s used has been inconsistent…One person is being evaluated one way, and another is being evaluated another way.”
Officials, principals and teachers say tensions have eased over the years, however, thanks in part to the role the union has played in developing the system. The district has also allowed flexibility. For example, some teachers take a yearlong break from the formal evaluations to study a specific skill they want to improve, which allows the teachers more autonomy and frees up time for principals.
Jerri Lynn Lippert, Pittsburgh’s chief academic officer, estimates that more than two thirds of teachers in the district are on board with the classroom observations, a number she says has risen significantly over four years.
Kathryn Carroll is a 24-year veteran kindergarten teacher at Faison K-5 in the Homewood neighborhood of Pittsburgh. Under the old evaluations, she said, “it didn’t seem like you always knew what they were looking for.” Although she is still skeptical about whether test scores can accurately rate her abilities as a teacher, she says the new observations have “helped me get better.”
One reason teachers and their unions elsewhere may be embracing the new evaluations is that so far they haven’t led to the mass teacher firings that many feared, an outcome that has drawn criticism given how much time and money is being invested in the new systems.
Under the old system, fewer than 1 percent of teachers annually in Pittsburgh received the unsatisfactory ratings that could result in firing, similar to the numbers across the state. Over the last four years in Pittsburgh, about 150 teachers have resigned or been dismissed because of the new evaluations, the most in the history of the district and about 7 percent of the total teaching force.
But in other states, early adopters have had less drastic results. In New Haven, Conn., and in Florida, for example, only about 2 to 3 percent of teachers have received the lowest ratings that could lead to a pink slip. In Pennsylvania, a state-commissioned study of four districts already using the new evaluations found that 96 percent of teachers received ratings of proficient or distinguished. Only 1 percent received the unsatisfactory ratings that could lead to removal, the same percentage as under the old system.
It’s unclear if the results suggest that teachers are better than many critics have claimed, if the new systems are less rigorous than expected, or if the numbers are the result of growing pains as schools adjust.
“Are the majority of the teachers satisfactory and acceptable? I think the answer is yes,” said William Mathis, managing director of the National Education Policy Center at the University of Colorado Boulder, which has been critical of test-based teacher evaluation. “Where I think it’s a waste of money is they’re trying to get a degree of precision that they cannot get with the measures they’ve got.”
Proponents note that at least under the new evaluations, teachers are more differentiated: Excellent teachers can now be rewarded with distinguished ratings, rather than being grouped with those who are average, for example.
Officials in Pennsylvania and elsewhere have defended the sunny results.
“If my goal had been to develop a system where … 10 percent of the teachers would have to be failing, and 20 percent would need improvement, I’ve developed the wrong system,” said Dumaresq, the Pennsylvania deputy education secretary. “My goal is to improve teachers in Pennsylvania, and to recognize the ones that are already good, because we have a lot of good teachers in Pennsylvania.”