Gerald Carlson’s heart sank when he received word several years ago that a controversial statistical analysis had decreed his program one of Louisiana’s weakest in preparing educators to teach English language arts.
“We thought we had a good program,” said Carlson, dean of the education school at the University of Louisiana at Lafayette. “We were shocked when we saw the results.” Carlson has since worked with his staff to revamp the school’s curriculum: adding a new class in reading and English language arts, requiring more writing across the board, and beefing up professional development for the university’s professors.
He is confident that when the latest results come out this spring—the first to be released in over a year and a half—the university will fare well.
Scores of teacher training programs across the country will likely face similar scrutiny in coming years. Following Louisiana’s lead, policy makers in a growing number of states are evaluating programs based on the test scores of their graduates’ students. So far, eight states have policies requiring them to do a similar analysis, most of them adopted in the last few years, according to the National Council on Teacher Quality.
“This is a policy movement that’s sweeping the country,” said Charles Peck, a professor of special education and teacher education at the University of Washington’s College of Education.
Related efforts to evaluate individual teachers based on student test scores have sparked a flurry of publicity—and led to a major lawsuit. But those targeted at preparation programs (which include longstanding university-based education schools and less traditional programs like Teach For America) have gone comparatively unnoticed and unscrutinized.
As other states follow a similar path, Louisiana’s experience speaks to the promise and peril of the new approach. Some programs, like Carlson’s, have used the data to make significant changes. “We should be doing this analysis anyway, whether we’re forced to or not,” said Carlson.
But the data only captures institutions with a large enough cohort of graduates (25 in a given subject area) for the results to be statistically valid. As a result, not a single New Orleans-based teacher training program (including the University of New Orleans, Xavier University, Southern University at New Orleans, and Our Lady of Holy Cross) is consistently included in the study. (In recent years, only UNO has had a large enough cohort, and only in its undergraduate social studies and math certification programs.) Those institutions still go through traditional accreditation processes. But in a state lauded for holding its education schools accountable, the quality of many local institutions remains comparatively unknown.
Others worry that even when the cohort size is large enough, the data can be overly simplistic and at times misleading. Do low reading scores recorded years after a group of teachers enters the classroom, for instance, mean their training program had a bad curriculum or weak instructors? Or did it admit weaker candidates from the start, or perhaps send them off to schools with less supportive principals?
“It’s kind of like having a fire alarm go off in your house, but not knowing where the fire is,” says Peck.
Digging deeper into the data
The concern voiced by Peck and others prompted state officials to break down the data in a more detailed way for the campuses, said Jeanne Burns, associate commissioner for teacher and leadership initiatives for Louisiana’s Board of Regents. Over time, training programs could see how they fared with low-income students, for instance, or which content area skills their graduates struggled to teach.
“Our goal has always been to use the data to help our campuses improve,” she said.
The Louisiana studies have tracked student performance in grades four through nine starting in 2003, tying it back to their teachers’ preparation programs. By using what’s known as a “value-added” analysis, researchers homed in on the amount of growth seen in individual students, no matter their starting point. They then compared the overall student growth in the classrooms of recent graduates of different training programs to the growth produced by veteran educators. No new results have been issued in over a year because state officials have been working to change the system to align it more closely to the state’s comparatively new evaluation process for individual teachers.
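The comparison described above can be illustrated with a rough sketch. It is not the state's actual model, whose statistical details the article does not give. It assumes a simple year-over-year score gain stands in for "growth," and the record fields (`program`, `teacher_id`, `prior_score`, `current_score`) and the `"veteran"` label are hypothetical names chosen for illustration. The 25-graduate reporting threshold comes from the article.

```python
from statistics import mean

MIN_COHORT = 25  # reporting threshold per subject area, as cited in the article


def program_effects(records, min_cohort=MIN_COHORT):
    """Compare average student growth by training program to a veteran baseline.

    records: list of dicts with hypothetical keys
        program, teacher_id, prior_score, current_score
    Records with program == "veteran" form the comparison group.
    Returns {program: gain difference, or None if the cohort is too small}.
    """
    gains = {}     # program -> list of per-student score gains
    teachers = {}  # program -> set of distinct teacher ids
    for r in records:
        gain = r["current_score"] - r["prior_score"]  # growth, whatever the starting point
        gains.setdefault(r["program"], []).append(gain)
        teachers.setdefault(r["program"], set()).add(r["teacher_id"])

    baseline = mean(gains["veteran"])  # average growth under veteran teachers

    effects = {}
    for program, program_gains in gains.items():
        if program == "veteran":
            continue
        if len(teachers[program]) < min_cohort:
            effects[program] = None  # too few graduates to report a result
        else:
            effects[program] = mean(program_gains) - baseline
    return effects
```

A positive value would mean students of a program's recent graduates grew more, on average, than students of veteran teachers. A real value-added model would also adjust for student and school characteristics, which this sketch leaves out.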
Louisiana officials requested the analysis more than a decade ago partly with the hope that it would weed out the state’s weakest teacher training programs. If programs consistently rate low, the state can shut them down. But so far, no program has been forced to close directly as a result of the new system.
“I don’t view it so much as an accountability mechanism because none have been shut down,” said Tim Daly, president of the national organization TNTP (formerly The New Teacher Project), which trains many of New Orleans’ Teach For America instructors and has fared fairly well in the studies. But he says the analysis has improved transparency, including for prospective teachers shopping for a training program.
It has also prompted self-examination and improvements at a few programs, like the University of Louisiana at Lafayette and the Louisiana Resource Center for Educators, a private teacher certification program based in Baton Rouge.
The initial round of results showed that the Resource Center was “doing a lousy job of teaching reading,” said Nancy Roberts, the executive director. The initial data did not, however, provide any information about which of the center’s graduates posted the weakest results, which part of reading students had failed to master (phonemic awareness? overall comprehension? something else?), or which grade levels struggled most.
“It was like shooting darts in the dark,” said Roberts. “But we decided to roll up our sleeves.”
After looking at the more detailed data, the Resource Center began to tailor its literacy instruction more precisely based on the age group the graduate would be teaching. It also increased the amount of overall class time spent on reading and literacy from less than 20 percent to about a third.
George Noell, the Louisiana State University researcher who designed the evaluation system, said he is pleased that some of the lowest performing programs are making improvements, although “across the whole spectrum, it’s not as clear the data have helped people as much as I hoped.”
“I assumed it would not be as long a journey from seeing results to figuring out what to do with them,” he said. “But I consider the fact that we are even talking about it huge progress.”
The selectivity factor
Some experts say that determining how much of a young teacher’s success or failure can be tied to his or her training program is like asking which came first, the chicken or the egg.
If a program’s graduates post weak results, “it could be that they are doing a fabulous job but they are not selective enough in terms of who is admitted,” said Kate Walsh, president of the National Council on Teacher Quality. She pointed out that many of the programs that have performed best in Tennessee’s and Louisiana’s evaluations also have highly competitive admissions.
Daly disagreed that the selectivity of a program significantly affects its graduates’ performance in the classroom. “While selectivity may play a small role, what we do to train teachers plays a big role,” he said.
Katrina Miller, the director of Tennessee’s federal First to the Top grant, said she considers selectivity something training programs can control. “You are choosing who to admit,” she said. “Selectivity, since it’s chosen by the program, is part of the training.”
Some programs are far more oversubscribed than others, however. And it’s easier for Teach For America—where more than 10 percent of all Ivy League seniors apply—to be highly selective than your average state school. That said, the teacher program at the University of Louisiana at Lafayette raised its admissions standards modestly, in addition to revamping part of the curriculum, said Carlson. The university now requires a 2.5 cumulative GPA (which includes all of a candidate’s past grades) instead of a 2.5 adjusted GPA (which only counts the higher of two grades if a student had to repeat a course). “That’s eliminated a lot of people,” Carlson said.
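The difference between the two GPA measures can be shown with a small worked example. This is a minimal sketch, assuming a 4.0 scale and equally weighted courses, neither of which the article specifies. The function names and course codes are hypothetical.

```python
def cumulative_gpa(attempts):
    """Average over every attempt, including repeated courses.

    attempts: list of (course, grade_points) tuples, one per attempt.
    """
    return sum(points for _, points in attempts) / len(attempts)


def adjusted_gpa(attempts):
    """Average that counts only the highest grade for each course."""
    best = {}
    for course, points in attempts:
        best[course] = max(points, best.get(course, points))
    return sum(best.values()) / len(best)


# Hypothetical transcript: the candidate repeated ENGL101 after a D.
transcript = [
    ("ENGL101", 1.0),
    ("ENGL101", 3.0),
    ("MATH101", 3.0),
    ("HIST101", 2.0),
]
```

On this transcript the adjusted GPA is about 2.67 and the cumulative GPA is 2.25, so the same candidate would clear a 2.5 bar under the old rule but not under the new one.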
Brian Beabout, a former New Orleans public school teacher who now works as an assistant professor at the University of New Orleans, said he worries that a teacher’s success in his first two years on the job says more about the quality of his K-12 education, which spans 13 years, than the quality of his preparation program, which can be as short as six weeks. And as time passes, teachers sink or swim largely based on how much support they get on the job, he says.
“One of the dangers of single rankings is that we let school districts off the hook for providing career-long support and development,” he said.
Missing the small programs
Perhaps the greatest limitation of Louisiana’s data-centered analysis is that it fails to encompass most of the state’s small teacher training programs. All told, at least seven of the state’s programs did not have large enough numbers for any results to be reported in 2010-2011. Fourteen programs had at least some results released.
In New Orleans, many of the programs were already small before the dual blows of Katrina and budget struggles further reduced their size. In 2000, 172 students completed the University of New Orleans’ teacher certification programs, compared to 104 in 2009. The drop-off was even more dramatic at Southern University at New Orleans: 87 graduates in 2000 compared to 12 in 2009. UNO’s young graduates have performed fairly well in math and social studies (particularly math); those were the only two areas where the cohort size was large enough to be captured in the study.
Burns said even New Orleans’ smaller programs, like SUNO, will eventually be captured in the value-added analysis because the state will wrap together multiple years of results from different institutions. She said the smaller institutions have been held accountable through national accreditation processes and a requirement that all teacher training programs get reapproved by the state over the last decade.
Few advocate for including programs so small that their value-added results aren’t statistically meaningful. But some experts say Louisiana should develop a more qualitative component to its evaluation that can capture even small programs—and provide a more multi-dimensional look at larger ones.
Larger institutions can be left with a false sense of superiority if they post high value-added scores in states where the overall quality of teacher preparation programs is weak, said Walsh.
“We need multiple measures to see if a program is effective,” she said. “Our view is that performance by these institutions is not good across the board and even the highest performing institution is not something you want to hold up as a model.”