In New York City, there’s a big debate over who should gain admittance to eight elite public high schools, including the well-known Stuyvesant High School and the Bronx High School of Science. Currently, Asian-American students score high enough on an entry exam to win a considerable majority of the seats. Mayor Bill de Blasio and a new school chancellor want to bring in more black and Latino students, who make up most of the city’s school population.
This tension between demographics and academic excellence is prompting scholars to take a closer look at the data on scores and grades and how well the entry exam predicts achievement. But one researcher thinks the most consistent bias might be against girls.
“They’re trying to identify the very top students but every way you slice the data, it shows bias against girls,” said Jonathan Taylor, a research analyst at the Gender Equity Project at Hunter College of the City University of New York. “I’m not saying that the exams are racist or sexist. But we’re talking about statistical bias. The argument I’m making is that the test, as a sole criteria — it’s insufficient.”
Taylor’s data analysis has implications beyond high school admissions in New York City. It adds to a growing body of evidence that high-stakes testing based on multiple-choice questions might be one reason for the small pool of women reaching the upper echelons of math and science professions.
More on that after I explain Taylor’s study, “Fairness to Gifted Girls: Admissions to NYC’s Elite Public High Schools,” which has completed the peer-review process and is set for publication in the Journal of Women and Minorities in Science and Engineering. It is expected to appear in a forthcoming 2019 issue, and I was given an advance copy.
Taylor didn’t deconstruct the text or math content of the questions on the Specialized High School Admissions Test (SHSAT), which is the sole criterion for getting into one of the eight selective high schools. The “bias” here is that among students who posted identical scores on the test, girls tended to have higher grades than boys, meaning that the test is consistently underestimating the achievement of girls.
Among all test takers with the same score – regardless of whether they were admitted to a specialized high school – the subsequent ninth-grade grades of girls tended to be 4.2 points higher on a 100-point scale than those of their male peers. For example, among kids who the test indicated were of the same academic aptitude, girls might have an 89 grade-point average (GPA) versus an 85 GPA for boys. At Stuyvesant, the most competitive school to get into, the boys’ GPAs were 3.55 points lower than the girls’ GPAs among students who had identical test scores. On the flip side, among girls and boys who earned the exact same grades at Stuyvesant, girls’ entry exam scores were 6.6 points lower.
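To make the comparison concrete, here is a minimal Python sketch of the kind of score-matched calculation described above. The records and the function name `gpa_gap_by_score` are invented for illustration; they are not Taylor’s data or his method, which this article does not spell out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (test_score, gender, ninth_grade_gpa) records, made up for illustration.
records = [
    (500, "F", 90.0), (500, "M", 86.0),
    (500, "F", 88.0), (500, "M", 84.0),
    (550, "F", 93.0), (550, "M", 89.0),
    (550, "F", 92.0), (550, "M", 87.0),
]

def gpa_gap_by_score(rows):
    """Return {score: mean girls' GPA minus mean boys' GPA} for scores with both groups."""
    groups = defaultdict(lambda: {"F": [], "M": []})
    for score, gender, gpa in rows:
        groups[score][gender].append(gpa)
    return {
        score: mean(g["F"]) - mean(g["M"])
        for score, g in groups.items()
        if g["F"] and g["M"]  # skip scores where only one gender is present
    }

gaps = gpa_gap_by_score(records)
# Simple unweighted average of the per-score gaps.
overall_gap = mean(list(gaps.values()))
```

On these made-up records, girls’ GPAs run 4.0 points higher at a score of 500 and 4.5 points higher at 550, for an average gap of 4.25 points. A positive gap among students with identical scores is what it means to say the test underestimates girls’ later grades.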
It was the same pattern at the other seven selective high schools: boys arrived with higher scores; girls earned higher grades.
Taylor said the test’s bias against girls applied to Asian-American girls, too, explaining that lower achieving boys were taking slots that Asian-American girls might otherwise have earned.
Taylor began with data for all 28,000 eighth-graders who took the SHSAT in 2013. (Roughly one-third of the city’s 81,000 eighth-graders chose to take this optional exam.) Then, he compared scores for each student with their eventual ninth-grade grades. Some of these students ended up at one of the elite high schools but most didn’t.
Overall, the correlation was a loose one. Test scores explained only 20 percent of the variation in students’ GPAs. In other words, students with the same high test scores had wildly different GPAs at school the following year. At first glance, the test doesn’t seem very good at discerning A students from B students. Seventh-grade GPAs were about twice as good at predicting ninth-grade achievement as test scores were.
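For readers who think in correlations, the 20 percent figure can be converted with a little arithmetic. The sketch below assumes that figure is the R-squared from a simple linear fit of GPA on test score, which is my reading rather than something the study summary states; the variable names are illustrative.

```python
import math

# Share of GPA variation accounted for by test scores (the 20 percent figure above).
r_squared = 0.20

# Under the assumed simple linear fit, the correlation's magnitude is sqrt(R^2).
r = math.sqrt(r_squared)

# Share of GPA variation the test leaves unaccounted for.
unexplained = 1 - r_squared
```

Under that assumption the correlation comes out to roughly 0.45, and about 80 percent of the variation in ninth-grade grades is left unaccounted for by the test.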
“People say the SHSAT is objective and that grades are unreliable,” Taylor said. “Schools and teachers have different subjective grading standards and grades are all over the place. The exams were designed to be a uniform metric. It’s ironic that the exams don’t predict as well as grades.”
One might wonder if girls could be taking easier classes or not as many math and science classes once they get to high school, and perhaps that is why girls are getting higher grades. But Taylor checked and he found that girls were, in fact, well-represented in math and science classes in ninth grade and doing very well in them.
Indeed, the most startling finding is that girls were over-represented among the very top A-plus students in science and math classes at the specialized high schools, but were underrepresented among the very highest scorers on the exam. Specifically, girls accounted for only 40 percent of the top 3 percent of exam scores but for half of all the 95s and above in ninth-grade math and science classes, such as geometry, algebra, biology and physics. That’s all the more surprising because girls accounted for only about 40 percent of all the students in these elite schools, yet they earned half of the highest grades.
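The size of that over-representation is easy to work out from the two percentages. This is a back-of-the-envelope sketch, not a calculation from the study: it assumes girls and boys take about the same number of math and science classes, and the variable names are my own.

```python
# Figures quoted above: girls are about 40 percent of students at the
# specialized schools and earn about half of the grades of 95 and above.
girls_share_students = 0.40
girls_share_top_grades = 0.50

# Share of top grades divided by share of enrollment (1.0 means proportional).
girls_index = girls_share_top_grades / girls_share_students
boys_index = (1 - girls_share_top_grades) / (1 - girls_share_students)

# How many times more often a girl earns a top grade than a boy does,
# assuming both take a similar number of these classes.
relative_rate = girls_index / boys_index
```

On those assumptions, girls earn top grades at about 1.25 times their share of enrollment and boys at about 0.83 times theirs, so a girl is roughly one and a half times as likely as a boy to post a 95 or above.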
Taylor’s findings here shed some light on a bigger debate about women in math and science. Back in 2005, Lawrence Summers, then president of Harvard, controversially suggested that there were fewer women in science, technology, engineering and mathematics fields, often abbreviated as STEM, because there is a smaller pool of highly talented women at the very, very top, and that this was perhaps related to women’s “intrinsic aptitude.” By contrast, this study of exceptional students in New York City didn’t find evidence of a smaller pool of super-bright girls in STEM. Even in STEM classes at Stuyvesant, Brooklyn Tech and Bronx Science, girls were more likely than boys to post the highest A-plus grades.
This research echoes a large body of research on the SAT, the college admissions test, which has similarly found that boys outscore girls on the SAT but girls earn higher grades in college. “It’s precisely the same problem,” said Taylor. For example, a 1992 study in the Harvard Educational Review found that among 47,000 college students in 51 colleges, women who earned the same grade as men in the same math course had lower SAT math scores.
College professors are also noticing versions of this testing problem in their large lecture classes. University of Michigan physicist Timothy McKay found that even the brightest straight-A women underperformed male peers in his physics class that was based on two exam grades, but women outperformed men in lab sections where there was no testing pressure.
Why women are doing worse on high-stakes tests is a matter of conjecture. McKay speculated that something called “stereotype threat” is at play: even the brightest women may not perform at their best when they feel that they are in a stressful environment where women don’t traditionally succeed.
Taylor subscribes to a “risk aversion” theory that women prefer not to guess when they aren’t confident of an answer. Boys’ greater willingness to guess might be just enough to juice their test scores. Taylor is hoping to get access to SHSAT answer sheets to see if he can detect gender differences in unanswered questions.
This story about women in STEM was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.