How do you test students during remote learning? I’ve heard about problems ranging from widespread cheating to technological glitches. So a recent study caught my attention because it may have landed upon a clever pandemic workaround that could also change the way many college professors administer exams even when we return to in-person learning.
First, I have to tell you about an unusual type of test called a two-stage exam. In a two-stage exam, students take the same exam twice in a row. The first time, students take it individually in the lecture hall, the traditional way. Then, working in small groups of three to five students, they answer the same questions again. Proponents say the student discussions (and animated arguments!) help everyone fix misunderstandings and errors and remember the material better. Skeptics say it’s difficult to schedule and wonder whether everyone is really learning enough during the group stage to justify devoting the extra class time to testing.
Two-stage exams sometimes go by other names, such as pyramid exams or collaborative testing, but whatever the name, the approach has slowly been gaining popularity in college science classes over the past decade. Both Harvard University and the University of California, Merced happened to be administering two-stage midterms in their introductory physics classes when the pandemic hit during the spring of 2020. (The UC Merced instructor, Kristina Callaghan, was a graduate student of Louis Deslauriers at Harvard and carried the two-stage practice with her to California.) The first midterm exam of the semester had already been completed in person at both institutions. The second midterm was still to come. Callaghan and Deslauriers agreed to try the two-stage format online.
Students, working from their homes, took the individual test directly on the computer at a specified date and time (synchronously) for an hour and a half. Then, in groups of three or four students, they had 24 hours to retake the test over a video app like Zoom, FaceTime or Snapchat. Some even completed the group exam during an old-fashioned telephone conference call. Students scheduled the collaborative stage at their own convenience, in effect retaking the test asynchronously.
The instructors documented what happened and compared the outcomes of in-person and remote group testing. Students gave similarly positive ratings to both formats when asked about their levels of engagement, collaboration and feedback. More importantly, students remembered the material long after the online version of the two-stage exam was over. For example, students at Harvard scored 10 percentage points higher, 94 percent on average, on a post-test three weeks after the midterm. That kind of improvement and long-term recall is consistent with previous studies on the benefits of two-stage exams.
“There are no more obstacles to doing two-stage exams,” said Deslauriers, one of the authors of the unpublished working paper documenting the results. “You don’t have to cut your midterm in half to make time for the group exam, or you don’t have to annoy your students by asking them to come at night. And don’t worry, students are going to enjoy it pretty much as they did in person.”
In the future, when in-person school resumes, Deslauriers said he would administer the individual part in person in the lecture hall, as before, but he might allow students to complete the group exam asynchronously on their own time.
During the remote-testing experiment, Deslauriers learned that students preferred completing the group portion of the exam without time pressure. “The main point is for you to get feedback on what you just did, so that you can learn a lot,” he said. “So if you need two hours, take two hours. If you need an hour, take an hour. Students all said without exception, even those who did it quickly, they all said they really appreciated the fact that they weren’t stressed.”
In other words, the untimed group exam improved the quality of the feedback that students received from each other. Deslauriers believes that the instant feedback students get during the second stage is what makes the two-stage exam so powerful. By reviewing the questions with their peers, they learn what they got right and wrong and correct their mistakes.
There’s also the added excitement of having to arrive at a consensus answer with peers. Deslauriers told me that he has seen otherwise quiet students getting into animated arguments, defending their approach to a problem. That combination of high student engagement with instant feedback seems to be particularly beneficial for learning.
I wondered about cheating and whether students looked up answers online together. But Deslauriers said he didn’t detect cheating during the group portion of the exam at either campus. Students agree to honor codes at both schools, and perhaps, in the group setting, peers served as potential witnesses and discouraged bad behavior.
Deslauriers says there are far fewer group dynamic problems during collaborative exams than there are during typical classroom group work. “When you do active learning in the classroom, which I have a decade of experience doing, you can see there are free riders, people who just don’t contribute, and then you have to try to intervene and manage the groups,” he said. “It’s a big part [of teaching] to manage the groups to make sure they’re productive. Guess what, when you do the group exam, you don’t have to intervene with a single group. Does it still happen? I’m sure it does. But I would say 20 percent of what it was.”
Deslauriers thinks the high stakes motivate students to participate. The group stage is worth 20 percent of the midterm exam grade in his classes. Once, when he set the stakes too low, students didn’t care or work hard enough on it. But he says stakes that are too high also stifle collaboration. “It’s important to get that right,” Deslauriers said. “It’s possible with a different student population that you might want to do 40 percent.”
Designing an exam at just the right level of difficulty, neither too hard nor too easy, is important too. When instructors make the exam too easy, he says, there’s nothing for the students to discuss. “It completely kills the productiveness of the collaboration,” he said. “I’ve seen [two-stage exams] done wrong very often, but when it’s done right, it’s active learning on steroids.”
This was a small experiment involving 330 students at two colleges and needs to be replicated. But if you are wondering how remote learning, with all of its frustrating disappointments, might change education, the answer could be a bunch of small technological innovations that teachers figured out on the fly.
This story about two-stage exams was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.