Three rules for robograding, and other ed-tech innovations

Automated grading of students’ essays and short answers has been around for almost 20 years, but it’s engendering quite a bit of controversy as the technology progresses. A large study released in April, connected to a competition sponsored by the Hewlett Foundation, showed that the grades assigned by a computer program to 22,000 7th, 8th and 10th grade essays matched very closely with the grades assigned by trained teachers to those same essays. The results have been watched eagerly, as the phase-in of Common Core state standards brings with it new assessments that require students to write more, and at greater length. But states aren’t necessarily getting more money to pay for grading the tests.

Money is not the only issue. Any teacher will tell you that grading papers is not generally the most beloved part of the job, and the time that it takes means that students often must wait days, weeks, or months to get feedback on their written work, which is not ideal for learning.

graphic by Vic Paruchuri

Automated essay scoring (AES) engines do not read essays. They use “machine learning” algorithms, trained on numerous examples, to match general characteristics of essays (such as spelling, sentence length, subject-verb agreement, or the use of particular words and phrases in response to a prompt) to similar essays assigned a given grade by humans. Their results are available instantly.
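To make the idea concrete, here is a minimal sketch in Python of scoring by surface features. It is an illustration only, not the method of any particular AES engine: the three features, the `PROMPT_TERMS` vocabulary, and the function names `extract_features` and `predict_grade` are all assumptions made for this example, and real systems use far more features and more sophisticated models. The sketch gives a new essay the average grade of the human-graded essays whose surface features are closest to its own.

```python
import math
import re

# Hypothetical vocabulary a prompt about college might reward (an assumption).
PROMPT_TERMS = {"college", "learning", "education"}


def extract_features(essay):
    """Reduce an essay to a few surface numbers; meaning is never examined."""
    words = re.findall(r"[A-Za-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    word_count = len(words)
    avg_sentence_len = word_count / len(sentences) if sentences else 0.0
    prompt_hits = sum(1 for w in words if w in PROMPT_TERMS)
    return (float(word_count), avg_sentence_len, float(prompt_hits))


def predict_grade(essay, graded_examples, k=3):
    """Average the human grades of the k essays with the most similar features.

    graded_examples is a list of (essay_text, human_grade) pairs.
    """
    if not graded_examples:
        raise ValueError("need at least one human-graded example")
    target = extract_features(essay)
    nearest = sorted(
        graded_examples,
        key=lambda ex: math.dist(target, extract_features(ex[0])),
    )[:k]
    return sum(grade for _, grade in nearest) / len(nearest)
```

Because only surface features are compared, two essays with the same word count, sentence length, and prompt vocabulary receive the same predicted grade in this sketch, whether or not either one makes sense.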

Recently, Vic Paruchuri, one of the creators of a winning AES system who currently works on automated essay grading for the MOOC platform edX, wrote about the lessons learned from automated grading so far.

Although he is an interested party, he makes some great points about the power and limitations of these technologies. What he says can also apply to many other areas of ed tech.

“The goal is to maximize student learning and limited teacher resources (time) in a way that is flexible, and under the control of the subject expert (teacher),” he writes.

1) Use it transparently: Paruchuri argues the code should be open-sourced to allow all interested parties to understand, as far as is possible given their technical nature, both these technologies’ potential and their limitations. “The less we tell people about how things are done, the more valuable and important we become,” he says with tongue in cheek, a statement that could apply to many of the ed-tech “solutionists” out there.

Instead of a proprietary black box that hands down a solution from on high, publishing the code, its documentation, and its error rates gives teachers and students insight into the process. Careful design for use by the non-expert can even allow teachers to contribute fixes that make the programs more accurate.

2) Use it in combination: As illustrated in the image below, Paruchuri suggests that the best way to use a tool like automated essay scoring is in combination with other forms of alternative assessment, like peer assessment and self assessment. The teacher and student should always be in the driver’s seat.

graphic by Vic Paruchuri

3) Use it flexibly: Automated essay grading can be used in outside-the-box ways. A student could revise and resubmit an essay ten or a dozen times, getting instant feedback each time, for example. Used this way, the scoring engine would be like spellcheck, formatting guides, word count, or other word-processing tools. Teachers should be able to choose not to use automated essay grading for certain students and certain assignments. Students and teachers should be able to tweak the rubric and see how emphasizing certain points in an assignment relates to certain results in student writing.
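As a rough illustration of what tweaking the rubric could look like, the Python sketch below combines per-criterion scores using adjustable weights. The criterion names, the scores, and the function `rubric_score` are hypothetical and chosen only for this example; they are not taken from Paruchuri's system or from any real grading tool.

```python
def rubric_score(component_scores, weights):
    """Weighted average of per-criterion scores.

    component_scores and weights are dicts keyed by criterion name.
    Raising a criterion's weight makes it count for more in the result.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(component_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight


# Hypothetical essay: strong mechanics, weak argument.
scores = {"grammar": 4, "argument": 2}

print(rubric_score(scores, {"grammar": 1, "argument": 1}))  # equal emphasis
print(rubric_score(scores, {"grammar": 1, "argument": 3}))  # emphasize argument
```

Running the same essay through both weightings shows how the overall score moves when the teacher decides that argument matters more than mechanics.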

One great assignment would be for students to follow the work of Les Perelman, a retired writing professor and longtime critic of automated grading, who excels at producing nonsensical essays that fool the algorithms. One of his essays that earned a perfect score from the computer read, in part: “In today’s society, college is ambiguous. We need it to live, but we also need it to love. Moreover, without college most of the world’s learning would be egregious.”

Spotting the logical holes in writing that only appears to be erudite, not to mention the “compositions” of spambots, is a key task of the modern world. An advanced computing assignment would be to produce a robo-essay writer whose essays earn perfect grades from the robo-grader.

None of these tactics necessarily fit the demands of standardized testing, but they do serve the larger cause of teaching and learning. Automated grading has the potential to serve the collaboration between teachers and students in the lifelong task of improving writing.