The Hechinger Report is a national nonprofit newsroom that reports on one topic: education.

Ken Koedinger, project leader of LearnSphere, and a professor of human computer interaction and psychology at Carnegie Mellon University (Photo: Carnegie Mellon University website)

LearnSphere, a new $5 million federally funded project at Carnegie Mellon University, aims to become “the biggest open repository of education data” in the world, according to the project leader, Ken Koedinger.

If you think that sounds ambitious and a lot like inBloom, the Gates Foundation-funded non-profit that shut its doors in 2014 after student privacy fears escalated, you’re right.

“There certainly are some similarities,” said Koedinger, a professor of human computer interaction and psychology at Carnegie Mellon University.

An internationally renowned leader in the field of education technology, Koedinger is known for developing the mathematical models that drive “cognitive tutors,” which tailor instruction to individual students. He’s also a co-founder of Carnegie Learning, an educational software business that was spun off from Carnegie Mellon in 1998 and whose software is currently used by 400,000 students.

Koedinger launched LearnSphere earlier this year with the hope of making it easier and faster for researchers to analyze big datasets, mostly student keyboard clicks, in order to test educational theories and boost learning outcomes from elementary school to college. Just as inBloom had hoped software makers and researchers would use its vast database to improve education technology, Koedinger also wants to create a forum for sharing and analyzing data on how students learn. But, he says, there are important differences between LearnSphere and inBloom.

For one, he says he’s not going to allow any personal information from school records in LearnSphere.

“In some ways, it’s a deep philosophical difference,” Koedinger said. “We are not looking that much at collecting demographic data and certainly not any kind of record information. Those are the things that tend to be particularly sensitive.”

No student names, no addresses, no zip codes, no social security numbers, he says. No race, family income or special education designations. “The student identifier column, even if yours is already anonymized, we re-anonymize it automatically,” he added.

There may be demographic information on a school — for example, the percentage of students who qualify for free or reduced-price lunches. But Koedinger says that even the school name is anonymized in most cases.

Unlike inBloom, which wanted public school districts to use its servers to store student information, Koedinger has no plans to store school records and doesn’t anticipate that school officials will upload anything to his virtual warehouse of data. Instead, he wants education researchers and software developers to upload their data. That data consists of the keyboard clicks recorded as students use educational software: the millions of keystrokes they make as they answer questions, hit backspace or sit idly, daydreaming and uninterested.

This new university-driven data repository builds on earlier data projects at Carnegie Mellon, Stanford and the Massachusetts Institute of Technology, all of which are partners in LearnSphere. The University of Memphis has joined as a team member, too.

Koedinger’s team isn’t building a physical warehouse in one single location. Those who want to share data can upload it to one of the sites that LearnSphere is managing, or they can keep it on their own server and control who gets access to it. The goal is to build something called a “distributed infrastructure,” which allows researchers access to data on someone else’s computer. The hard work for Koedinger’s team is in cleaning up the data so that outside researchers can analyze it easily.

Regardless of where the data is stored, Koedinger says his research manager will go through the data with a checklist to make sure no information that could identify a student is attached to data that is being shared. And, he says, this manager will continue to monitor data for improper additions.

The ultimate goal is to translate research questions into computer commands that can be run on any dataset. For example, how many times does a student need to repeat or practice something before it becomes knowledge? Or when is the optimal time to give feedback, right away or after a bit?

At the moment, Koedinger is working on creating an example of the kind of research project he would like to see housed by LearnSphere. He recently studied how much students learned when they were taking a free online course, a MOOC, in introductory psychology. He asked what increased student learning the most: videos, reading assignments or online interactive tasks?

“Most instructors are spending their time on videos. But our model suggests, for every activity you do, you get six times the bump than for every video you watch,” said Koedinger. “Maybe someone will say, ‘I don’t believe it for my course, I think the videos are more valuable.’ Let’s see for yourself with your own data and see what you get.”

Koedinger hopes that with a simple press of a button, researchers can rerun that same question on a different course without spending months collecting and cleaning the data. He supposes it’ll take a “year or so” before that’s a reality.

