Artificial intelligence, machine learning, neural nets, blockchain, ChatGPT.
What do all these new tools and technologies have in common? They run on the same fuel: data, and lots of it.
Netflix machine-learning algorithms, for example, leverage rich user data not just to recommend movies, but to decide which new films to make. Facial recognition software deploys neural nets to leverage pixel data from millions of images. A blockchain is in essence a large database, decentralized among many users. Generative AI algorithms, like those used to create ChatGPT, train on large language datasets.
Getting the data to fuel these technologies immediately leads to challenges with bias, accuracy, privacy and intellectual property rights. Since at least 2006, technology leaders and mathematicians have argued that data is the new oil. Similar to how petroleum is a key resource for physical products from fabric to shampoo, data is a vital resource for our digital lives and an increasing share of our offscreen lives as well.
In K-12 schools, students are facing an onslaught of emerging technologies — new developments arrive by the day — and yet we’re still teaching many of our core school subjects as if our daily lives are unchanged by these tools.
Even more concerning, our collective data literacy has actually declined over the past decade. Since 2011, national math test scores from the National Assessment of Educational Progress, or NAEP, have fallen by 17 points for eighth graders and 10 points for fourth graders in data analysis, statistics and probability. Pandemic effects were only a contributing factor, and the drop-offs outpaced declines in other content areas.
Achievement outcomes are also vastly disproportionate across race and income, with Black students behind white students by over 30 points in data analysis basics. For context, some researchers believe that a gap of just 10 points equates to a full school year of learning.
There are many reasons for these challenges, including a combination of outdated state standards and tests that incentivize teachers to push data-related content to the bottom of their lesson plan lists.
Predictably, this lack of prioritization surfaces in self-reported content emphases from educators nationally, which show that lesson plans dedicated to data analysis and statistics consistently draw the short straw in mathematics and other school subjects. This isn’t the fault of teachers, but rather of the system and the systemic choices we have made to date that heavily constrain classroom time.
The result is that student achievement has moved in the opposite direction of modern technology. We need to reverse this trend, quickly.
A number of schools and states across the country have been experimenting with the best ways to create and integrate data science programs for K-12 students. Full-year mathematics courses that focus on data science are being piloted in Ohio, Virginia and Utah; career and technical education sequences for data science have been added in Arkansas and Nebraska; data science electives extend computer science foundations in Georgia; data-embedded lesson plans across school subjects and grade levels are appearing in classrooms from coast to heartland.
These efforts all attempt to incorporate data analysis and computational technology into core school subjects, with a focus on mathematics, science and social studies. Importantly, they complement but differ from the approach of the K-12 computer science community, which has historically focused on building a stand-alone school subject. Many of these new programs enhance what a teacher already knows and can express about their own disciplines, adding datasets and technology as a way to deepen understanding.
Despite these efforts, programs in data science at the K-12 level remain few and far between. In a recent analysis of state programs, only nine states earned an “A” or “B” grade for the teaching of data science. A majority of states received a “D” or “F.”
Our country must do better. Our primary goal in K-12 should be to create a strong foundation in data literacy for every student before they graduate high school. Students should be equipped with the ability to interpret, work with, analyze and communicate data effectively. Students will carry those basic life skills across any career, any life situation and any form of civic participation for the long haul.
The goal is not to create an army of professional data scientists straight out of high school. Rather, it is to provide students with the necessary exposure to the data basics, and spark inspiration for them to pursue a two-year, four-year or graduate degree in these fields if they choose. The coursework should be challenging but accessible — “low floor, high ceiling.” The 51 percent of students who won’t complete any college degree in the near future should still learn the basics and be inspired to explore low-cost digital training opportunities to learn technical skills and earn rewarding jobs.
Importantly, students have reported actually enjoying data science courses. A National Academy of Sciences summit recently cataloged the field’s growing diversity of curricular approaches, with a consistent theme that student engagement is off the charts.
A mathematics teacher told us that in over 20 years of teaching, she had never before had a student ask for an internship related to her course — until she taught data science.
Students stop asking “Why do I have to learn this?” and instead ask, “What’s next?” Some teachers even report students moving through material faster than anticipated.
We need to act quickly to get these opportunities to every student and to support educators with the right resources to teach data literacy and science well. Our students are counting on us to help them prepare for a future that is already here.
Zarek Drozda is the director of Data Science 4 Everyone, a national initiative based at the University of Chicago.
This story about teaching data science was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.