Commentary: Data mining can improve our schools. Will Congress allow it?

Bills introduced in the House and Senate could pave the way for more innovative uses of education data – but Congress needs to take a more enlightened approach to overly restrictive provisions.

New York state put out a request for proposals for a new computer-based testing system starting around 2017.

Students in the United States perform worse in mathematics and just average in science and reading compared to much of the developed world, even though the United States spends more per student than all but four other countries. There is an obvious need to improve how the United States approaches education, but policymakers’ efforts to tap data from the classroom to improve educational outcomes have thus far been tepid at best.

In the past, as innovative efforts to use data to improve education began to take shape, those without a clear understanding of the benefits of data mining in education have stopped progress in its tracks, often because of misguided privacy fears. Now, bills introduced in the House and Senate could pave the way for more innovative uses of education data – but the bills simultaneously include provisions that restrict the potential for data-driven improvements to educational outcomes.

Instead of this lukewarm approach, policymakers should actively support data mining in education and preserve the capacity to use this data effectively, while pushing back against overly restrictive, and even damaging, efforts to protect student data in the name of privacy.


The Student Digital Privacy and Parental Rights Act of 2015, introduced in the House by Representatives Luke Messer (R-IN) and Jared Polis (D-CO), importantly recognizes that there is little potential harm in freely using de-identified student data and explicitly exempts anonymized data from any new restrictions. The bill also makes an effort to improve the quality of data collected in the classroom by allowing parents to access their child’s profile and correct faulty information. At the same time, the bill would prevent companies that operate education data services from disclosing student data for targeted advertising.

This provision no doubt has the right intentions, but it fails to grant exemptions for uses that many students and institutions would find helpful. For example, universities and job placement services could use this information to develop better outreach and recruiting efforts that could help increase student access to higher education and employment opportunities.

The Protecting Student Privacy Act of 2015, introduced in the Senate by Ed Markey (D-MA) and Orrin Hatch (R-UT), contains many provisions similar to those of its House counterpart, but also requires education stakeholders to adopt practices centered on data minimization, the concept of collecting, retaining, and using data only very narrowly. Specifically, the bill requires that all identifiable student data be destroyed when it is no longer needed for a specific, predefined purpose. Data minimization precludes analysis of detailed historical data, which can be an enormously valuable resource as educators encounter new challenges and researchers tackle problems in new ways.

For example, educators have long recognized that good early childhood education is beneficial, but research previously indicated that the benefits were only temporary, as indicated by a gradually decreasing boost to test scores. In 2010, researchers at the National Bureau of Economic Research revisited the same data used in previous studies—data on 11,571 student participants in an education experiment from the 1980s—and combined it with tax data to understand how kindergarten education affected adult outcomes, such as home ownership and income, rather than just test scores.

This analysis revealed that a good kindergarten education is far more beneficial in the long term than previously thought, with the potential to increase adult income levels and the rates of homeownership and college attendance.


These bills signal that some policymakers are aware and supportive of the benefits of education data mining, but that they still want to tread lightly and appease parents who may not be well versed in how data mining works. It is important to note that the majority of the privacy and security concerns that these bills attempt to mitigate are already addressed in the Family Educational Rights and Privacy Act of 1974, more commonly known as FERPA.

Still, some in Congress seem to be opposed to the concept of education data mining in general, and have introduced legislation to impose much more stringent limitations on the use of student data.

The Student Privacy Protection Act, introduced by Senator David Vitter (R-LA), would substantially reduce the valuable research potential of student data. Among several damaging provisions, the bill denies funding to education agencies that match student data with personally identifiable information from other government agencies, which effectively bans the practice.

This would actually prevent research like the aforementioned kindergarten impact study from happening, as the researchers matched tax records to student data. Furthermore, the bill requires that state longitudinal data systems, a valuable resource states use to monitor education outcomes, use only aggregate data. However, these systems already have their own privacy controls, and aggregating student data in these systems reduces their effectiveness.

As the vocal response to previous education data mining projects and Senator Vitter’s new bill indicate, there is a sizeable population that is either unfamiliar with, and thus fearful of, the technology involved or convinced that analysis of student data is necessarily a damaging privacy risk that cannot be handled responsibly. These sentiments rest on false premises, yet they nonetheless seem to have tempered other policymakers’ efforts to make real headway on the issue.


National and state policymakers and educators have a crucial responsibility to be at the forefront of support for innovative approaches to education, whether by funding pilot programs and research or simply by consistently communicating the benefits of mining education data. Without this kind of enthusiastic approach, education in the United States will continue to flounder.

Joshua New is a policy analyst at the Center for Data Innovation. His research focuses on methods of promoting innovative and emerging technologies as a means of improving the economy and quality of life.
