White House: Want data science with impact? Spend ‘a ridiculous amount of time’ with people

White House Chief Data Scientist DJ Patil says effective data science demands a collaborative process with citizens.

OAKLAND, Calif. — On Thursday, White House Chief Data Scientist DJ Patil met with civic technologists at the Code for America Summit to underscore the need for collaborative design and to offer advice on effective data science for government.

LinkedIn’s former head of data products kept his answers trained on practical advice and avoided any “what’s next” questions about President Obama’s impending departure. Patil focused on the emerging role of data as a tool for change. The U.S. has always had data scientists, he said, but a new era of open data, application programming interfaces and exponential growth of new technologies is elevating their role.

“What’s new is how we actually take and bring this data together across the government, open it back up, put it out there for the public and then allow people to build with it,” Patil said. “That requires a different approach of not just the economists and statisticians, but for us to think, ‘How are we going to put data out so people can build with it?’”

This expanded skill set lets federal and local governments draw a wider variety of services from the private sector, nonprofits, academia and civic hacking groups, Patil explained. He pointed to the Obama administration as a living example of an entity that pushed its data programs forward through this kind of collaboration.


One example is the president’s Police Data Initiative, a program launched in 2015 that has partnered with more than 50 police departments to release data on crime and officer use of force. Another is the Opportunity Project, launched in May to act as a hub for open data apps and high-value data sets. In June, Patil announced Data Driven Justice, an initiative that received commitments from 67 state and local governments to use data analytics to divert or bail out low-level offenders from jail, a move that could reduce the roughly $22 billion in annual incarceration costs shouldered by local governments.

The activity in this space will increase the demand for chief data scientists and chief data officers as cities look for new ways to serve their growing populations, Patil said.

“If we tell the country that we have a problem and we don’t know how to solve it [and say,] ‘We need your help to solve it,’ then we see new ways we can open up the data for people to actually combine it with other data sets to be able to solve problems,” Patil said. “The trick to getting that conversation to happen is to first get the data scientist to be in the room.”

Yet just “getting it out there” isn’t enough. Patil explained that effective analytics almost always requires human participation from end users, citizens, government employees and anyone else who stands to benefit from the data.

“With our criminal justice initiatives, whether it’s our Police Data Initiative or Data Driven Justice, those all originated because we literally spent a ridiculous amount of time with people,” Patil said.


The tactic of deep participation with all involved parties was cemented as a best practice during the beginnings of the White House’s Precision Medicine Initiative (PMI), he explained. When officials first began to plan that program, which works to develop medical treatments for patients based on their genetic makeup and lifestyles, one of the critical questions was who would participate. Hospital leaders, government agencies, researchers and companies wondered whether it was wise for average citizens, such as caregivers and individual patients, to be part of a national cohort for feedback, Patil recalled.

“When that first came about, the whole community said, ‘Wow, wait, you know we usually have a group that advocates on behalf of those people,’ and I said, ‘No, they are going to be at the table,’” Patil said. “And that changed the dynamic dramatically when we thought about what is the patient’s access to data and the responsibility we had to them throughout the whole process.”

Further, having users at the table inspires data scientists and developers to identify effective data sets and iterate on those they’ve already published, he said.

“It’s when we do this collectively,” Patil said, “and the team is all engaged when we get to that experimental frame of mind.”
