UNC School of Media and Journalism Reese Felts Distinguished Associate Professor Ryan Thornburg is the driving force behind the creation of NC Votes, a web-based project that turns voting data into public insights about voting and elections in North Carolina.
The initiative evolved from Campaign Hound, a project developed in the Reese News Lab, where Thornburg serves as director, and is funded by the John S. and James L. Knight Foundation as part of its News Challenge program.
“I'm a strong believer in the idea that, if we can organize data that's available and turn it into useful products, we can both find new revenue models for journalism as well as increase the efficiency of doing reporting,” Thornburg said. “It lets machines do what machines are good at and allows humans to focus on the more important how and why things are happening, not the who, what, when and where that we can find in the data.”
The MJ-school sat down with Thornburg to discuss his involvement with NC Votes, where the project stands and what to expect in the coming months.
Q: How did the project get started?
A: I was sort of the ringmaster of bringing together a lot of different people. We had a really good diversity of people from across campus. We have Scott Smith, a master's student in the statistics department, and Bill Shi, who works in the Odum Institute for Research in Social Science. They've both been doing our data analysis and data cleaning. David Raynor, the database editor at the News & Observer, has been really useful in terms of talking with us about how he and his colleagues would like to use this.
We've hired some outside contractors to do data development work and web development work and design. We've worked with the 1893 Brand Studio at The Daily Tar Heel to help do some design work on the website that we're launching and help us think about our publication communication strategy.
And then we have journalism students thinking about what stories they can write. This is where the reporting students are really adding value to the analysis by doing interviews and understanding what the flaws are in the data and what we shouldn't say.
Q: What was your goal in launching NC Votes?
A: We wanted to create tools for journalists to be able to use the data more easily and answer questions that we know that journalists have, and also to help citizens connect to each other. We hear a lot about the state being divided, either Republican/Democrat or rural/urban, but could we begin to use data to describe voters in different ways than just Democrat/Republican or rural/urban? Are there some shared concerns that, if we can just figure out how to connect groups of people who are maybe isolated from each other, might have some positive effect on civil discourse in North Carolina?
We're a long way from doing that, but we're in the first steps. This is a long-term project, and we've got to take the first steps of acquiring and cleaning and organizing the data. One of the things we really believe in the Reese News Lab is iterating. You put out something, figure out what works, build on that success. So that's the process we're in the middle of right now.
Q: How did you get all of the data?
A: All of the data that we have is coming from the N.C. State Board of Elections. They do a really great job of making data available on their website.
What we've done is written programs that automatically pull the data down, look for changes in the data and then organize it and put it together in one big database so that we can build relationships between different pieces of data that the board keeps separate on its website. By connecting the data and looking for patterns across the data sets, it's much easier for us to answer interesting questions.
So for example, if we want to look at how the voter turnout changed in precincts where more retired white men moved in between 2012 and 2016 — we can do that now that we've got it all together.
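The workflow Thornburg describes (pull files down, check whether they changed, load them into one database, then ask questions across data sets) can be illustrated with a minimal sketch. It is not the project's actual code. The table names `voters` and `history`, their columns, and every function name below are hypothetical, and the Board of Elections' real file layout is not shown here. For simplicity the sketch measures turnout against a single voter roll for both years and leaves out demographic filters.

```python
import hashlib
import sqlite3
from typing import Optional


def fingerprint(raw: bytes) -> str:
    """Return a SHA-256 digest of a downloaded file's bytes."""
    return hashlib.sha256(raw).hexdigest()


def has_changed(raw: bytes, previous_digest: Optional[str]) -> bool:
    """True when the file differs from the last copy we saw (or is new)."""
    return fingerprint(raw) != previous_digest


def build_db(voters, history):
    """Load two separately published data sets into one in-memory database.

    voters:  iterable of (voter_id, precinct)       -- hypothetical layout
    history: iterable of (voter_id, election_year)  -- hypothetical layout
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE voters (voter_id TEXT PRIMARY KEY, precinct TEXT)")
    conn.execute("CREATE TABLE history (voter_id TEXT, election_year INTEGER)")
    conn.executemany("INSERT INTO voters VALUES (?, ?)", voters)
    conn.executemany("INSERT INTO history VALUES (?, ?)", history)
    return conn


def turnout_by_precinct(conn, year):
    """Share of registered voters in each precinct who voted in `year`."""
    rows = conn.execute(
        """
        SELECT v.precinct,
               COUNT(DISTINCT h.voter_id) * 1.0 / COUNT(DISTINCT v.voter_id)
        FROM voters AS v
        LEFT JOIN history AS h
          ON h.voter_id = v.voter_id AND h.election_year = ?
        GROUP BY v.precinct
        """,
        (year,),
    ).fetchall()
    return dict(rows)


def turnout_change(conn, start_year, end_year):
    """Difference in turnout share per precinct between two elections."""
    before = turnout_by_precinct(conn, start_year)
    after = turnout_by_precinct(conn, end_year)
    return {precinct: after[precinct] - before[precinct] for precinct in before}


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample_voters = [("1", "A"), ("2", "A"), ("3", "B"), ("4", "B")]
    sample_history = [("1", 2012), ("1", 2016), ("2", 2016),
                      ("3", 2012), ("4", 2012), ("3", 2016)]
    db = build_db(sample_voters, sample_history)
    print(turnout_change(db, 2012, 2016))
```

Joining the two tables on a shared voter identifier is what makes a cross-data-set question such as precinct-level turnout change a single query rather than a manual comparison of separate files.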
Q: Where is the project right now?
A: We're in the steps of acquiring and organizing the data right now. That means getting the data in a format that we can use to do the analysis down the road, so that's kind of the next thing that we're going to start to do.
Q: How would journalists use NC Votes?
A: A lot of times, when a reporter goes out to do a story, they interview whoever's readily available. And that often ends up being people that are already in positions of power or people that already have an agenda that they want to promote. Those are maybe the most active citizens.
But with this tool, we can randomly select voters for journalists to interview, or at least reach out to them to see if they want to talk and have their voices become part of the public discourse. That way journalists know if they're reaching a diversity of voices in their community. We want to build search interfaces and maps that can answer a lot of the basic questions that a journalist might have when they're working on a story.
Is voter turnout up or down in certain demographics? How is our community different than the community next door? How have voter preferences changed over time? A lot of times, these are little sentences that a journalist might want to throw into a story, but right now it takes too long for them to get that context. Wouldn't it be so much nicer if journalists could just go to a site, get a quick answer, put it into the story and then everybody that consumes that news story gets that context?
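The random selection of voters mentioned above could be as simple as a random sample without replacement from the voter list. The sketch below is an assumption about how such a feature might work, not a description of the NC Votes tool. The function name `sample_voters` and the `seed` parameter are hypothetical.

```python
import random


def sample_voters(voters, k, seed=None):
    """Pick up to k distinct voters uniformly at random.

    A fixed seed makes the draw repeatable, which lets an editor
    verify later that the interview list was not hand-picked.
    """
    rng = random.Random(seed)
    pool = list(voters)
    return rng.sample(pool, min(k, len(pool)))


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    roll = [f"voter-{i}" for i in range(1000)]
    print(sample_voters(roll, 5, seed=2016))
```

Because every voter on the list has the same chance of being drawn, the sample is not skewed toward the most visible or most active citizens. A sample stratified by precinct or party would be a natural extension.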
Q: How would North Carolina residents use NC Votes?
A: A lot of times, we surround ourselves with other people that agree with us. That's a natural and comfortable thing to do. But maybe there's a way that voters can see through this data, if we visualize it right and tell this story correctly, that there's a diversity of opinions in North Carolina — that even though we maybe hold a diversity of perspectives, there are some things that we can agree on.
Are there ways that we can identify voters that might be different on a lot of things and have them talk to each other about issues that they find common ground on? We're looking at building sites that describe the environment that an individual voter is in. So they might come to the site and say, "My precinct is the most Republican precinct," or "We have undecided voters that always vote Democratic in our precinct." They could begin to understand how their vote matters in a larger context.
Q: What can other states draw from this project?
A: The No. 1 thing is to make the data public and available in a machine-readable format that's current and that's available to anybody on the web. We'd really be sunk if the state of North Carolina didn't have great public records and the Board of Elections wasn't very transparent about what they do. And I think that's the kind of thing that gives people trust in democratic institutions.
The lesson for journalists is to begin to understand and figure out how to incorporate this data into reporting, and how to create tools one time that can be used over and over again, so that every time they have a question, they don't have to reinvent the wheel. So really what we're doing is looking at creating efficiency around this kind of reporting. And that's something that I think other states can do if they collaborate. You have to be able to pull people together and collaborate on this kind of stuff.
Q: How might NC Votes look different now than it will in a year?
A: Right now, the site is really for the nerds. It's got a lot of computer code on it, and it has some basic data. But over the next couple of months, we're going to slowly be adding pieces of data to the site that people can use. We're going to take the Board of Elections data and clean it up and each time we do that, we're going to make it available and tell people about it. And then we're going to start adding web-based tools and web apps that people can come and use. Right now, it's really a site for people that want to nerd out with us on elections.
We're looking for civic coders, we're looking for academics, we're looking for students that are interested in working on either politics or data analysis. We're looking for journalists around the state and just curious citizens to let us know what they need. Now that we've got the data cleaned and acquired, how do we build things to suit their needs and to solve their problems, not what we imagine might be their problems? I think that's really important. The reason we're putting this stuff out before we have anything built is so that people can help guide us to build products that they actually would use.