I was reminded today by an email from a colleague of a famous quote:
I waste half my money on advertising; the trouble is, I don’t know which half
– John Wanamaker
There has been a lot of talk around this latest US election. Both parties spent billions of dollars vying for a couple of percentage points in the vote. Prior to this election, the most accurate predictions were thought to come from pundits and those heavily involved in either campaign.
Why this election was different
Serious analysts like Nate Silver are using increasingly accurate statistical methods to predict not only the sway of public opinion, but also voter behaviour. This has led some, including the UK’s Guardian, to predict that punditry is dead. To put this in context, Nate Silver correctly predicted the outcome in all 50 states – every single one – in a country of over 300 million people. What’s even more intriguing (or exciting if you’re a math geek like me) is that he managed to call most individual Senate races correctly too.
Unlike UK politics, where voters tend to vote by party and can seldom tell you the name of their local MP, US Senate races generate a lot of local discussion within each state. I used to live in Wisconsin and still have many friends there, whose chatter about state politics is far livelier than anything I hear over here in the UK. Senate and House races are decided on individual candidates’ actions during the campaign, and on local issues. You can see this in how often the Presidential vote differs from the results in the House of Representatives.
So how did Nate Silver do it? How did one man predict all these races with such high accuracy? No one truly knows exactly how he managed it. He certainly didn’t just rely on talking to local activists, as Karl Rove did – an approach that earned Rove and the Republicans unwanted coverage when he disputed the results live on Fox News during election night. Some have said that his analysis relied on aggregating polls and that he tested hypotheses using Bayesian methods. Whatever methods he used, he’s unlikely to give away his ‘secret sauce’, especially given his support for the Democrats.
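To give a flavour of what Bayesian poll aggregation means, here is a tiny Beta-Binomial sketch. The poll numbers are invented, and this is only an illustration of Bayesian updating, not Silver’s actual model:

```python
# A minimal sketch of Bayesian poll aggregation (an illustration of
# the idea, not Nate Silver's actual method). Each poll is treated as
# a Binomial sample, and a Beta prior over the candidate's vote share
# is updated with every poll (Beta-Binomial conjugacy).

# Hypothetical polls: (respondents favouring candidate A, sample size)
polls = [(520, 1000), (489, 950), (610, 1200)]

alpha, beta = 1.0, 1.0  # flat Beta(1, 1) prior over A's support
for favour, total in polls:
    alpha += favour          # 'successes' raise alpha
    beta += total - favour   # 'failures' raise beta

posterior_mean = alpha / (alpha + beta)  # expected vote share for A
print(f"Posterior mean support for A: {posterior_mean:.3f}")
```

Each new poll simply shifts the posterior; the more data, the less any single outlier poll matters, which is one reason aggregation beats cherry-picking.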
The thing is, I don’t think there would have been enough local polls over time for him to base his predictions on. He would have had to perform analysis over people’s opinions and how they change, and track these over time and geospatially. In short, he would have had to collect public sources of information, such as Twitter and Facebook. This is called ‘Open Source Intelligence’, and it is increasingly used by organisations to determine where to focus their efforts, or, in Nate Silver’s case, to predict the likely outcome of an election.
What does this result tell us about Big Data analytics in general?
The election in 2008 showed us that mobilising an army of opinion on social media could help you win the White House. The 2012 election has shown us that effectively mining this information is key to predicting voter behaviour. I don’t think it is a stretch to say that by 2016 Big Data analytics will be used extensively to micro-target political parties’ efforts and resources.
In the world of business, these methods can similarly help focus money. Why not search social media for the range of opinions about your products? Why not see which other products people are using them with? Maybe you’ll find a link between Gatorade (or Lucozade for us Brits) and flu. Should you target your flu medicine at the same market a drinks company will soon be targeting? Why not find opinion formers on social networks and talk to them one to one, in the same way you talk to your paid-for analysts now?
In the Public Sector, which is the area I cover for MarkLogic, there are other interesting ways you could use this technology. We’re not trying to make a profit, but we are trying to spend ever-decreasing funds in a more targeted fashion. Perhaps the Health Protection Agency wants to send alerts to local health protection teams when there is a statistically significant spike in Twitter mentions of a disease’s symptoms in their area. Maybe the Department for Transport wants to be alerted to mentions of potholes, debris on motorways, or local traffic problems.
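The alerting idea above can be sketched very simply: flag a day as a ‘statistically significant spike’ when the mention count sits well above the historical baseline. The counts and the three-sigma threshold here are invented for illustration:

```python
# A toy spike detector for daily symptom-mention counts (hypothetical
# numbers; a real system would model seasonality, location, and more).
from statistics import mean, stdev

def is_spike(history, today, threshold=3.0):
    """Return True if today's count is more than `threshold` standard
    deviations above the mean of the historical daily counts."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > threshold

# Illustrative daily counts of, say, 'flu' mentions geotagged to one area.
baseline = [12, 9, 14, 11, 10, 13, 12, 11]
print(is_spike(baseline, 48))  # a sudden jump -> True
print(is_spike(baseline, 13))  # within normal variation -> False
```

An alert fired by something like this wouldn’t prove an outbreak, but it would tell a local team where to look first.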
The London riots were also a big wake-up call for many in Government. Being able to analyse complex relationships between people across social networks when mass disorder and crime are being planned – quite in the open – is something even a local police force is now going to have to come to terms with. It was striking during last year’s riots that the forces quickest to arrest those planning disorder were the larger ones, which probably have a dedicated IT team. How would smaller regional forces cope if such an event happened on their patch?
Sentiment analysis is also worth keeping track of. Much has been made of the recent revelation that David Cameron has a custom iPad dashboard for tracking Government statistics. Naturally, it took The Register to provide a sensible analysis of what Government is actually likely to achieve with its current data. There is currently no single method for publishing all Government data in a common set of formats, and no single real-time statistical analysis package to correlate all this information and actually make use of it.
There certainly isn’t a system that can combine a list of the most likely candidates for hospital closures with local sentiment about hospitals and the health service in general. Perhaps one day soon, though, as more people take note of how useful these approaches are, such a system will become a reality.
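The sentiment half of such a system could, in a very simplified form, look like the sketch below. The word lists and messages are invented for illustration; real tools use far richer lexicons and statistical models:

```python
# A toy lexicon-based sentiment scorer (a sketch of the idea, not a
# production tool): count positive and negative words in each message
# and average the scores to track local feeling about, say, a hospital.
POSITIVE = {"good", "great", "excellent", "helpful", "caring"}
NEGATIVE = {"bad", "awful", "dirty", "rude", "closed"}

def sentiment(text):
    """Positive-word count minus negative-word count."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

messages = [
    "great caring staff at the hospital",
    "waiting room was dirty and staff rude",
    "helpful nurses, excellent service",
]
avg = sum(sentiment(m) for m in messages) / len(messages)
print(f"Average sentiment: {avg:+.2f}")
```

Tracked over time and by area, even a crude score like this could sit alongside closure candidate lists to show where a decision would be most contentious.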