Analytic Partners’ President, Nancy Smith, explores the similarities between the US Presidential Election and Predictive Analytics.
As we all prepare for a big changing of the guard in US politics, it is worth considering how and why the pollsters got their election forecasts so spectacularly wrong, and how that relates to the data-driven marketing and predictive analytics industry as a whole.
Fingers have been pointed in many directions for the poor election forecasts: inadequate mobile polling, people lying to pollsters, sampling errors, systematic bias, people simply refusing to pick up their phones, and poor measurement metrics.
To me, it comes down to a turn of phrase coined during a previous Clinton campaign: It’s the data, stupid.
The election polling was a textbook case of letting bad data drive the conclusions without examining that data for bias. There is an art and a science to any modeling and forecasting, and most of the pollsters clearly got it wrong. Part of the art is dealing with poor data, with biased data, with incomplete data. Your science may be solid, but without the skillset, foresight, and willingness to question the data going in (the art), your algorithms (the science) are destined to produce inaccurate results.
We see much the same issue within the data-driven analytics space, where data – big and small – may be biased and lead to bad decisions and, even worse, bad forecasts, as our recent presidential election showed. For example, Multi-touch Attribution and Marketing Mix Modeling have recently been the source of much debate. Most of that debate focuses on capability, methodology, technology and practical application. Unfortunately, it misses a critical ingredient of any robust analysis – the data. This happens largely because we assume, without enough scrutiny, that data granularity and accuracy are sufficient.
Data robustness is often overlooked or taken for granted, producing poor results despite the most advanced scientific methodologies and the most sophisticated analytics professionals and data scientists applying them. Data robustness can be judged by comprehensiveness, depth, consistency and accuracy. Here are a few questions you should be asking in order to vet your data robustness and ensure your analysis will produce a strong and accurate result:
- Is my data comprehensive or am I missing key customer segments or channels?
- Is the data that I have access to sufficient to answer my key questions?
- Has my data been selected objectively? Is the sample random?
- Is my data based on projections, or estimates that may impact accuracy?
- Is my data consistent – either from the same source, or consistent over time?
- Has an independent party (or have fresh eyes) looked at my data and my objectives to make sure I am not missing something? (Please note, the cost of a poor analysis outweighs the costs of double checking your data!)
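Several of these checks can be automated before any modeling begins. As an illustrative sketch only – the field names, channels, and records below are hypothetical, not drawn from any particular dataset – a few of the questions above translate into simple programmatic tests for comprehensiveness, accuracy, and consistency:

```python
# Hypothetical weekly marketing records; field names are illustrative only.
records = [
    {"week": 1, "channel": "tv",     "spend": 120.0, "estimated": False},
    {"week": 1, "channel": "search", "spend": 80.0,  "estimated": False},
    {"week": 2, "channel": "tv",     "spend": None,  "estimated": False},
    {"week": 2, "channel": "search", "spend": 85.0,  "estimated": True},
]

# Channels the analysis is supposed to cover (an assumed business requirement).
EXPECTED_CHANNELS = {"tv", "search", "social"}

def vet_data(records, expected_channels):
    """Return a dict of data-robustness warnings mirroring the checklist."""
    issues = {}

    # Comprehensiveness: are any expected channels missing entirely?
    seen = {r["channel"] for r in records}
    missing = expected_channels - seen
    if missing:
        issues["missing_channels"] = sorted(missing)

    # Accuracy: how many rows are projections/estimates, or missing values?
    estimated = sum(1 for r in records if r["estimated"])
    null_spend = sum(1 for r in records if r["spend"] is None)
    if estimated:
        issues["estimated_rows"] = estimated
    if null_spend:
        issues["null_spend_rows"] = null_spend

    # Consistency: does every week report every channel we have seen at all?
    for week in {r["week"] for r in records}:
        week_channels = {r["channel"] for r in records if r["week"] == week}
        if seen - week_channels:
            issues.setdefault("inconsistent_weeks", []).append(week)

    return issues

print(vet_data(records, EXPECTED_CHANNELS))
# {'missing_channels': ['social'], 'estimated_rows': 1, 'null_spend_rows': 1}
```

A scan like this does not replace the independent review in the last question, but it makes the first pass cheap and repeatable every time the data is refreshed.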
If, after asking the above questions, you find data quality issues, you have an opportunity to clean the data or source new data. Review your analysis objectives and align your methodology to the data's robustness and the business actions you are trying to impact. If you find your data is not fully robust, or is biased, but is still worth analyzing, you can adjust for the risks associated with the forecast outcome. Most importantly, continually build and improve your data and sources so that your accuracy and forecasts improve.
President-elect Trump will be sworn in in a couple of weeks. Overwhelmingly, election pollsters predicted and expected an entirely different result. Ultimately, the pollsters fell victim to the vulnerability that vexes all analytics: It’s the data, stupid.