Red Oak Strategic
Mark Stephenson
Friday, 26 August 2016 / Published in Data, Politics, Analytics, Political Analytics, Data Science, Predictive Analytics, 2016 Election, Polling

Could Google and Xbox solve all of polling’s problems?

Political polling faces a crisis of confidence. Major news outlets repeatedly ask “What’s the matter with polling?” after high-profile misses like Bernie Sanders’s primary upset in Michigan, where he beat Hillary Clinton 50–48 even though reputable polls showed her leading by up to 20 points. There is, however, hope for a polling renaissance thanks to online questionnaires and advanced statistical methods, some of which we are employing during the 2016 cycle. These approaches can take snapshots of the electorate and turn them into a balanced and accurate election prediction. Web companies may soon use these methods to offer a plausible and accurate alternative to traditional phone polling.

So, what truly is the matter with current polling (if anything)?

In a nutshell, fewer and fewer people are answering their phones when pollsters call, and young voters answer even less often than their parents. Currently, most pollsters try to contact a group of people who look demographically similar to the population as a whole; with a few adjustments, this sample can offer an accurate picture of how the country or state intends to vote.

To gather this representative sample, many pollsters have relied on calling random landline numbers, under the assumption that if you dial enough numbers the people you call will not have any demographic biases, like being older or whiter or poorer than the true population. Others, like our firm, use list-based sampling methods to ensure quality.

However, fewer and fewer Americans have landlines, and under-35 voters are even less likely to use them, so this strategy now struggles to produce accurate snapshots of the general population. Calling cell phones can offer a somewhat more representative sample, but connection rates are low and the cost of manually calling large numbers of cell phones is much higher. As a result, ensuring accurate results in polling has become more and more expensive.

Internet surveys offer a promising possible solution to these problems.

New research from Professor Andrew Gelman and coauthors at Columbia University and Microsoft shows that cheap, non-representative internet polling with rich demographic information can produce results that are as accurate as the most expensive traditional polling. In their experiment, the authors conducted surveys via Xbox in the lead-up to the 2012 election. Obviously, people who answer surveys via Xbox do not look like the general population any more than people who still have landlines do. The distinct advantage is that these cheaper online surveys can collect many more responses than traditional phone polls. With that volume, the researchers were able to ask the requisite demographic questions and accurately translate the results to the wider population.

The ability to project these polls onto the wider electorate is the key advantage of online surveys. In the Xbox poll, the authors asked for the usual demographic information like race and gender, but also attitudinal questions like whether the respondent identified as liberal or conservative and who they voted for in the 2008 election. This allowed the researchers to chop up the Xbox sample into remarkably small groups based on these characteristics and use a statistical method called multilevel regression and poststratification (MRP) to figure out how each of these small groups felt about the 2012 election. Even though there were few older women in the sample, the authors still managed to estimate the preferences of this group to within one percent of national exit poll estimates. The authors then used 2008 exit polls to figure out how many people from each group actually resided in the general population, and combined these exit polls with the Xbox survey data to predict the overall election results.
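The two steps described above (estimate each small group, then weight the groups by their size in the population) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: the shrinkage formula is a crude stand-in for a fitted multilevel regression, and the group names, response counts, population counts, and the `prior_strength` parameter are all hypothetical.

```python
from collections import defaultdict


def shrunken_cell_estimates(responses, prior_strength=10.0):
    """Estimate support within each cell, pulled toward the overall share.

    responses: iterable of (cell, supports) pairs, where supports is 0 or 1.
    Cells with few respondents are pulled harder toward the overall share.
    This is a simple stand-in for multilevel regression, not the real thing.
    """
    counts = defaultdict(int)
    supporters = defaultdict(int)
    for cell, supports in responses:
        counts[cell] += 1
        supporters[cell] += supports
    overall = sum(supporters.values()) / sum(counts.values())
    return {
        cell: (supporters[cell] + prior_strength * overall)
        / (counts[cell] + prior_strength)
        for cell in counts
    }


def poststratify(cell_estimates, cell_population):
    """Average the cell estimates, weighted by each cell's population size."""
    total = sum(cell_population[cell] for cell in cell_estimates)
    weighted = sum(
        cell_estimates[cell] * cell_population[cell] for cell in cell_estimates
    )
    return weighted / total


# Hypothetical sample: many young men, few older women.
responses = (
    [("young_men", 1)] * 32 + [("young_men", 0)] * 48
    + [("older_women", 1)] * 14 + [("older_women", 0)] * 6
)
# Hypothetical population sizes for the same cells (arbitrary units).
population = {"young_men": 30, "older_women": 70}

raw_share = sum(supports for _, supports in responses) / len(responses)
estimates = shrunken_cell_estimates(responses)
mrp_style_share = poststratify(estimates, population)

print(f"raw sample share:        {raw_share:.3f}")
print(f"poststratified estimate: {mrp_style_share:.3f}")
```

With these made-up numbers the raw sample share is 0.46, while the poststratified estimate is about 0.556, because older women are a small part of the sample but a large part of the assumed population.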

This approach compared favorably to leading traditional polls and the actual election results. As the following figure shows, the Xbox survey roughly aligned with an average of leading presidential polls in the 45 days before the 2012 election, and in the final run-up to Election Day the Xbox survey was closer to the final result than traditional polling.

This approach produced accurate, low-cost results even though the Xbox poll was likely less representative of the general population than traditional polling. While the poll did capture more young voters than traditional methods, this was not the key to its success. Rather, the rich demographic data allowed the authors to thoroughly correct for the differences between their sample and the general population.
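To see why the richness of the demographic data matters, here is a small constructed example in Python. Every number in it is hypothetical and chosen only to make the arithmetic clean; it is not data from the Xbox study. The sample over-represents young and conservative respondents, and the function `poststratified_share` (an illustrative name) corrects the sample using groups of varying detail.

```python
def poststratified_share(sample, population, key):
    """Poststratify using the groups produced by `key`.

    sample:     {cell: (respondents, supporters)}
    population: {cell: population_count}
    key:        maps a fine cell to the (possibly coarser) correction group
    """
    resp, supp, pop = {}, {}, {}
    for cell, (n, yes) in sample.items():
        group = key(cell)
        resp[group] = resp.get(group, 0) + n
        supp[group] = supp.get(group, 0) + yes
    for cell, count in population.items():
        group = key(cell)
        pop[group] = pop.get(group, 0) + count
    total = sum(pop[group] for group in resp)
    return sum(supp[g] / resp[g] * pop[g] for g in resp) / total


# Hypothetical cells: (age group, ideology) -> (respondents, supporters)
sample = {
    ("young", "liberal"): (300, 180),
    ("young", "conservative"): (500, 50),
    ("old", "liberal"): (50, 45),
    ("old", "conservative"): (150, 45),
}
# Hypothetical population counts for the same cells.
population = {
    ("young", "liberal"): 150,
    ("young", "conservative"): 150,
    ("old", "liberal"): 350,
    ("old", "conservative"): 350,
}

raw = poststratified_share(sample, population, key=lambda cell: "all")
age_only = poststratified_share(sample, population, key=lambda cell: cell[0])
age_and_ideology = poststratified_share(sample, population, key=lambda cell: cell)

print(f"no correction:        {raw:.3f}")
print(f"age only:             {age_only:.3f}")
print(f"age and ideology:     {age_and_ideology:.3f}")
```

In this constructed case the uncorrected share is 0.32, correcting on age alone moves it to about 0.401, and correcting on both age and ideology gives 0.525, which matches the population-wide support implied by the per-cell rates. Finer cells only help when each cell still has enough respondents, which is where the large sample sizes of online surveys and the pooling in MRP come in.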

These results are more than academic abstractions: they offer a viable path forward for groups who wish to test an alternative to the rising cost of traditional polling methods. The chase for representative samples is getting harder and harder, since there are so few technologies that the whole country uses in equal amounts, but these sorts of demographically rich, non-representative samples offer a viable alternative. If there is sufficient enthusiasm and understanding from political campaigns and corporations, internet giants such as Google and its Google Consumer Surveys division could soon offer the type of cheap, Xbox-style polling data that statisticians can turn into accurate, representative results. If a web company manages to fully embrace the potential of this method, accurate polling may soon be a much cheaper and more accessible option for everyone from local campaigns to major national brands.

