At Red Oak Strategic, we use a number of machine learning, AI and predictive analytics libraries, but one of our favorites is h2o. Not only is it open-source, powerful and scalable, but it also has a great community of fellow h2o users who have helped us over the years, not to mention that the staff leadership at the company is very responsive...
Introduction: This post will demonstrate how to use machine learning to forecast time series data. The data set is from a recent Kaggle competition to predict retail sales. You will learn how to: build a machine learning model to forecast time series data (data cleansing, feature engineering and modeling); perform feature engineering to...
Introduction: In Part 1, we built an application to geographically explore the 500 Cities Project dataset from the CDC. In this post, we will demonstrate other exploratory data analysis (EDA) techniques for exploring a new dataset. The analysis will be done with the R packages data.table, ggplot2 and highcharter.
Exploratory data analysis (EDA) is generally the first step in any data science project, with the goal of summarizing the main features of the dataset. It helps the analyst gain a better understanding of the available data and can often unearth powerful insights. Data visualization is the most common technique in EDA. In this post, I...
Frequently, we encounter projects that require the combined use of Python, Microsoft Excel and some external databases that can only be accessed via Excel, or use cases that require the end product to be output to that format. Excel is still used as a key program for the vast majority of businesses and we are often challenged to create...
Political polling faces a crisis of confidence. Major news outlets repeatedly ask “What’s the matter with polling?” after major misses like Bernie Sanders’s primary upset in Michigan, where he beat Hillary Clinton 50–48 even though she was leading by up to 20 points in reputable polls. There is, however, hope for a polling...
Last Saturday, in a result that has now been widely publicized and discussed, Uber and Lyft lost their campaign for Proposition 1, which would have rolled back a number of regulations on their services. As a result, in one of America’s most forward-thinking tech centers, the services stopped operating almost immediately.
Lots of discussion and criticism came out of this past Tuesday’s election results, much of it focusing on the Kentucky Governor’s race and the lack of reliable polling measures, which led to a somewhat surprising election night result for many pundits. One particular story, written here by @ForecasterEnten, made very valid points about the...
Background Recognizing an opportunity to expand...
Time and again, across Red Oak Strategic’s...
The pace of our modern world, and the impressive...
While it might be tempting to liven up a report...
Interaction Design for Data Exploration...
- 2016 Election
- Apache Spark
- Business Intelligence
- Case Studies
- Data Processing
- Data Science
- Data Visualization
- Donald Trump
- Exploratory Data Science
- Financial Analytics
- Hillary Clinton
- Machine Learning
- Political Analytics
- Predictive Analytics
- Private Equity
- Python 3
- R Shiny
- Sparkling Water
- Time Series