The pace of our modern world, and the impressive volume of data we collect on a daily basis, can be dizzying. Take, for example, the hour-by-hour updates and colorful dashboards made by news outlets as they track the spread of the novel coronavirus (COVID-19). Organizations need quick and consistent solutions for exploring, analyzing, and acting on...
While it might be tempting to liven up a report or presentation with a few 3D graphs, two-dimensional representation is generally better when numbers are the primary information you want to communicate. Nevertheless, on occasions when numeric values aren’t the primary focus, and you’re more interested in showing the shape of the data, adding a...
Interaction Design for Data Exploration
Visualizations capable of launching detail views can add value to a data analyst’s user experience. Programming in this kind of interaction automates the creation of complementary charts and increases ease of exploration by linking varied views of the data in a logical way.
At Red Oak Strategic, we use a number of machine learning, AI, and predictive analytics libraries, but one of our favorites is h2o. Not only is it open source, powerful, and scalable, but it also has a great community of fellow h2o users who have helped us over the years, and the staff leadership at the company is very responsive...
Introduction
This post will demonstrate how to use machine learning to forecast time series data. The data set is from a recent Kaggle competition to predict retail sales. You will learn how to:
- Build a machine learning model to forecast time series data (data cleansing, feature engineering and modeling)
- Perform feature engineering to...
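As a rough illustration of what feature engineering for a time series can look like, the sketch below builds lag features, a common technique that is not necessarily the one used in the post. The function name `make_lag_features`, the sample `daily_sales` values, and the chosen lags are all hypothetical.

```python
def make_lag_features(series, lags):
    """Turn a list of observations into (features, target) rows.

    Each row pairs the values observed `lag` steps earlier with the
    value at the current step, so a regular regression model can be
    trained on it. Hypothetical helper, for illustration only.
    """
    max_lag = max(lags)
    rows = []
    # Start at max_lag so every requested lag has a value to look back to.
    for t in range(max_lag, len(series)):
        features = [series[t - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


# Made-up daily sales figures (assumption, not data from the competition).
daily_sales = [10, 12, 13, 15, 14, 18, 20]
rows = make_lag_features(daily_sales, lags=[1, 2, 3])

for features, target in rows:
    print(features, "->", target)
```

Each row can then be fed to any supervised learner. The first `max(lags)` observations are dropped because they have no complete history.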
Imagine someone were to mint a new coin and give it to you. You then buy something and give it to someone else, and so on. Every time this coin changes hands, a record of the transaction is engraved on the coin. Every transaction in the history of that coin’s existence is in plain sight on the face of the coin—and the longer the coin...
Introduction
In Part 1, we built an application to geographically explore the 500 Cities Project dataset from the CDC. In this post, we will demonstrate other exploratory data analysis (EDA) techniques for exploring a new dataset. The analysis will be done with R packages data.table, ggplot2 and highcharter.
Applying Regular Expressions
This is a tutorial on processing data with regular expressions using Python. It is also a reflection on the advantages and trade-offs that come into play when you use regular expressions. Once you have identified and defined a set of patterns, you can strategically search and extract data from raw text according...
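To make the idea of searching raw text against a defined pattern concrete, here is a minimal sketch using Python's standard `re` module. The sample text, the field layout, and the names `ORDER_PATTERN` and `extract_orders` are assumptions made up for this illustration and do not come from the tutorial.

```python
import re

# Hypothetical raw text; the line format is an assumption for illustration.
RAW_TEXT = """\
2020-03-14 order=1001 total=$25.50
2020-03-15 order=1002 total=$7.00
malformed line without fields
"""

# Named groups make each extracted field self-describing.
ORDER_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"order=(?P<order_id>\d+)\s+"
    r"total=\$(?P<total>\d+\.\d{2})"
)


def extract_orders(text):
    """Return one dict per match of ORDER_PATTERN; non-matching text is skipped."""
    return [match.groupdict() for match in ORDER_PATTERN.finditer(text)]


orders = extract_orders(RAW_TEXT)
for order in orders:
    print(order["date"], order["order_id"], order["total"])
```

Compiling the pattern once and using named groups keeps the extraction readable. The trade-off is that any line deviating from the expected format is silently skipped, so a real pipeline would likely want to count or log the misses.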
Exploratory data analysis (EDA) is generally the first step in any data science project; its goal is to summarize the main features of the dataset. It helps the analyst gain a better understanding of the available data and can often unearth powerful insights. Data visualization is the most common technique in EDA. In this post, I...
Frequently, we encounter projects that require the combined use of Python, Microsoft Excel, and external databases that can only be accessed via Excel, or use cases that require the end product to be delivered in that format. Excel remains a key program for the vast majority of businesses, and we are often challenged to create...
Background Recognizing an opportunity to expand...
Time and again, across Red Oak Strategic’s...
- 2016 Election
- Apache Spark
- Business Intelligence
- Case Studies
- Data Processing
- Data Science
- Data Visualization
- Donald Trump
- Exploratory Data Science
- Financial Analytics
- Hillary Clinton
- Machine Learning
- Political Analytics
- Predictive Analytics
- Private Equity
- Python 3
- R Shiny
- Sparkling Water
- Time Series