The pace of our modern world, and the impressive volume of data we collect on a daily basis, can be dizzying. Take, for example, the hour-by-hour updates and colorful dashboards made by news outlets as they track the spread of the novel coronavirus (COVID-19). Organizations need quick and consistent solutions for exploring, analyzing, and acting on...
While it might be tempting to liven up a report or presentation with a few 3D graphs, two-dimensional representation is generally better when numbers are the primary information you want to communicate. Nevertheless, on occasions when numeric values aren’t the primary focus, and you’re more interested in showing the shape of the data, adding a...
Interaction Design for Data Exploration
Visualizations capable of launching detail views can add value to a data analyst’s user experience. Programming in this kind of interaction automates the creation of complementary charts and increases ease of exploration by linking varied views of the data in a logical way.
At Red Oak Strategic, we use a number of machine learning, AI and predictive analytics libraries, but one of our favorites is h2o. Not only is it open-source, powerful and scalable, but it also has a great community of fellow h2o users who have helped us over the years, not to mention that the staff leadership at the company is very responsive...
Introduction
This post will demonstrate how to use machine learning to forecast time series data. The data set is from a recent Kaggle competition to predict retail sales. You will learn how to:
- Build a machine learning model to forecast time series data (data cleansing, feature engineering and modeling)
- Perform feature engineering to...
Introduction
In Part 1, we built an application to geographically explore the 500 Cities Project dataset from the CDC. In this post, we will demonstrate other exploratory data analysis (EDA) techniques for exploring a new dataset. The analysis will be done with R packages data.table, ggplot2 and highcharter.
Exploratory data analysis (EDA) is generally the first step in any data science project; its goal is to summarize the main features of the dataset. It helps the analyst gain a better understanding of the available data and can often unearth powerful insights. Data visualization is the most common technique in EDA. In this post, I...
Frequently, we encounter projects that require the combined use of Python, Microsoft Excel and some external databases that can only be accessed via Excel, or use cases that require the end product to be output to that format. Excel remains a key program for the vast majority of businesses, and we are often challenged to create...
Our team recently designed a dashboard using R Shiny Leaflet that allows users to select many locations at once on an interactive map. We created the map using the package leaflet.extras, which enables users to draw shapes on R Shiny Leaflet maps. When combined with the package sp and a function called findLocations, the leaflet.extras drawing...
The apply function in R is used as a fast and simple alternative to loops. It allows users to apply a function to a vector or data frame by row, by column or to the entire data frame. Below are a few basic uses of this powerful function as well as one of its sister functions, lapply. There are other functions in the apply family (sapply,...
Background Recognizing an opportunity to expand...
Time and again, across Red Oak Strategic’s...
- 2016 Election
- Apache Spark
- Business Intelligence
- Case Studies
- Data Processing
- Data Science
- Data Visualization
- Donald Trump
- Exploratory Data Science
- Financial Analytics
- Hillary Clinton
- Machine Learning
- Political Analytics
- Predictive Analytics
- Private Equity
- Python 3
- R Shiny
- Sparkling Water
- Time Series