A Brilliantly Restored 19th Century Visualization of U.S. City Population Shifts

Larry Gormley of HistoryShots (a company that designs and restores data visualizations) first came across this old census visualization, originally published in the Statistical Atlas of the United States in 1898, in David Rumsey’s online map database. Compelled by its restrained use of shapes, colors, and lines, Gormley, who scours map and book fairs in his native New England, eventually tracked down a printed copy to restore.

… Its design manages to neatly display over 450 data points using only 10 colors to differentiate dozens of cities. Once viewers adjust their eyes to the right-to-left timeline, they can see just how much the U.S. had grown in its first full century.

Reference: A Brilliantly Restored 19th Century Visualization of U.S. City Population Shifts

The Basic Wine Guide

The Basic Wine Guide infographic from Wine Folly is full of helpful tips for anyone trying to understand wine etiquette. It covers which glass suits each wine, which foods to pair each wine with, how to taste, and more.

Reference: The Basic Wine Guide

All of the Health Hazards of Sitting Too Long

By now, you already know that prolonged sitting is bad for your body. But what exactly goes on when you sit for hours every day? This graphic from the Washington Post explains.

Reference: All of the Health Hazards of Sitting Too Long

The Elephant was a Trojan Horse: On the Death of Map-Reduce at Google

Map-Reduce is on its way out. We shouldn’t measure its importance by the number of bytes it crunches, but by the fundamental shift in data processing architectures it helped popularise.

Reference: The Elephant was a Trojan Horse: On the Death of Map-Reduce at Google

5 Optical Illusions That Prove You Can’t Trust Your Own Mind

You’ve been seeing optical illusions probably since kindergarten. They’re fun little party tricks that you look at on the Internet and go “Eh, that’s weird” before immediately forgetting about them. And that’s too bad, because these images are actually exposing glaring gaps in our brain’s fragile sense of reality.

Reference: 5 Optical Illusions That Prove You Can’t Trust Your Own Mind

An infographic comparing R, SAS and SPSS

The R online training site DataCamp has created an infographic comparing R, SAS and SPSS. Provocatively titled “Statistical Language Wars”, the infographic compares the history, purpose, ease of learning, popularity and marketability of skills in each of the three systems.

Reference: An infographic comparing R, SAS and SPSS

Real Time Fraud Detection with Sequence Mining

Real-time fraud detection is one of the use cases where multiple components of the Big Data ecosystem come into play in a significant way: Hadoop batch processing builds the predictive model, and Storm uses that model to predict fraud from the real-time transaction stream. Additionally, Redis is used as the glue between the different subsystems.

In this post the author will go through the end-to-end solution for real-time fraud detection, using credit card transactions as an example, although the same solution can be used for any kind of sequence-based outlier detection. The author will build a Markov chain model using the Hadoop-based implementation in his open source project avenir. The prediction algorithm is implemented in his open source project beymani.
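
To give a rough feel for the idea of sequence-based outlier detection with a Markov chain, here is a minimal single-machine sketch in Python. It is not the avenir or beymani implementation and involves no Hadoop, Storm, or Redis; the state labels ("S", "M", "L" for small, medium, and large transaction amounts), the toy history, the function names, and the scoring metric (average negative log-likelihood of the transitions in a window) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def train_transition_probs(sequences, states, smoothing=1.0):
    """Estimate first-order Markov transition probabilities.

    Laplace smoothing keeps unseen transitions from getting zero probability.
    """
    counts = {s: defaultdict(float) for s in states}
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][curr] += 1

    probs = {}
    for prev in states:
        total = sum(counts[prev].values()) + smoothing * len(states)
        probs[prev] = {
            curr: (counts[prev][curr] + smoothing) / total for curr in states
        }
    return probs


def anomaly_score(window, probs):
    """Average negative log-likelihood of the transitions in a window.

    Higher scores mean the sequence is less likely under the trained model.
    """
    pairs = list(zip(window, window[1:]))
    return sum(-math.log(probs[p][c]) for p, c in pairs) / len(pairs)


# Hypothetical states: S = small, M = medium, L = large transaction amount.
STATES = ["S", "M", "L"]

# Toy "normal" transaction histories (made up for this sketch).
history = [
    ["S", "S", "M", "S", "S", "S", "M", "S"],
    ["S", "M", "S", "S", "S", "M", "M", "S"],
    ["S", "S", "S", "L", "S", "S", "M", "S"],
]

probs = train_transition_probs(history, STATES)

# Score a typical window and a window with a run of large transactions.
normal_score = anomaly_score(["S", "S", "M", "S"], probs)
suspicious_score = anomaly_score(["S", "L", "L", "L"], probs)

print(f"normal window score:     {normal_score:.3f}")
print(f"suspicious window score: {suspicious_score:.3f}")
```

In a real-time setting, a window of the most recent transactions per customer would be scored as each new transaction arrives, and a window would be flagged when its score exceeds a threshold chosen from historical data. That threshold is left out here because it depends entirely on the data.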

Reference: Real Time Fraud Detection with Sequence Mining