Larry Gormley of HistoryShots (a company that designs and restores data visualizations) first came across this old census visualization, originally published in the Statistical Atlas of the United States in 1898, in David Rumsey’s online map database. Compelled by its restrained use of shapes, colors, and lines, Gormley, who scours map and book fairs in his native New England, eventually tracked down a printed copy to restore.
… Its design manages to neatly display over 450 data points using only 10 colors to differentiate dozens of cities. Once viewers adjust to the right-to-left timeline, they can see just how much the U.S. had grown in its first full century.
The Basic Wine Guide infographic from Wine Folly is full of helpful tips for anyone trying to understand wine etiquette. It covers which glass each wine should be served in, which foods each wine should be paired with, tasting tips, and more.
Reference: The Basic Wine Guide
Map-Reduce is on its way out. Yet its importance should be measured not by the number of bytes it crunches but by the fundamental shift in data processing architectures it helped popularise.
You’ve been seeing optical illusions probably since kindergarten. They’re fun little party tricks that you look at on the Internet and go “Eh, that’s weird” before immediately forgetting about them. And that’s too bad, because these images are actually exposing glaring gaps in our brain’s fragile sense of reality.
The R online training site DataCamp has created an infographic comparing R, SAS and SPSS. Provocatively titled “Statistical Language Wars”, the infographic compares the history, purpose, ease of learning, popularity and marketability of skills in each of the three systems.
Reference: An infographic comparing R, SAS and SPSS
Real-time fraud detection is a use case in which multiple components of the Big Data ecosystem come into play in a significant way: Hadoop batch processing builds the predictive model, and Storm uses that model to predict fraud from the real-time transaction stream. Additionally, Redis is used as the glue between the different subsystems.
In this post, the author goes through the end-to-end solution for real-time fraud detection, using credit card transactions as an example, although the same solution can be used for any kind of sequence-based outlier detection. The author builds a Markov chain model using the Hadoop-based implementation in his open source project avenir. The prediction algorithm is implemented in his open source project beymani.
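To make the idea of sequence-based outlier detection with a Markov chain more concrete, here is a minimal in-memory sketch in plain Python. It is not the avenir or beymani code and does not use Hadoop, Storm, or Redis. The state labels (`low`, `mid`, `high`), the function names, the toy history, and the choice of average negative log-likelihood as the score are all assumptions made for illustration.

```python
import math

def train_transition_probs(sequences, states, smoothing=1.0):
    """Estimate first-order Markov transition probabilities.

    Laplace smoothing (an assumption of this sketch) keeps unseen
    transitions from getting zero probability.
    """
    counts = {s: {t: smoothing for t in states} for s in states}
    for seq in sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][curr] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: counts[s][t] / total for t in states}
    return probs

def outlier_score(seq, probs):
    """Average negative log-likelihood per transition; higher means more unusual."""
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    return -sum(math.log(probs[a][b]) for a, b in pairs) / len(pairs)

# Hypothetical states: each transaction is bucketed by amount.
STATES = ["low", "mid", "high"]

# Toy "normal" history for one card holder (made-up data).
history = [
    ["low", "low", "mid", "low", "low"],
    ["low", "mid", "low", "low", "mid"],
    ["mid", "low", "low", "low", "low"],
]

model = train_transition_probs(history, STATES)
normal_score = outlier_score(["low", "low", "mid", "low"], model)
suspect_score = outlier_score(["high", "high", "high", "high"], model)
print(f"normal={normal_score:.3f} suspect={suspect_score:.3f}")
```

In this toy setup, a run of `high` transactions scores higher than a sequence resembling the history, so a threshold on the score could flag it. In a pipeline like the one described, the training step would correspond to the batch job and the scoring step to the streaming side; the threshold would need to be tuned on real data.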