An automated machine learning platform called Auto Tune Models (ATM) from MIT and Michigan State University uses cloud-based, on-demand computing to speed data analysis. -MIT and Michigan State University, 2017
ATM was able to deliver a solution better than the one humans had come up with 30% of the time, and could do this 100x faster. -MIT and Michigan State University, 2017
Well, this morning, everyone should already know the winner and loser of the 2016 US presidential election. So, in the spirit of this blog, let’s look at the winner and loser in infographics and predictive analytics as a result of the 2016 US presidential election.
Winner: Google Election
Google made it easy to understand the election status and results in one simple page with tabs for overview, president, senate, house, etc. The bar between the two presidential candidates is so clear that we know who is winning and who is catching up, as well as how many electoral votes are needed to win.
I especially like the semi-transparent red and blue that indicate which candidate leads in each state, and the status of the swing states in the lower part of the dashboard. In contrast, other media outlets do not use semi-transparent colors for the remaining states, so it is less clear who would win the rest of the electoral votes and harder to see the trend.
In addition, the bar between the candidates in the Google dashboard offsets the bias introduced by the US map, which implies that states with large areas have more electoral votes.
The only item I would add to the Google election dashboard is to apply the semi-transparent colors to the bar between candidates as well. This would make the dashboard perfect.
In summary, the Google election dashboard does an excellent job for the US presidential election. It brings clarity to both the status and the trend of the election results in a very precise manner. It deserves to be the winner of BI dashboard design for this election.
Loser: Predictive Analytics
Predictive analytics did a very poor job in this presidential election. All predictive models consistently said Mrs. Clinton would win the election over Mr. Trump. As we all know this morning, the prediction was a total failure.
538 is a pretty popular site about predictive analytics. The following image shows its prediction during the night of the election results. The forecast had been in favor of Mrs. Clinton until 10 PM, when the curve started to shift in favor of Mr. Trump. The shift did not become evident until 11:30 PM (at the big gap where the red line sits above the blue line in the lower part of the chart), by which time some people could already see the trend ahead of the forecast.
However, we probably should not blame predictive analytics for such a big failure. The strength of predictive analytics is predicting a “major trend”, not a single outcome, and most of our predictive analytics today relies solely on “data” and nothing else.
As I pointed out in my recent blog post on my site and on the KDnuggets site, if predictive analytics is purely based on data without understanding the underlying process, its forecast is subject to noise and bias in the data and could be very inaccurate. This became evident during this presidential election. Because all the data were biased towards Mrs. Clinton, the models predicted that Mrs. Clinton would win.
In addition, since the presidential election result is a single outcome and involves a lot of human factors which cannot be quantified analytically, predictive analytics may not be the right tool for the prediction at all! The totally failed prediction makes predictive analytics the loser of this election.
Predictive analytics still works well in a controlled context, but may not be the right tool for election prediction unless we (1) quantify human factors and correlations accurately, (2) do not depend solely on data, and (3) fully disclose the prediction errors.
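The point about biased data can be sketched with a small simulation. This is only a toy illustration, not a model of any real poll: the vote share, the sample size, and the `response_bias` parameter are made-up assumptions, and `simulate_poll` is a hypothetical helper written for this example.

```python
import random


def simulate_poll(true_share, sample_size, response_bias, rng):
    """Return the share of poll respondents who support candidate A.

    true_share:    actual fraction of voters supporting A (hypothetical).
    response_bias: probability that a B supporter declines to answer
                   (hypothetical); A supporters always answer.
    """
    a_count = 0
    answered = 0
    while answered < sample_size:
        supports_a = rng.random() < true_share
        # B supporters drop out of the sample with probability response_bias.
        if supports_a or rng.random() >= response_bias:
            answered += 1
            a_count += supports_a
    return a_count / sample_size


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_SHARE = 0.48       # assumption: candidate A actually loses

unbiased_poll = simulate_poll(TRUE_SHARE, 20000, 0.0, rng)
biased_poll = simulate_poll(TRUE_SHARE, 20000, 0.15, rng)

print(f"true share of A: {TRUE_SHARE:.3f}")
print(f"unbiased poll:   {unbiased_poll:.3f}")
print(f"biased poll:     {biased_poll:.3f}")
```

Under these assumptions the biased poll puts candidate A above 50 percent even though A loses. A bigger sample would not help, because the error comes from how the data was collected rather than from how much of it there is.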
This is a very good data insight.
… The Obama administration for years has been pleading with states to expand their Medicaid programs and offer health coverage to low-income people. Now it has a further argument in its favor: Expansion of Medicaid could lower insurance prices for everyone else.
… By comparing counties across state borders, and adjusting for several differences between them, the researchers calculated that expanding Medicaid meant marketplace premiums that were 7 percent lower.
Reference: Expanding Medicaid may lower all premiums
The ability to use data to achieve enterprise goals requires advanced skills that many organizations don’t yet have. But they are looking to add them – and fast. The question is, what type of big data expert is needed? Does an organization need a data scientist or does it need a business analyst? Maybe it even needs both. These two titles are often used interchangeably, and confusion abounds.
Business analysts typically have educational backgrounds in business and humanities. They find and extract valuable information from a variety of sources to evaluate past, present, and future business performance – and then determine which analytical models and approaches will help explain solutions to the end users who need them.
With educational backgrounds in computer science, mathematics, and technology, data scientists are digital builders. They use statistical programming to actually construct the framework for gathering and using the data by creating and implementing algorithms to do it. Such algorithms help businesses with decision making, data management, and the creation of data visualizations to help explain the data that they gather.
This is by far the best article I have read about Big Data. It says what I have been advocating to people.
1. They talk about “bigness” and “data,” rather than “new questions”
… It seems most of the tech industry is completely drunk on “Big Data.”
… most companies are spending vast amounts of money on more hardware and software yet they are getting little, if any, positive business value.
… “Big Data” is a terrible name for the revolution going on all around us. It’s not about Bigness, and it’s not about the Data. Rather, it’s about “new questions,” being facilitated by ubiquitous access to massive amounts of data.
… If all you’re doing is asking the same old questions of bigger amounts of the same old data, you’re not doing “Big Data,” you’re doing “Big Business Intelligence,” which is itself becoming an oxymoron.
2. They talk about technology, rather than business
… You may end up with the world’s largest server cluster, but other than bragging rights, who cares? START with a business issue, figure out how to better-characterize that issue with data, THEN start working on a technical solution.
3. They focus on insights, rather than actions
Most of the organizations that I work with are so focused upon analytics as an end-result they completely miss the whole point of this Big Data exercise: better actions. … If, after all of this effort, you haven’t changed how your organization acts, what your product or service does for your customers, or how you subsequently respond to the world around you, you’ve failed, utterly.
… Insight is great, but action is what brings home the bacon. If your “Big Data Expert” is focused on gaining insight rather than generating new business outcomes, you’re running a science experiment.
4. They talk about conclusions, rather than correlations
… Many of this new wave of Big Data experts don’t understand the nuance between correlation and causation. … Correlation means that there is the appearance of a relationship between things. Such relationships may indicate that certain inputs MAY lead to certain outputs. But, with correlation, there is no certainty.
… This is sort of a bummer to business people, who like to work with absolutes, or at least the appearance of absolutes. Well, there’s no such thing in data analytics. Your data may represent a vast collection of facts, but analytics and statistics are theater. What you see isn’t always what you get. Indeed, many “data scientists” are more “data manipulators,” generating politically acceptable outputs that support a given agenda.
… Correlation does not guarantee causation. Any Big Data expert who tells you they found causation should be immediately suspect until proven otherwise.
5. They talk about data quality, rather than data validity
… While data quality matters, it’s far more important to focus on data validity: Do I even have the right data to answer the questions I’m asking? … New analyses require VALID data, but determining whether or not data is clean before asking questions of it makes no sense whatsoever.
6. They sound like everyone else who is talking Big Data
… We are being drowned in all of the noise surrounding Big Data. … If your “Big Data Experts” don’t get this, then they’re not getting it. And neither are you.
Reference: Six signs that your Big Data expert, isn’t
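Point 4 above, on correlation versus causation, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not something taken from the referenced article: the variables `z`, `x_observed`, `x_forced`, and `y` are invented for the example, and a hidden common cause `z` is assumed to drive both `x` and `y`.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Hidden common cause z drives both x and y; x never influences y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_observed = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# "Intervention": set x ourselves, independently of z. y is left unchanged
# because x has no causal effect on it in this toy setup.
x_forced = [rng.gauss(0, 1) for _ in range(n)]

observed_corr = pearson(x_observed, y)
forced_corr = pearson(x_forced, y)

print(f"correlation in observed data:   {observed_corr:.2f}")
print(f"correlation after forcing x:    {forced_corr:.2f}")
```

In the observed data `x` and `y` are clearly correlated, yet the correlation disappears once `x` is set independently of the hidden cause. Acting on `x` in the hope of changing `y` would accomplish nothing here.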
It’s interesting that some people have come up with new ideas for creating infographics in Excel. I wonder how data navigation is going to be supported!?
Reference: Info Graphics with Excel
For people who like to collect cheat sheets, here is another collection of them.