New Jersey Bails Out

This is an interesting podcast about a social application of data analytics: fixing the bail system of New Jersey.

In New Jersey, defense attorneys, judges, and prosecutors got together to try to reform a system that treated poor defendants so differently from rich ones. In the end, they got rid of bail.

… The Risk Assessment Algorithm predicts, on a 1-to-6 scale, the probability of failing to show up in court and the probability of committing another crime before the next trial. It considers the following factors:

  • Does the person have a prior conviction for violence within the past x years?
  • Has the person failed to appear in court in prior cases?
  • The age of the person when the crime was committed, among other factors.
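To make the idea concrete, here is a toy point-based scoring sketch in Python. The point values, the cap on failures to appear, and the mapping onto the 1-to-6 scale are my own assumptions for illustration; this is not the actual New Jersey risk assessment.

    # Hypothetical point-based risk score in the spirit of the factors above.
    # Weights and the 1-to-6 mapping are invented for illustration only.

    def risk_score(prior_violent_conviction: bool,
                   prior_failures_to_appear: int,
                   age_at_offense: int) -> int:
        """Return a risk level on a 1-to-6 scale (higher = riskier)."""
        points = 0
        if prior_violent_conviction:
            points += 2
        points += min(prior_failures_to_appear, 2)   # cap this factor's contribution
        if age_at_offense < 23:                      # younger defendants score higher
            points += 1
        return min(1 + points, 6)                    # map raw points onto 1..6

    print(risk_score(prior_violent_conviction=True,
                     prior_failures_to_appear=1,
                     age_at_offense=21))             # -> 5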

… Data shows that a person who was under 23 when committing a crime is more likely to commit a new crime than an older person.

… Specific factors such as race and wealth are excluded in order to prevent bias while keeping the forecast reliable.

… Since the introduction of the algorithm, detention decisions are made with real information. The jail population has been reduced by almost 30%. The result is profound.

Reference: New Jersey Bails Out


All in with online, can J.C. Penney get up to digital speed?

I have had a few occasions to chat with the company's IT people over the past few years. They were reluctant to adapt to the online trend of the retail market. One year, they wanted to expand their online catalog business; the next year, they closed it and then moved the majority of their IT people overseas over the following years. This time, it appears that the new SVP, Mike Amend, hired from Home Depot, is ready to face the challenges of the online retail business.

This article highlights a number of positive actions the company is taking to transition itself from a traditional retail business to an online one.

  • Recognizing its market strength: Research from comScore tells Penney that its customers have household incomes of $60,000 to $90,000, and they tend to be hardworking, two-income families living in both rural and urban settings. They don’t have the discretionary income to commit to membership fees.
  • Last month, Penney added the ability to ship from all its stores, which immediately made about $1 billion of store inventory available to online customers and cut the distance between customer and delivery.
  • About 80 percent of a store’s existing inventory is eligible for free same-day pickup.
  • Last week, it offered free shipping to stores with no minimum purchase. Large items like refrigerators and trampolines are excluded.
  • JCPenney.com now stocks four times the assortment found in its largest store by partnering with other brands and manufacturers.
  • More than 50 percent of its online assortment is drop-shipped by suppliers and doesn’t go through Penney’s distribution. Categories added range from bathroom and kitchen hardware to sporting goods, pets and toys.
  • JCPenney.com now has one Web experience regardless of the screen: phone, tablet or desktop.
  • Its new mobile app and wallet include Penney’s new upgraded Rewards program. Customers can book salon appointments on it. The in-store mode has a price-check scanner.
  • Penney set out to “democratize access to the data,” so that not only the technical staff could understand it: dashboards and heat maps now allow the artful side of the business, the merchants, to measure such things as sales relative to in-stock levels or pricing against customer behavior (a toy example of such a metric follows this list).
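As a rough illustration of the kind of merchant-facing metric mentioned in the last bullet, here is a small pandas sketch. The column names and numbers are invented; this is not Penney's actual data or dashboard code.

    # Hypothetical sales-to-in-stock metric a merchant dashboard might surface.
    import pandas as pd

    inventory = pd.DataFrame({
        "category":       ["kitchen", "sporting goods", "toys"],
        "units_sold":     [1200, 450, 800],
        "units_in_stock": [3000, 2000, 1600],
    })

    # Simple sell-through style ratio, per category
    inventory["sales_to_in_stock"] = (
        inventory["units_sold"] / inventory["units_in_stock"]
    ).round(2)

    print(inventory.sort_values("sales_to_in_stock", ascending=False))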

Reference: All in with online, can J.C. Penney get up to digital speed?

Guns kill nearly 1,300 children in the U.S. each year and send thousands more to hospitals

Handguns and other firearms cause the deaths of more children in the United States each year than the flu or asthma, according to a comprehensive new report on gun violence and kids.

Each day in the United States, an average of 3.5 people under the age of 18 are shot to death and another 15.5 are treated in a hospital emergency department for a gunshot wound. Between 2012 and 2014, an average of 1,287 children and adolescents died each year as a result of gun violence, making firearms second only to motor vehicle crashes as a cause of injury-related deaths. Another 5,790 were treated for gunshot injuries in U.S. hospitals.

Here’s another way to look at it: In the United States, a gun is the cause of death in more than 1 in 10 deaths of people under the age of 18.

The number of child fatalities related to guns is far higher in the U.S. than in any other high-income country. Another study has reckoned that the U.S. accounts for 91% of all the firearms-related deaths of children under 14 in the world’s 23 richest countries.

The new analysis, published Monday in the journal Pediatrics, represents an unusually comprehensive look at the toll that guns take on children.

Reference: Guns kill nearly 1,300 children in the U.S. each year and send thousands more to hospitals

Former Microsoft CEO Launches New Tool For Finding Government Data

This Tax Day, former Microsoft CEO Steve Ballmer launched a new tool designed to make government spending and revenue more accessible to the average citizen.

The website — USAFacts.org — has been slow and buggy for users on Tuesday, apparently due to the level of traffic. It offers interactive graphics showing data on revenue, spending, demographics and program missions.

Reference: Former Microsoft CEO Launches New Tool For Finding Government Data

Donald Trump’s win was predicted by Allan Lichtman — the US election expert who has called every result since 1984

The political analyst concluded ‘Hillary doesn’t fit the bill’ partly because she lacked Barack Obama’s charisma. Allan Lichtman, who has correctly predicted the result of every presidential election since 1984, foresaw that Mr Trump would be the 45th US President.

Unlike many experts who fixated on Mr Trump’s controversial campaign when assessing the election outcome, Professor Lichtman focused largely on the incumbent party’s potential for another victory, based on 13 key assessments. The system entails “mathematically and specifically” measuring the performance of the party in office. It is a historically based prediction system, derived by looking at every American presidential election from 1860 to 1980.

One of his keys is whether or not the sitting president is running for re-election, and right away, [the Democrats] are down that key.

Another one of his keys is whether or not the candidate of the White House party is, like Obama was in 2008, charismatic. Hillary Clinton doesn’t fit the bill.
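For illustration, here is a minimal sketch of how a keys-style tally works. The six-false-keys threshold follows Lichtman's published system; the example count is only illustrative, not his full 2016 scoring.

    # Hedged sketch of a "13 keys" style prediction rule.
    def predict(num_false_keys: int) -> str:
        """Lichtman's published threshold: six or more false keys means the
        White House party is predicted to lose."""
        return "challenging party wins" if num_false_keys >= 6 else "White House party wins"

    # The article names two keys already counted against the Democrats in 2016
    # (no sitting president running, no charismatic candidate); Lichtman's full
    # 2016 tally reportedly reached the six-key threshold.
    print(predict(6))   # -> challenging party wins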

Check out the articles below for details of his calculation:

 

Winner and Loser of the 2016 US Presidential Election

Well, this morning, everyone should already know the winner and loser of the 2016 US presidential election. So, in the spirit of this blog, let’s look at the winner and loser in infographics and predictive analytics as a result of the 2016 US presidential election.

Winner: Google Election

Google made it easy to understand election status and results on one simple page, with tabs for overview, president, senate, house, etc. The bar between the two presidential candidates is so clear that we know who is winning and who is catching up, as well as how many electoral votes are needed to win.

google-dashboard

I especially like the semi-transparent red and blue used to indicate the leading candidate by state and the status of swing states in the lower part of the dashboard. In contrast to what Google is doing, other media outlets do not use semi-transparent colors for the remaining states, so it is less clear who would win the rest of the electoral votes and harder to see the trend.

nbc-dashboard

On the other hand, the bar between the candidates in the Google dashboard offsets the bias introduced by the area-based US map (which implies that larger areas carry more electoral votes).

The only item I would add to the Google election dashboard is to apply the semi-transparent colors to the bar between candidates as well. This would make the dashboard perfect.

In summary, the Google election dashboard does an excellent job for the US presidential election. It brings clarity to both the status and the trend of the election results in a very precise manner. It deserves to be the winner of BI dashboard design for this election.

Loser: Predictive Analytics

Predictive analytics did a very poor job in this presidential election. The predictive models consistently said Mrs. Clinton would win the election over Mr. Trump. As we all know this morning, the prediction was a total failure.

FiveThirtyEight (538) is a pretty popular predictive analytics site. The following image shows its forecast during the night of the election results. The forecast had been in favor of Mrs. Clinton until 10 PM, when the curve started to swing toward Mr. Trump, and the switch did not become evident until 11:30 PM (at the big gap where the red line rises above the blue line in the lower part of the chart), by which time some people could already tell the trend ahead of the forecast.

538-forecast

However, we probably should not blame predictive analytics for such a big failure. The strength of predictive analytics is to predict a “major trend”, not a single outcome, and most predictive analytics today relies solely on “data” and nothing else.

As I pointed out in my recent blog post on my site and on KDnuggets, if predictive analytics is based purely on data without an understanding of the underlying process, its forecast is subject to noise and bias in the data and can be very inaccurate. This became evident during this presidential election: because the data were biased toward Mrs. Clinton, the models predicted she would win, as the sketch below illustrates.
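Here is a minimal simulation of that point: a purely data-driven forecast inherits whatever systematic bias the input data carry. The support level, bias size, and poll noise are invented numbers for illustration only.

    # If the inputs carry a systematic bias, a data-only forecast inherits it.
    import random

    random.seed(0)

    true_support = 0.49          # assumed true two-party share for the candidate
    polling_bias = 0.03          # assumed systematic overstatement in the polls

    polls = [true_support + polling_bias + random.gauss(0, 0.01) for _ in range(50)]
    forecast = sum(polls) / len(polls)

    print(f"forecast share: {forecast:.3f}")   # ~0.52 -> the model calls a win
    print(f"actual share:   {true_support:.3f}")  # 0.49 -> the call was wrong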

In addition, since the presidential election result is a single outcome and involves many human factors that cannot be quantified analytically, predictive analytics may not be the right tool for this prediction at all! The totally failed prediction makes predictive analytics the loser of this election.

Predictive analytics still works well in a controlled context, but it may not be the right tool for election prediction unless we (1) quantify human factors and correlations accurately, (2) avoid depending solely on data, and (3) fully disclose the prediction errors.

 

How can Lean Six Sigma help Machine Learning?

Note that this article was submitted to and accepted by KDnuggets, the most popular blog site about machine learning and knowledge discovery.

I have been using Lean Six Sigma (LSS) to improve business processes for the past 10+ years and am very satisfied with its benefits. Recently, I have been working with a consulting firm and a software vendor to implement a machine learning (ML) model to predict the remaining useful life (RUL) of service parts. The result I find most frustrating is the low accuracy of the resulting model. As shown below, if we measure the deviation as the absolute difference between the actual part life and the predicted one, the resulting model has average deviations of 127, 60, and 36 days for the three selected parts. I could not understand why the deviations are so large with machine learning.

lss_ml_1
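For clarity, this is the deviation metric I am describing, written as a small Python sketch; the numbers are made up rather than taken from the actual project data.

    # Average absolute deviation between actual and predicted RUL, in days.
    import numpy as np

    actual_days    = np.array([400, 620, 310, 150, 520])
    predicted_days = np.array([290, 480, 400, 260, 650])

    mean_abs_deviation = np.mean(np.abs(actual_days - predicted_days))
    print(f"average deviation: {mean_abs_deviation:.0f} days")   # -> 116 days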

After working with the consultants and data scientists, it appears that they can improve the deviation by only 10%. This puzzles me a lot. I thought machine learning was a great new tool for making forecasts simple and quick, but I did not expect it to have such a large deviation. To me, such a deviation, even after the 10% improvement, still renders the forecast useless to the business owners. This forces me to ask myself the following questions:

  • Is machine learning really a good forecasting tool?
  • What do people NOT know about machine learning?
  • What is missing in machine learning? Can Lean Six Sigma fill the gap?

Note that machine learning, in general, targets two major categories of problems: unsupervised and supervised learning. This article focuses on a supervised learning problem using regression analysis.

Lean Six Sigma

The objective of Lean Six Sigma (LSS) is to improve process performance by reducing its variance. Here the variance is measured as the sum of squared differences between the actual values and the forecast of the LSS model, the same squared-deviation idea used in classical statistics.
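A minimal sketch of that squared-difference measure, with illustrative numbers of my own:

    # Spread of a process measured as squared deviations from a forecast.
    import numpy as np

    actual   = np.array([10.2, 9.8, 11.1, 10.5, 9.4])
    forecast = np.array([10.0, 10.0, 10.0, 10.0, 10.0])

    residuals = actual - forecast
    sum_sq  = np.sum(residuals ** 2)    # sum of squared differences (1.90)
    mean_sq = np.mean(residuals ** 2)   # average squared deviation (0.38)
    print(sum_sq, mean_sq)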

The result of LSS is essentially a statistical function (model) between a set of input/independent variables and the output/dependent variable(s), as shown in the chart below.

lss_ml_2

By identifying the correlations between the input and output variables, the LSS model tells us how to control the input variables in order to move the output variable(s) to our target values. Most importantly, LSS also requires the monitored process to be “stable”, i.e., to minimize the variance of the output variable by minimizing the variance of the input variables, in order to achieve the so-called “breakthrough” state.

lss_ml_3
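As a rough sketch of such an input-to-output model, here is a small regression example. The factor names, coefficients, and data are invented for illustration and are not from a real LSS project.

    # Fit a simple linear relationship between input factors and the output,
    # then read the coefficients to see which inputs to control.
    import numpy as np

    rng = np.random.default_rng(0)
    temperature = rng.normal(200, 5, 100)     # input factor 1
    pressure    = rng.normal(30, 2, 100)      # input factor 2
    yield_pct   = 0.4 * temperature - 1.5 * pressure + rng.normal(0, 1, 100)

    X = np.column_stack([temperature, pressure, np.ones(100)])
    coef, *_ = np.linalg.lstsq(X, yield_pct, rcond=None)
    print(coef)   # ~[0.4, -1.5, intercept]: temperature raises yield, pressure lowers it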

As the chart below shows, if you reach your target (center) without variance control (the spread around the target in the left chart), there is no guarantee you will stay on target; if you reduce the variance without reaching the target (right chart), you miss the target. Only by keeping the variance small and centered on the target can LSS ensure that the process target is reached precisely and with sustainable, optimal process performance. This is the major contribution of LSS.

lss_ml_4

Machine Learning (ML)

Supervised machine learning looks for a function between a set of input variables and the output variable(s), coming up with an “approximation” of the ideal function, as shown by the green curve below.

lss_ml_5
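A minimal sketch of supervised learning as function approximation, assuming scikit-learn and a toy noisy sine function of my own choosing:

    # Fit a model to (x, y) samples and use it as an approximation of the
    # unknown function that generated them.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 10, size=(200, 1))
    y = np.sin(X[:, 0]) + rng.normal(0, 0.2, size=200)   # noisy samples of sin(x)

    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    print(model.predict([[1.5], [4.5]]))   # close to sin(1.5)≈1.0 and sin(4.5)≈-0.98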

Similarly, unsupervised machine learning looks for a function that best differentiates a set of clusters.

lss_ml_6
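And a companion sketch for the unsupervised case, again with invented data and KMeans as an illustrative choice of algorithm:

    # Separate two synthetic clusters without labels.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(2)
    cluster_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))
    cluster_b = rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))
    points = np.vstack([cluster_a, cluster_b])

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
    print(labels[:5], labels[-5:])   # the two groups receive different labels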

Comparison between LSS and ML

It is well known that, due to bias and ordinary randomness, a process is random in nature; i.e., it has variance. Both classical statistics and LSS have shown that, if the input variables have large variance, we should expect large variance in the output variable(s).

lss_ml_7

This strongly suggests that a machine learning model will be inaccurate when the input variables have large variance. This is why, I think, my recent machine learning project has such large inaccuracy in its predictions, and also why the data science consultants can improve the accuracy by only 10%.
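A small simulation of this variance propagation, with an assumed linear process and made-up noise levels:

    # For the same underlying relationship, noisier inputs give a noisier output.
    import numpy as np

    rng = np.random.default_rng(3)

    def output(x):
        return 2.0 * x + 5.0          # assume a fixed true process

    x_low  = rng.normal(10, 0.5, 10_000)   # low-variance input
    x_high = rng.normal(10, 3.0, 10_000)   # high-variance input

    print(np.var(output(x_low)))    # ≈ 4 * 0.25 = 1
    print(np.var(output(x_high)))   # ≈ 4 * 9    = 36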

People may argue that machine learning does have a step called data cleansing to improve the quality of the prediction. The problem is that the data cleansing of ML is not the same as the variance reduction of LSS. In LSS, people go back to examine the business process to find the sources of variance in the input variables, in order to eliminate bias or reduce the variance of those input factors. In ML, people do not revisit the business process; they only try to correct data errors or eliminate data that do not make sense. As a result, such a data cleansing approach does not actually reduce variance; it may not change the input variance at all. Therefore, the ML model cannot be expected to work well if people do not understand the role of variance.

As an example, if the left chart below represents the data points after data cleansing, we would get the red curve as the optimal ML model. But if the right chart below represents the data points after variance reduction, the resulting ML model would be much more accurate (see the sketch after the chart).

lss_ml_8
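The following sketch contrasts the two ideas with invented numbers: cleansing removes a few impossible records but leaves the input spread essentially unchanged, while variance reduction (improving the process itself) actually shrinks it.

    # Data cleansing vs. variance reduction on an input variable.
    import numpy as np

    rng = np.random.default_rng(4)

    raw = rng.normal(100, 15, 1000)          # noisy process measurements
    raw[::100] = -999                        # a few impossible sensor readings

    cleansed  = raw[raw > 0]                 # "data cleansing": drop bad records
    tightened = rng.normal(100, 5, 1000)     # "variance reduction": improved process

    print(np.std(cleansed))    # still ≈ 15: cleansing did not reduce variance
    print(np.std(tightened))   # ≈ 5: the input spread itself is smaller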

In summary, I think the current data cleansing step of ML needs to include the variance reduction technique of LSS in order to produce an accurate, reliable, and effective model for either supervised or unsupervised learning. People need to spend the effort to review the underlying business process and reduce input variance to make ML work better for real-world problems.

Software vendors and data science consulting firms should embrace the variance reduction technique in the data cleansing phase of ML to deliver the real value of ML.