Let’s build a short story to understand Laplace smoothing properly.
A sports coach (SC) has a team of 10 players, P0 to P9 (see Figure 1), to train for an upcoming tournament. SC wants each player to practice hard. SC is absolutely intolerant of cheating in practice sessions, so SC plugs the players’ scores into a certain equation in each game. The equation is given below:
∏_i ( score of player_i / total_score ) ≥ λ, where λ is some predefined threshold … Equation (1)
Q: Why is SC using this weird equation?
Let’s refer to the scenario given in Figure 2.
When Equation (1) is applied, the contribution from P3 is multiplied with the running product of P1 and P2, and the result drops drastically because P3 contributes only 1.32% of the total score. SC can therefore easily detect P3’s cheating in practice.
But what if P3 scores 0?
The whole running product goes to 0 and stays there, because anything multiplied by 0 is 0. If any player after P3 is also cheating, they won’t get caught, since their contribution can no longer change the product. That’s where Laplace smoothing comes into the picture. Laplace smoothing forces the other players to share a small portion of their score with P3, or with any player who scores 0, so that the product in Equation (1) doesn’t reduce to 0.
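The collapse described above is easy to reproduce. The following is a minimal sketch in Python; the scores are made-up illustration values (they are not the numbers from Figure 2), chosen so that P3 scores 0 and the total is 100.

```python
from itertools import accumulate
from operator import mul

# Hypothetical practice scores for P0..P9 (NOT the values from Figure 2).
scores = [12, 15, 9, 0, 11, 14, 10, 13, 8, 8]
total = sum(scores)

# Each player's share of the total score, as in Equation (1).
shares = [s / total for s in scores]

# Running product of the shares, player by player.
running = list(accumulate(shares, mul))

for i, (share, prod) in enumerate(zip(shares, running)):
    print(f"P{i}: share={share:.4f}  running product={prod:.3e}")
```

From P3 onward the running product is exactly 0, so whatever P4 to P9 score makes no difference to the final value.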
The formula used by Laplace smoothing is given in Figure 3, where:
V : Number of unique observations (here, the number of players)
c(i) : Outcome of the i-th observation (here, the score of player i)
N : Cumulative outcome of the V observations (here, the total score)
K : Tunable smoothing parameter
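To make the symbols concrete, here is a small Python sketch. It assumes Figure 3 shows the standard add-K form, (c(i) + K) / (N + K·V); the function name `laplace_smooth`, the choice K = 1, and the scores are all illustrative assumptions, not values from the figures.

```python
import math

def laplace_smooth(counts, k=1.0):
    """Add-K smoothing, assumed form: (c(i) + k) / (N + k * V)."""
    v = len(counts)   # V: number of unique observations (players)
    n = sum(counts)   # N: cumulative outcome (total score)
    return [(c + k) / (n + k * v) for c in counts]

# Hypothetical practice scores for P0..P9 (NOT the values from the figures).
scores = [12, 15, 9, 0, 11, 14, 10, 13, 8, 8]
raw = [s / sum(scores) for s in scores]
smoothed = laplace_smooth(scores, k=1.0)

for i, (r, s) in enumerate(zip(raw, smoothed)):
    print(f"P{i}: raw={r:.4f}  smoothed={s:.4f}")

# The product in Equation (1) is no longer forced to 0.
print("product of smoothed shares:", math.prod(smoothed))
```

With these numbers P3’s share becomes small but non-zero, players scoring above the average give up a little of their share, and the smoothed shares still sum to 1. A larger K pulls every share closer to the uniform value 1/V.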
Figure 4 represents the contribution scores after applying Laplace smoothing.
Note that the contribution score of P3 is no longer 0 after applying Laplace smoothing, while the contribution scores of a few other players have decreased by a small margin. Now SC can review the practice performances without a headache.
Conclusion: Laplace smoothing assigns a small non-zero value to unseen (zero-count) tokens by redistributing a little of the share held by the observed ones.
I hope you enjoyed reading. Feel free to share criticism and suggestions.