In statistics, a Poisson distribution is a probability distribution that is used to show how many times an event is likely to occur in a fixed interval of time or space. In other words, it is the distribution for the counts of events that occur randomly in a given interval of time. It was named after French mathematician Siméon Denis Poisson. In this post, I’m going to describe how to implement the Poisson Distribution in Power BI with DAX.
The Poisson distribution can be applied in many different domains such as finance, biology, healthcare, retails, etc…
Some real life examples of the Poisson distribtuion
- Check for adequate customer service staffing: Calculate whether customer service staffing is enough to handle all the calls without making customers wait on hold.
- Number of Arrivals at a Restaurant: Estimate the chances of having more than than 100 people visiting a particular restaurant
- Number of bicycles Sold per Week: If the number of bicycles sold by a bike shop in a week is already known, then the seller can easily predict the number of bicycles that he might be able to sell next week and thus better managuing his stock.
To illustrate the use of poison distribution in Power BI I will use the bike shop as an example.
Before jumping to the implementation of Poisson distribution in Power BI let’s have a look at the data.
Each week the seller reorder 30 bicycles to restock his inventory and he’s selling on average 32 bicycles each week.
Due to a shortage of stock, he is losing business opportunities every week but on the other hand, he does not want to overstock the shop owner wants to maximise his sales while still keeping the optimal stock.
In order to be more competitive, the seller only wants to be able to fulfil at least 95% of the sales each week.
At the moment he’s been able to reach 95% of fulfilment only 9 weeks out of 18 weeks.
In order to determine the most optimal stock to order on a weekly basis, we can use the Poisson distribution.
The Poisson distribution
Like in Excel there’s a built-in function for the Poisson distribution in Power BI POISSON.DIST
As we can see the function is expecting 3 parameters: the number of occurrences “x”, the “mean” and the logical value that determines the form of the probability distribution.
- True for the cumulative distribution function (CDF): probability that the number of random events occurring will be between zero and x inclusive
- False for the probability density function (PDF): probability that the number of events occurring will be exactly x
Poisson non cumulative distribution formula
- λ is the mean
- x is the number of occurrences (x=0,1,2,3…)
- e is Euler’s number (e=2.71828…)
- ! is the factorial function
(Remember that there’s already a built-in function for the Poisson Distribution in Power BI I put the Dax code for educational purposes only)
poisson non cumulative formula = var __euler=2.71828 var __lambda=calculate([avg Sales wk],all('Stock Poisson')) var __x=min(x[Value]) return divide((__lambda^__x)*(__euler^-__lambda),FACT(__x))
Poisson cumulative distribution formula
poisson cumulative formula = var __euler=2.71828 var __lambda=calculate([avg Sales wk],all('Stock Poisson')) --x[value]=0,1,2,3... (61 in my example) var __x=GENERATESERIES(min(x[Value]),MAX(x[Value]),1) return sumx(__x,divide((__lambda^[Value])*(__euler^-__lambda),FACT([Value])))
Visualize the Poisson distribution in Power BI
In order to plot the distribution, we can create a calculated table that contains the Poisson distribution CDF and PDF.
x = var __StartValue = 0 var __EndValue = ROUND([avg Sales wk]+8*[STD Sells wk],0) var __IncrementValue =1 var __value =GENERATESERIES(__StartValue,__EndValue,__IncrementValue) var __avgCalls=[avg Sales wk] return ADDCOLUMNS(__value,"PoissCum",POISSON.DIST([Value],__avgCalls,TRUE),"PoisNonCum",POISSON.DIST([Value],__avgCalls,FALSE))
Now let’s visualize the Poisson Non Cumulative distribution, as we can see the closer to the mean we are the highest the probability is. We can also observe the probability of selling less than 20 bicycles is very low as well as selling more than 40 bicycles.
Now if we take a look at the Poisson Cumulative distribution we can see that the probability of selling less than 40 bicycles is around 90% and the probability of selling less than 50 bicycles is around 99.9%.
If you remember the initial problem we want to find the optimal stock that ensures the shop fulfil 95% of the sales for each week. It looks like the stock should be somewhere between 40 and 50.
Calculate optimal stock using Poisson distribution in Power BI
Let’s first create a dynamic parameter that allows us to play with the % fulfilment to reach.
Stock Analysis = GENERATESERIES(0.90,1,0.01)
Inverse Distribution Function
Since we want to always be able to fulfil 95% of the sales each week we need to find the minimum stock to reorder every week to ensure that we reach 95%.
So in order to calculate the optimal stock, we need to use the inverse function distribution (IDF) of Poisson, at the time of writing this post there’s not any Poiss_Inverse built-in function available in DAX and neither in Excel but this is pretty simple to write our own function.
To calculate the inverse function of the Poisson distribution we simply need to find the smallest integer N such that POISSON.DIST(x,mean,cumulative=TRUE) >= Prob
min lambda = --Here we retrieve the Prob from the dynamic parameter created above var __prob=FIRSTNONBLANK('Stock Analysis'[% Fulfilment],1) return --Then we find the smallest integer N >= __prob (95% in our case) CALCULATE(min(x[Value]),round(x[PoissCum],4)>=__prob)
In other words, to calculate the optimal stock, we simply take the minimum value of the poison cumulative distribution that is greater or equal to the desired probability to reach which gives us 42.
Poisson probability in Power BI
To better understand the different probabilities that we can calculate with the Poisson distribution.
I wrote three other DAX measures as follows:
Exact probability = [Poisson Non Cumulative] Less or equal probability = [Poisson Cumulative] at least probability = var __minX=min(x[Value]) return 1-calculate([Poisson Cumulative],x[Value]=__minX)
- The exact probability is given by the poisson distribution
- E.g. Probability of selling exactly 42 bicycles = 1.54%
- The less or equal probability probability is given by the cumulative distribution
- E.g. Probability of selling 42 or less bicycles = 96.2%
- The at least probability is ovtained by substrating the cumulative distribution from 1
- E.g. probability of selling at least 42 bicycles = 3.8%
Visualize the Poisson distribution functions together
We already saw above the two distribution functions PDF and CDF now let’s plot the three functions PDF, CDF and IDF.
As we can see the Inverse function is simply the inverted cumulative distribution and they both cross each other at the peak of the non-cumulative distribution which is obviously the mean of the weekly sales.
The Power BI
In this post, we saw that using the built-in DAX function it is quite simple to implement the Poisson Distribution in Power BI.
This post was just another shallow introduction to statistical concepts such as the one I wrote about Ab Testing with Power BI. It is, of course, impossible to cover all the aspects of the Poisson distribution in just a post as there are entire books written on this subject with more advanced concepts which are beyond my knowledge.
However, I wanted to show that we can easily implement the Poisson distribution in Power BI and obtain the same result that we would with any other more advanced statistical software.
Of course, we can always take advantage of R or python script in Power BI to get this done but unless we want to achieve something that is not possible to do with DAX like clustering in power bi using R I’d highly recommend using DAX only as much as possible.