Category: Statistics

Arithmetic Mean vs. Geometric Mean in Power BI

In this blog post, we explore the difference between the arithmetic mean and the geometric mean through practical examples in Power BI, using the built-in DAX function GEOMEAN.

Arithmetic Mean

The arithmetic mean, also known simply as the “mean” or “average,” is the most common measure of central tendency. It is calculated by summing all the values in a data set and then dividing by the number of values. In DAX, you can use the AVERAGE function to…
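To make the two definitions concrete, here is a minimal sketch in Python rather than DAX. It is not the code from the post: the function names and the sample growth factors are hypothetical, chosen only to show how the two means differ on multiplicative data.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed via logs for numerical stability.

    Only defined here for strictly positive values.
    """
    if not values:
        raise ValueError("values must not be empty")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Hypothetical yearly growth factors: +10%, -20%, +25%.
    growth_factors = [1.10, 0.80, 1.25]
    print("arithmetic mean:", arithmetic_mean(growth_factors))
    print("geometric mean: ", geometric_mean(growth_factors))
```

For growth factors like these, the geometric mean is the constant yearly factor that would compound to the same overall growth, which is why it is always less than or equal to the arithmetic mean.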

Multiple Linear Regression in Power BI

In this post, I describe how to implement multiple linear regression in Power BI using DAX only. The February 2023 release of Power BI introduced a new function called LINEST, so we will see how to use it to make predictions and how to interpret its results.

Linear Regression

Linear regression is a type of statistical analysis used to find the relationship between two variables. It is used to determine how one variable (dependent variable) is related to another variable (independent…

Optimized median measure in DAX

In this post, I describe how to write an optimized median measure in DAX which, under specific criteria, can be 1,000 times faster than the built-in median function.

Some time ago I was tasked with migrating a multidimensional cube to a tabular model. The cube had around 2 billion rows, and I also had to create a median measure on the new tabular model. I first thought that the median would be much faster to calculate on a tabular model than on…

Paired T-test in Power BI using DAX

In this post, I describe how to implement a paired t-test in Power BI using DAX only.

What is a T-Test

A t-test is a type of inferential statistic used to determine whether the means of two groups of data are significantly different from each other. In other words, it tells us whether the difference in means could have happened by chance. There are three types of t-test:

An independent samples t-test compares the means for two groups.
A paired sample t-test compares means from…

Univariate Statistics DAX Cheatsheet

In this short post, I share, and will keep up to date, a list of the most common univariate statistics DAX functions available in Power BI. As of now, most of them are built-in functions, but the more complex ones still require writing some DAX code. I share the list through a Power BI app from which we can simply copy and paste the DAX code of the selected function. The Power BI

Correlation Coefficient in Power BI using DAX

In this post, I describe what the Pearson correlation coefficient is and how to implement it in Power BI using DAX.

What is the Correlation Coefficient

The correlation coefficient is a statistical measure of the strength of the linear relationship between two variables; its values range between -1 and 1. A correlation of -1 shows a perfect negative correlation, and a correlation of 1 shows a perfect positive correlation. A correlation of 0 shows no linear relationship between the movement of the two variables.

How…
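As a rough illustration of the -1 to 1 range described above, the following Python sketch computes the Pearson coefficient from its textbook definition (covariance divided by the product of the standard deviations). It is an assumption-laden sketch, not the DAX implementation from the post; the function name and the sample data are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of cross-deviations and squared deviations; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data: y rises exactly linearly with x, then falls exactly.
    xs = [1, 2, 3, 4]
    print(pearson_correlation(xs, [2, 4, 6, 8]))   # perfect positive relationship
    print(pearson_correlation(xs, [8, 6, 4, 2]))   # perfect negative relationship
```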

Skewness and Kurtosis in Power BI with DAX

In this post, I describe what skewness and kurtosis are, where to use them, and how to write their formulas in DAX.

What is Skewness

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the centre point. For a unimodal (one mode only) distribution, negative skew commonly indicates that the tail is on the left side of the distribution, and positive skew…

Clustering in Power BI using R

Since 2016, Power BI has had a built-in feature that automatically finds clusters within our data. This is a great feature; however, its main drawback is that whenever we add new data to Power BI, the clusters need to be manually recalculated. In this post, I show how to implement clustering in Power BI using R and automatically recalculate the clusters whenever we hit the refresh button.

What is Clustering

Clustering is the…

AB Testing with Power BI

A/B testing, or split testing, is a digital marketing technique that involves comparing two versions of a web page or application to see which performs better.

In this post, I share a step-by-step guide to implementing A/B testing with Power BI.

Visualising Deviation From Average in Power BI

In this post, I illustrate how to implement deviation from average in Power BI and why we should use it. Margin is a key metric for assessing the high-level performance of a company. But sometimes we want to compare the performance of a specific shop, department, or employee with the overall performance of the company, and this is where the deviation from average metric helps. The requirements are as follows:

Company overall margin over time (which is just the margin average)
Employee…

T-Test: Dr. Semmelweis and the discovery of handwashing

This article only illustrates the use of the t-test on a real-life problem; it does not provide any technical information on what a t-test is or how it works. I will go through the t-test in detail in another post and will link to it from this post.

Intro

I was looking for a cool dataset to illustrate the use of T.test and I found this DataCamp project, “Dr. Semmelweis and the discovery of handwashing”. This is a straightforward project but I really…

Central Limit Theorem - example using R

The Central Limit Theorem is probably the most important theorem in statistics. In this post, I’ll try to demystify the CLT with clear examples using R.

The central limit theorem (CLT) states that, given a sufficiently large sample size from a population with finite variance, the distribution of the sample means will be approximately normal, and the mean of those sample means will be approximately equal to the mean of the original population. Furthermore, the CLT states that as you increase the number of samples…
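The idea can be checked with a small simulation. The sketch below uses Python rather than the R code of the post, and everything in it is an assumption made for illustration: the helper names, the exponential population, the sample size of 50, and the seeds. It draws repeated samples from a skewed population and compares the spread of the sample means with the value the theory suggests, the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics


def make_skewed_population(size, seed=0):
    """Build a right-skewed population from an exponential distribution."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) for _ in range(size)]


def sample_means(population, sample_size, n_samples, seed=0):
    """Mean of each of n_samples random samples drawn with replacement."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    population = make_skewed_population(10_000, seed=1)
    means = sample_means(population, sample_size=50, n_samples=2_000, seed=2)

    pop_mean = statistics.fmean(population)
    pop_sd = statistics.pstdev(population)

    print("population mean:        ", pop_mean)
    print("mean of sample means:   ", statistics.fmean(means))
    print("sd of sample means:     ", statistics.stdev(means))
    print("population sd / sqrt(n):", pop_sd / math.sqrt(50))
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the population itself is strongly skewed.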
