Arithmetic Mean vs. Geometric Mean in Power BI
In this blog post, we will explore the difference between the arithmetic mean and the geometric mean, with practical examples in Power BI using the built-in DAX function GEOMEAN.
Arithmetic Mean
The arithmetic mean, also known simply as the “mean” or “average,” is the most common measure of central tendency. It is calculated by summing all the values in a data set and then dividing by the number of values. In DAX, you can use the AVERAGE function to compute the arithmetic mean.
Example:
Let’s say we have the following daily sales figures for a retail store:
- Day 1: $100
- Day 2: $200
- Day 3: $300
- Day 4: $400
To calculate the arithmetic mean in Power BI, we would use the following DAX formula:
ArithmeticMean = AVERAGE('Sales'[DailySales])
The arithmetic mean of these sales figures is ($100 + $200 + $300 + $400) / 4 = $250.
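To make the "sum divided by count" definition concrete, here is a minimal sketch that builds the four sample days as a calculated table and spells out the average by hand. The column name Day and the measure name are assumptions made for illustration. In Power BI the table and the measure are entered separately.

```dax
-- Calculated table with the four sample days (the "Day" column is a hypothetical name)
Sales =
DATATABLE (
    "Day", INTEGER,
    "DailySales", CURRENCY,
    { { 1, 100 }, { 2, 200 }, { 3, 300 }, { 4, 400 } }
)

-- Measure: sum of the values divided by the number of values
ArithmeticMean Manual =
DIVIDE ( SUM ( 'Sales'[DailySales] ), COUNT ( 'Sales'[DailySales] ) )
```

On this sample data the manual measure should match AVERAGE, since both compute 1000 / 4.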
Geometric Mean
The geometric mean is another measure of central tendency that is calculated by multiplying all the values in a data set and then taking the nth root, where n is the number of values. It is useful when comparing values with different units or measuring the average rate of return on investments. In DAX, you can use the GEOMEAN function to compute the geometric mean.
Example:
Now, let’s imagine we have a data set showing the percentage growth rates of a company’s revenue over four years:
- Year 1: 5%
- Year 2: 10%
- Year 3: 15%
- Year 4: 20%
To calculate the geometric mean of these growth rates in Power BI, we would use the following DAX formula:
GeometricMean = GEOMEAN('GrowthRates'[YearlyGrowthRate])
The geometric mean of these growth rates is (5 × 10 × 15 × 20)^(1/4), or approximately 11.07%.
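GEOMEAN takes a single column, while its iterator counterpart GEOMEANX takes a table and an expression evaluated for each row. As a small sketch (the measure name is an assumption), the same calculation in iterator form would look like this:

```dax
-- Iterator form: first argument is the table, second is the per-row expression
GeometricMean Iterator =
GEOMEANX ( 'GrowthRates', 'GrowthRates'[YearlyGrowthRate] )
```

The iterator form is handy when the value to average has to be computed per row rather than read directly from a column.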
Arithmetic Mean vs. Geometric Mean
If we used the arithmetic mean of these growth rates instead, the result would be (5% + 10% + 15% + 20%) / 4 = 12.5%.
ArithmeticMeanGrowth = AVERAGE('GrowthRates'[YearlyGrowthRate])
In that case, using the arithmetic mean would be misleading. The arithmetic mean assumes that the growth rates are additive rather than multiplicative, so it does not take into account the compounding effect of growth over time.
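The measures above apply each mean to the percentage values themselves. A common convention for making the compounding explicit, shown here only as a sketch, is to average the growth factors (1 + rate) and subtract 1 at the end. The measure name is hypothetical, and the code assumes YearlyGrowthRate is stored as a decimal fraction (0.05 for 5%).

```dax
-- Assumption: YearlyGrowthRate holds decimal fractions, e.g. 0.05 for 5%
CompoundAverageGrowth =
GEOMEANX ( 'GrowthRates', 1 + 'GrowthRates'[YearlyGrowthRate] ) - 1
```

For the four sample rates this works out to (1.05 × 1.10 × 1.15 × 1.20)^(1/4) − 1, or about 12.36% per year. That is the constant yearly rate that would produce the same total growth over the four years.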
Geometric Mean and Outliers
Imagine a dataset representing the annual percentage growth rates of subscribers for an online platform over five years:
- Year 1: 10%
- Year 2: 20%
- Year 3: 30%
- Year 4: 15%
- Year 5: 300% (a viral marketing campaign led to an exceptionally high growth rate this year)
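Now let’s compute the arithmetic mean, the geometric mean, and the median. For this data the arithmetic mean is (10 + 20 + 30 + 15 + 300) / 5 = 75%, the median is 20%, and the geometric mean is approximately 30.6%.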
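All three measures can be written with built-in DAX functions. The sketch below assumes the five yearly rates live in a table called Subscribers with a column GrowthRate. Both names, and the measure names, are hypothetical.

```dax
-- 'Subscribers'[GrowthRate] is a hypothetical table and column holding the five yearly rates
Growth Arithmetic Mean = AVERAGE ( 'Subscribers'[GrowthRate] )

Growth Median = MEDIAN ( 'Subscribers'[GrowthRate] )

Growth Geometric Mean = GEOMEAN ( 'Subscribers'[GrowthRate] )
```

Each line is a separate measure, so they can be placed side by side in a card or table visual for comparison.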
The arithmetic mean is greatly influenced by the extreme value (the 300% growth rate in Year 5) and, as a result, is much higher than the median and the geometric mean. This makes the arithmetic mean less representative of the general trend in this case.
The median is usually robust to outliers, but it does not just ignore the extreme values: it ignores every value except the middle one, so it might not work well with small data sets.
The geometric mean is less sensitive to extreme values than the arithmetic mean and provides a more representative measure of central tendency in this case. It also takes into account the compounding effect of growth rates over time, making it the most appropriate choice for measuring the average growth rate in this example.
Limitations
- Because it relies on products and roots rather than sums, the geometric mean can be harder to understand and interpret than the arithmetic mean.
- When one or more observations in a dataset have a value of zero, the geometric mean will always be zero, regardless of the other values in the dataset.
- If some of the values are negative, it may not be possible to compute the geometric mean, because the product can become negative and an even root of a negative number is not a real number.
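One way to keep these limitations from producing surprising numbers is to return a blank whenever the data contains a zero or negative value. This is only a sketch of one possible policy, with a hypothetical measure name. Whether to blank out, filter, or shift such values depends on the analysis.

```dax
GeometricMean Guarded =
-- Count rows whose rate is zero or negative
-- (blank rates also count here, because DAX compares BLANK as 0)
VAR _nonPositive =
    COUNTROWS (
        FILTER ( 'GrowthRates', 'GrowthRates'[YearlyGrowthRate] <= 0 )
    )
RETURN
    IF (
        _nonPositive > 0,
        BLANK (),   -- signal that the geometric mean is not meaningful
        GEOMEAN ( 'GrowthRates'[YearlyGrowthRate] )
    )
```

Returning BLANK keeps the measure out of visuals instead of showing a zero or an error that could be mistaken for a real average.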
Calculate the geometric mean by hand
Another way of writing the geometric mean in Power BI is to use the PRODUCT function. Using the growth rate example, the formula is as follows:
GeometricMean with product =
// Number of rows (values) in the table
VAR _nvalue = COUNTROWS ( GrowthRates )
RETURN
    // nth root of the product of all values
    PRODUCT ( GrowthRates[YearlyGrowthRate] ) ^ ( 1 / _nvalue )
Finally, if we want to calculate the geometric mean without using either GEOMEAN or PRODUCT (which are not available in older versions of DAX), we can use the logarithm approach, which has the following formula:
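GeometricMean with log =
VAR _nvalue = COUNTROWS ( GrowthRates )
// LOG defaults to base 10 when no base is given
VAR _sumlogs = SUMX ( GrowthRates, LOG ( GrowthRates[YearlyGrowthRate] ) )
RETURN
    // Raise 10 to the mean of the logs to undo the base-10 logarithm
    10 ^ ( _sumlogs / _nvalue )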
Both of these formulas give the same result as the built-in function (up to floating-point rounding). I wanted to highlight that even when a function is missing, we can often write a DAX formula to get the same result.
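The base of the logarithm does not matter as long as the final step uses the matching inverse. As a sketch (the measure name is an assumption), the same idea with natural logarithms can be written in one expression using LN, AVERAGEX and EXP:

```dax
-- Mean of the natural logs, then EXP to undo the logarithm
GeometricMean with ln =
EXP ( AVERAGEX ( GrowthRates, LN ( GrowthRates[YearlyGrowthRate] ) ) )
```

Averaging logs instead of multiplying the raw values also avoids very large intermediate products when the table has many rows.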
Conclusion
In this short post, we explored what the geometric mean is and its advantages over the arithmetic mean.
While the arithmetic mean is more commonly used, the geometric mean can provide a better representation of central tendency in specific scenarios: analyzing rates of change or investment returns, calculating average compound interest, working with highly volatile data, or handling any other data with multiplicative properties.