February 25, 2024

Calculate Different Averages with fnMegaAverage()

The tween finishes Dad with the epic finishing move, “Dad, you’re mega average!”

I was doing some training on a simple concept: averages. We mostly deal with then arithmetic mean, which is the sum of all observations divided by the count of all observations. It is simple and very easy to understand.

But there are others. Sometimes, the average does not make sense, because the data is skewed by outliers. Often, people divert to the median, which is in basic terms the middle value in the dataset (for even counts it is the average of the two values closest to the center after sorting).

But it gets even more complicated after that. There is also the Winsorized mean, which is replacing the values below the +/- nth percentile with those percentiles. There is also the trimmed mean, which removes outliers before providing the mean.

Of course, I tried to figure this out with Power Query.

The Code

Here is a link to the gist.

The function returns all of the means mentioned above with the median in a record. This makes for easy comparing between all the different means.

For the Winsorized mean, I used 5th and 95th percentiles as parameters, which makes it so that any values outside of those percentiles (less than 5th, greater than 95th) are replaced with the closest parameter.

For the trimmed mean, I used 1.5 multiplied by the inter-quartile range (IQR) as the basis for trimming. If a value is less than the first quartile minus 1.5 * IQR or if a value is more than the third quartile plus 1.5 * IQR, it is removed from the mean calculation.

Example

Just refer to the table and column like so and you will get a record with the values:

Is This Helpful?

Yes, I believe so. Outlier analysis is a helpful thing to do and figuring these things out in Power Query can be easy. I also like functions I can use and reuse, even if they don’t ultimately end up in the data model.

You may also like