 A Power BI Technique Mined from the Power Pivot Archives

Below, you will find one of our all-time favorite Power BI techniques. In the ten (yes ten) years we’ve been operating this site, we’ve written over 1,000 articles on Power BI.  Hard to imagine - even for us - but true.

Years ago, we first wrote up this technique in the context of Power Pivot – Power BI’s cousin and predecessor, which we like to call “Power BI in Excel.”

Since the two products share the same brains (DAX and M), this technique is more relevant today than ever. Everything below is 100% legit for Power BI – the same data model “shape,” the same DAX, etc. – but now you can use it to power up your Power BI visuals, and not just PivotTables.  Enjoy! Another Question from the Mr. Excel Forums

Got a question on the forums the other day.  It took some extra twists and turns but the simplest version of the question is worth covering here:  how do I perform a weighted average?

Let’s say you have some data about ZIP (postal) codes in the United States: And you build a simple pivot that shows total population and median age for each ZIP, grouped by that last column, which is how fast the population of that ZIP is growing: Simple SUM Measure for Population, Simple AVERAGE for Median Age

But that “Average Median Age” measure is just:

[AVERAGE MEDIAN AGE]=
AVERAGE(Zips[Med Age])

which treats all ZIP codes as equal, even if they have wildly different populations: The Median Age of Very Low-Population ZIP Codes (Like Pop=277)
is Counting Equally to Higher-Pop ZIP Codes (Like Pop=21484)

Correcting so that Median Age is Weighted by Population

Let’s write a new measure:

[Average Med Age Weighted by Pop]=
SUMX(Zips, Zips[TTL Pop] * Zips[Med Age]) / [Total Population]

And then view the results: PowerPivot Weighted Average Measure Compared to Non-Weighted Average

And that is what we’d expect – since the smaller ZIP codes of population 277 and 427 both have very young median ages, they were “unfairly” skewing the subtotal younger.  When overall, fast-growing ZIPs have a lower median age once you weight them for population.

Even cooler – I can remove the ZIP code field from the pivot and the weighted average measure still works: Weighted Average Measure Works Even When
the “Child” Field (ZIPCode) is not on the Pivot

For more information on SUMX(), see this post.

Rob Collie

One of the original engineering leaders behind Power BI and Power Pivot during his 14-year career at Microsoft, Rob Collie founded a consulting company in 2013 that is 100% devoted to “the new way forward” made possible by Power BI and its related technologies. Since 2013, PowerPivotPro has rapidly grown to become the leading firm in the industry, pioneering an agile, results-first methodology never before seen in the Business Intelligence space. A sought-after public speaker and author of the #1-selling Power BI book, Rob and his team would like to help you revolutionize your business and your career.

This Post Has 18 Comments
1. David Hager says:

Very useful. Thanks!

2. Nick says:

Rob, you’re the best! I’ve been trying to figure this out for a couple months, but I gave up and came up with a really tortured duct-taped back-door sumproduct-type solution. This is just what I needed.

I’m always learning something useful from this site – thanks!

1. powerpivotpro says:

That’s awesome Nick, thanks 🙂 Really happy to hear things like this.

3. ratio says:

How did you come by [Total Population]

1. powerpivotpro says:

It’s a simple measure formula: SUM(Zips[TTL Pop])

Is that what you are asking?

4. Ted says:

I cannot get the result as my calculation still shows the simple average in the pivot table. Is there something else to the pivot table that is necessary? Should the weighted calculated field be summarized by “average” in the pivot table?

1. powerpivotpro says:

Hi Ted. No, you shouldn’t have to change anything, the formula should just do the right thing.

Can you post the formulas you are using for the various measures? That might not be enough for me to diagnose but it’s a start.

Also please tell me what fields you have on your pivot (on rows/columns/slicers).

1. Ted says:

I got your you response to Ratio working correctly on my fact table. So, I have a follow up question. I also have a dimensions table with a 1:many relationship to the fact table.
This is the formula:

{percent agreement with the survey statement=
averagex(distinct(dim_data[Respondent ID]),fact_data[av_wgt]/
sum(dim_data[Final Weight]))
where:
AVERAGEX function used for my data in place of SUMX
[Respondent ID] is the link to the dimensions table (same result if this is from the fact table)
[av_wgt] is the binary response times the weight ([Final Weight])

My Pivot Table looks something like this:
Row Labels Sum of av_wgt Sum of Final Weight Sum of Pct_Agree
Q1A1 813870.32 1156587.76 0.704
Female 813870.32 498125.09 0.704
Male 813870.32 658462.67 0.704
Q1A10 440769.14 1156587.76 0.381
Female 440769.14 498125.09 0.381
Male 440769.14 658462.67 0.381
GENDER is from the dimensions table, and the Q1An values are from the fact table (giving the 1:many relationship). Q1A1 and Q1A10 are correct, but the sum of Pct_Agree does not change when GENDER from the dimensions table is added as a row label; same for Sum of av_wgt. How do you get the rollups to change with the dimensions table fields?

Thanks for you help.

1. Ted says:

Sorry about the Pivot Table format. Pasting this table into Excel may help:

Row Labels, Sum of av_wgt, Sum of Final Weight, Sum of Pct_Agree
Q1A1,, 813870.32, 1156587.76, 0.704
,Female, 813870.32, 498125.09, 0.704
,Male, 813870.32, 658462.67, 0.704
Q1A10,, 440769.14, 1156587.76, 0.381
,Female, 440769.14, 498125.09, 0.381
,Male, 440769.14, 658462.67, 0.381

2. powerpivotpro says:

Is fact_data[av_wgt] a measure or a column?

BTW when referencing a measure in a measure formula, I always omit the table name, and when referencing a column, I always include it.

EX: [Measure] and Table[Column]

That way I can always quickly tell the difference.

You may already be doing that of course, but it was a relevant tip to share just in case.

1. Ted says:

fact_data[av_wgt] is a measure, and I changed it to [av_wgt] per your suggestion. It does not change in the Pivot Table when GENDER (a dimension variable) is added to the rows.

2. powerpivotpro says:

Well I was really hoping that was a column, so I could suggest you use a measure instead 🙂

Any chance you can send me the workbook? Even a version where the data has been replaced with fake data, or sensitive identifying columns removed?

I’m rob. At a place called pivotstream. dot com.

5. Karen McGinnis says:

Please HELP! I am trying to calculate forecast accuracy by product type for my national sales meeting. I have a pivot which has all of our items grouped by their product type. The Row labels are Product type / item; the column headers are 1st Qtr Forecast, 1st Qtr Invoice. I did a simple calculated feild to come up with the % of each invoice to forecast. (new column 1st qtr Inv vs Fcst)

I want to do a weighted average when I sum this up to cagetory so that if an item that sells 10K units is only 60% accurate and 3 items that only sell 1K units are 100% accurate the average would be weighted based on what they sell.

I did a 2nd calculated feild that gave a weight of each item based on the total of the category and that works. I just need to know how to write the formula so that it takes the weight I just created and uses it with the % Inv to Fcst.

6. Christopher Hill says:

Any insight as to how to calculate a Weighted Standard Deviation in DAX? For example, using standard Excel tables (assuming table=T, Values=[Value], and Weight=[Weight]), to calculate a Weighted Standard Deviation you could use the following formula:

=SQRT(SUMPRODUCT((SUMPRODUCT(T[Value],T[Weight])/SUM(T[Weight])-T[Value])^2,T[Weight])/(SUM(T[Weight])-1))

The main crux of the issue I am running into is of course DAX doesn’t have a SUMPRODUCT() formula, but rather SUMX, which changes the context. The first part of the formula:

SUMPRODUCT((SUMPRODUCT(T[Value],T[Weight])/SUM(T[Weight])-T[Value])^2

Effectively is calculating the variance (e.g. how far each row varies from the mean), however in DAX, I had to change the sumproduct()’s to sumx()’s, which in turn, modify the context so the ‘mean’ it is calculating is based upon the single row, opposed to all rows (within the filter). While a simple ALL() will return it to the TOTAL table context, this also ignores any slicers/filters that may be applied.

Thoughts?

7. Bob says:

How would [Total Population] be written if for example Group A is sliced out?
Cause then you don’t want to divide by the whole popluation, but only the (sum of the ) population of Groups B till H.

8. Sushil says:

Rob, Thank you very much. It my first time to make my hand dirty with DAX and I got it right with you.

Got reasonable idea to build my logics. Very good reference point. these few building blocks one need to understand and rest shall follow on your skill.

9. Brandon West says:

OMG I have been scouring the net for a solution to my weighted avg problem and tried ridiculous complicated formulas/tables/measures… but yours finally fixed my problem. Thanks so much!

1. Rob Collie says:

Our pleasure 🙂