Post by Rob Collie

First, a Few Quick Updates…

1) Just a reminder that enrollments are open for the live, in-person two-day classes taught by me (Rob) in Indianapolis and Washington DC (in October and November, respectively.)  Sign up now – those dates will be upon us sooner than we all realize!

2) The pre-order for DAX 2nd Edition is going Great!  Thanks to everyone who has pre-ordered and/or contributed.  PRE ORDER HERE if you have not already, and don’t forget the exclusive swag rewards like the sticker, poster, and t-shirt.

3) Ola Freaking Rosling is now using Power BI, and threw us a “shout-out!”  OMG, check out this tweet.  I nearly peed myself:


If you don’t recognize the name, his dad is Hans Rosling, the speaker in the first TED talk I ever watched:


Click Image to View the Talk

…In which he did the animated bubble chart demo that I’m 100% positive was the inspiration for Amir Netz adding animated bubble charts in the next two product releases Smile

Here’s Ola and Hans co-presenting on the state of world health and social justice, via data of course:


Again, Click to View the Talk

A celebrity in that Google-y, Apple-y, Silicon Valley-y “doing good with data in the public interest” space using Power BI and saying that Power BI is awesome, and that he’d previously been scared off by all the anti-MS propaganda…  well that is simply awesome.

But I won’t lie – that he gave US a mention in that tweet – I *do* find that to be even cooler.  I’m human.  I can’t help it.  I smile every time I think about it.

Slaying the “White Whale” of Variable-Rate Forecasting with PRODUCTX


“Arrrrr!  We Be Meeting Again!”

OK OK…  time to circle back to one of my only “defeats” EVER, and settle the score!  Yes, this problem…  well it beat me up pretty badly about 18 months ago.  I eventually solved it, but not in a way where the formula was even remotely explainable or maintainable.

I’ve done some posts about exponential growth forecasting before.  In both of those posts, I used the POWER() function to multiply out the increasing or decreasing series.  (10 percent growth each year, for five years translates to 1.10 raised to the 5th power).

But both of those posts assume the growth factor is STATIC.  Meaning, if you forecast 10 percent growth per period, it’s 10 percent growth for EVERY period.  That nice, steady percentage allows us to use the POWER function to handle all the multiplication.

Those were easy mode, in other words.  But then, later, I wrote another post that handled variable rates.  I leaned heavily on Jeffrey Wang from Microsoft for that one, because DAX didn’t have a PRODUCTX function at the time.  That was some complex mathmagic-land stuff – converting a multiplication series into a sum by using logarithms and exponents, but it got the job done.


This was harder:  Each year had its own specified growth rate.
(But STILL…  that rate applied to EVERYONE for that year – my next example is worse)

Still, though, even THAT was easy mode compared to the problem I encountered a year later!

What if your growth rates are still variable, but you have multiple populations of customers moving through the forecast, and as those populations “age,” we must look up their growth rate based on how long ago we acquired them?  Meaning, for a given calendar year’s forecast, each group of customers (grouped by their “age”) has to be treated a bit differently?

Here, it’s probably easier to show you:


The Left Hand Column is No Longer a Calendar Year – It’s a Customer’s “Age!”
(Oh, what’s that you say?  That table looks funny?  It’s Power Pivot in Excel 2016!)

In short, THIS problem was a farkin disaster.  The crazy “logarithms, exponents, and SUMX” workaround to create our own version of PRODUCTX was hard enough on its own, but when I THEN had to “nest” THAT contraption of a formula inside ANOTHER crazy series of loops, well, it would have given Obi-Wan one of those terrible disturbances in the Force.  It was an unsatisfactory result.

Now We Have PRODUCTX, for Realz!

With Excel 2016 and Power BI Desktop, though, we now DO have a real PRODUCTX function, so that “inner” complexity is removed.  At least in theory.  So I’ve been meaning to circle back around and give it a shot.

In short…  I beat this sucker.  It was still a bit chewy, but seriously, 20x easier than the last time I tried it (without benefit of PRODUCTX).

Here’s the model:


Three Tables, No Relationships.

image imageimage

The Three Tables:  Intentionally Small, But this Technique Will Work With “Bigger” Data

Description of the Problem

Let’s describe the problem in English, using the tables above as a reference:

  1. Each year from 2008 through 2015, we have a “known” number of new customers who “walked in the door.”
  2. We also have research that tells us that one year after we acquire a customer, there’s a 70% chance that customer is still around.
  3. Any customer who “survives” that first year then has a 60% chance of sticking around for the following year.
  4. We actually get a “bump” in customer loyalty in year three – if you’ve stuck with us that long, chances are good that you REALLY like us, and we’ve found we retain 80% of such customers for the next year.  (Of course, that’s 80% of 60% of 70% of the original incoming customers, so it’s still a minority in the bigger picture).
  5. But then, customers tend to become “satisfied” or “weary.”  For whatever reason, customer retention falls off after that, and in year 4 we only keep 50%, followed by 30%, 20%, 10%, and ultimately 0%.
  6. So, if we want to project how many customers we’re going to have “in the system” for year 2019, we have a very complex problem.  Each of the 2008-2015 populations must be “aged” differently, and then all of those results must be blended together into a unified projection.  Yikes.

Diagram of the Problem

If we want to forecast total customers for 2016, it would look like this:


If We Add X + Y + Z, We Get Our Answer
(There are actually other populations involved, too, Like 2012 and 2011, But The Diagram Was Getting Too Big.  So it would actually be more like U + V + W + X + Y + Z, but you get the idea.)

OK, Enough Setup.  Here’s the Resulting Pivot.


See?  Done.  That Last Column is the Final Measure.

The Formulas

From right to left in the pivot above:

Total Projected Customers:=
[Continuing Customers]+[Customers Acquired]

Customers Acquired:=
CALCULATE(SUM(Incoming[New Customers]),
FILTER(Incoming, Incoming[Year]=MAX(Years[Year])))

Continuing Customers:=
SUMX(FILTER(Incoming, Incoming[Year]<MAX(Years[Year])),
Retention[Year After Being Acquired] <=
MAX(Years[Year])-Incoming[Year] ),
Retention[Retention Rate] )
*Incoming[New Customers])

OK, so that third formula was the hard one.  But WAY easier than before, when we didn’t have PRODUCTX!

Oh???  You want me to EXPLAIN it???  Sheesh folks, it’s 1 am here and I’m catching a taxi at 6:30 am.  I’m gonna be a zombie tomorrow as I fly from one side of the country to the other.

Seriously, I’m just gonna have to take a rain check for tonight. Maybe I’ll explain, tomorrow, in the comments Smile

  Subscribe to PowerPivotPro!


Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology. 

This Post Has 12 Comments

  1. is it possible to add one more wrinkle to this to make even more kick ass? Different customer types (cohorts) that have different retention rates. Suspect that in the model it would mean several different of the “retention” and “incoming” tables. A simple, if overly so, example would be males vs females that have different rebuy rates by year. A more realistic example would be customers acquired via different “channels” (eg online vs in store retail, or discount v full price, etc). You could then draw a straight cause/effect line to strategic marketing investment planning.

      1. No sooner than I hit submit, I publicly shamed myself (basically my dog was in the office) for laziness in asking before trying. Couldn’t one just replicate the base measures [Customers Acquired2], [Continuing Customers2] etc and then feed them all into the [Total Projected Customers]. Shall try it anon!

        Catch up would next week would be great. I’ll PM you…

  2. Just to show a different answer, we had a similar problem in projections and we ultimately solved it by materializing summarized crossjoins as tables in SSAS. In this case we could do a simple transformation to your ‘Retention Rate’ column so that rather than being percent of previous column it becomes % of original Value:

    1 0.70 70%
    2 0.60 42%
    3 0.80 34%
    4 0.50 17%
    5 0.30 5%
    6 0.20 1%
    7 0.10 0%
    8 0.00 0%

    From there build the table we need using CROSSJOIN, then add a column of new customers multiplied by our modified retention rate for each future year. Then we can SUMX all of the same years, and the FILTER it so that we only see the Year for the Years[Year] in the pivot:


    This may be more difficult to conceptualize, but is a particularly powerful technique in SSAS where you materialize the table here, thus making performance much faster in large-data environments.

  3. Rob, I am working on a project that we are working with weeks ago a sale was made against when the lead was first assigned to a sales person instead of years after being aquired. Similar concept but a bit more granular. Would this work off a date dimensions table using a date_values column or would I need to create something like a weeks ago from now in my date dimensions table and query that with DaxStudio or something similar? Just thinking about how you would handle different time frames considering generally you are not stringing weeks and months together as a continous field to act against.

    Hopefully this is not all gibberish.

  4. Hi Rob! This is awesome! I think I understand how this is working. Hope you shed more light on this though – would love to hear your explanation.

  5. I agree – this is AWESOME. This is perfect for projecting enrollments at a university. Just what the doctor ordered. Now, if I could only get my hands on Excel 2016 I would be in great shape!

  6. With SSAS you can make the timing dynamical. With mutiple calendar tables and specific formulas. Because I need this the be possible to pivot per day or week. is there a solution for Power Bi?

Leave a Comment or Question