Using DAX Magic For Variable Forecasting

March 27, 2018 at 11:05 am

I like the use of
Work Day: =
SWITCH ( [Day Number of Week], 1, 0, 7, 0, 1 )
and understand that it is for the Middle East, but why the 7? Do they work compressed hours that day?

April 3, 2018 at 8:57 am

Thanks Stephen. This formula is used in conjunction with [Day Number of Week] to figure out weekends similar to NETWORKDAYS or NETWORKDAYS.INTL in Excel. In some countries, weekends are Fri/Sat, which can affect the number of billable days.
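
As a rough illustration of the weekend flag, here is a small Python sketch (not DAX). It assumes [Day Number of Week] runs from 1 = Sunday to 7 = Saturday, which the thread does not confirm. Under that assumption the SWITCH above marks Saturday and Sunday as non-working, and a Fri/Sat weekend would zero out 6 and 7 instead. The name `work_day` and its `weekend` parameter are made up for the example.

```python
def work_day(day_number, weekend=(1, 7)):
    """Return 1 for a working day and 0 for a weekend day.

    Assumes day_number runs 1 = Sunday ... 7 = Saturday (an assumption,
    not something stated in the post).
    """
    if not 1 <= day_number <= 7:
        raise ValueError("day_number must be between 1 and 7")
    return 0 if day_number in weekend else 1


# Sat/Sun weekend, mirroring SWITCH ( [Day Number of Week], 1, 0, 7, 0, 1 )
sat_sun = [work_day(d) for d in range(1, 8)]

# Fri/Sat weekend, as used in some countries
fri_sat = [work_day(d, weekend=(6, 7)) for d in range(1, 8)]
```

Either way a full week contributes five working days; only the position of the zeros changes.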

March 27, 2018 at 3:08 pm

I’ve been waiting for the day to see this kind of simulation mindset leak into Power BI.

I thought it held a lot of promise.

Unfortunately, by design, I don’t think this solution is scalable. Imagine having 3, 5, or maybe 10 years instead of 1, and hundreds of assumptions that feed into measures built upon measures. And of course, if you have to let the user see how sensitive the results are to those assumptions, then you need to wrap the assumptions in what-if scenario parameters.

Because measures are calculated at run time, this becomes a very costly operation. Good luck changing any assumption parameters to other values, or even applying different filters.

Another way of avoiding this issue is to create intermediate calculated tables (only available in Power BI and not Excel, though workarounds with Power Query can work). The problem with this is that slicers can’t affect calculated tables, because they are not evaluated at run time. The only possible way to have a calculated table hold all your results for a simulation, and still be adjustable by parameter, is to literally calculate every possible outcome from all variables. As you can imagine, these tables become exponentially big: 1 row for every day in your date table * every possible value for parameter 1 * every possible value for parameter 2, etc.
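
To put numbers on that growth, here is a back-of-the-envelope Python sketch. The parameter counts (10 values each) and the function name are invented for illustration; only the "days times every value of every parameter" formula comes from the comment above.

```python
from math import prod


def calculated_table_rows(num_days, values_per_parameter):
    """Rows needed to precompute one result per day per parameter combination."""
    return num_days * prod(values_per_parameter)


# 1 year of dates, 3 what-if parameters with 10 possible values each
one_year = calculated_table_rows(365, [10, 10, 10])              # 365,000 rows

# 10 years of dates, 5 what-if parameters with 10 possible values each
ten_years = calculated_table_rows(3650, [10, 10, 10, 10, 10])    # 365,000,000 rows
```

Each extra parameter multiplies the row count by its number of possible values, which is why the table gets out of hand so quickly.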

I tried building an entire financial model once inside PowerPivot. It ultimately suffered from performance issues, and I had to revert to classic Excel models.

I’m not saying it’s not doable. But it’s a long ways away from achieving interactive, complex, scalable solutions.

Unless the technology changes… (dynamic table generation for example)

April 3, 2018 at 9:18 am

Bob, thanks for this! It’s one of the most useful comments I’ve ever received on DAX. I too really (really!) want this approach to be more scalable. I’m currently piloting a project using four or five variables (and multiple years) and the performance is passable–I’m still playing with it. In the course of developing that project, I have used this on the fly and for much simpler analysis, so it’s still proved extremely useful to me.

I’m interested in a couple comments you made if you’re willing to elaborate:

1. Can you tell me what you mean by “by design”?
2. Can you elaborate on how you use intermediate calculated tables and also the PQ “work-arounds”?
3. Have you ever found a hybrid Excel-PQ/DAX model useful? I’m thinking about going this way to take care of the more costly calculations.

March 28, 2018 at 1:00 am

Hi Matthew –

Very cool post, I like that you’re flexing your comfort zone with Power Query! I didn’t expect a fully disconnected data model, but you’ve done some awesome things with it.

From a performance standpoint, you may want to be careful using SUMX() in each of your measures. It’s hard to test performance in DAX Studio on your data model (because it’s so small), but looking at your measures from a 10,000 foot view, here’s what I found:

[Num of Working Days] iterates over the entire calendar table…twice. Once because FILTER() scans each row to determine whether it passes the Start Date / End Date condition (which it needs to calculate for each row), and then again because SUMX() scans each of the passing rows. With a 1 year table, this is scanning 365 rows. This will happen very quickly, but it’s going to compound as we’ll see.

[Total Working Days] iterates over the DataInput table, and each time it does so it calls [Num of Working Days]…which scans the Calendar table twice. A 10 row table is then going to scan the Calendar 10 times.
I understand the need to have row specific start and end dates that feed into the [Num of Working Days] measure, so I think SUMX() is unavoidable here.

[Labor Cost] iterates over the DataInput table, and each time it does so it calls [Total Working Days], which iterates over the DataInput table again, and in turn will iterate over the Calendar table twice… Now we’re scanning DataInput 100 times and scanning the Calendar table 100 times.
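
The compounding can be tallied with a naive Python model. This is only the worst-case count described above: it treats the SUMX() pass as touching every calendar row, and it ignores context transition and any caching the engine does, so real query plans will differ. The 365-row calendar and 10-row DataInput are the sizes used in this comment; the function names are made up.

```python
CALENDAR_ROWS = 365   # 1 year calendar table
DATA_ROWS = 10        # example DataInput size


def num_working_days_visits():
    # One FILTER() pass plus one SUMX() pass over the calendar (upper bound)
    return 2 * CALENDAR_ROWS


def total_working_days_visits():
    # Calls [Num of Working Days] once per DataInput row
    return sum(num_working_days_visits() for _ in range(DATA_ROWS))


def labor_cost_visits():
    # Calls [Total Working Days] once per DataInput row
    return sum(total_working_days_visits() for _ in range(DATA_ROWS))


visits = (
    num_working_days_visits(),
    total_working_days_visits(),
    labor_cost_visits(),
)
```

In this model each level of nesting multiplies the work by the number of DataInput rows, so growth is quadratic in DataInput for [Labor Cost].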

I know that the most recent versions of the Vertipaq Engine have added optimizations to both SUMX() and FILTER() that cache certain queries to improve performance, but I would hesitate to use these measures as is in a production environment. At scale, these measures will not perform very well.

A more efficient version of the first measure would be:

Number of Working Days :=
VAR StartDate =
    MAX ( DataInput[Start Date] )
VAR EndDate =
    MAX ( DataInput[End Date] )
RETURN
    CALCULATE (
        SUM ( 'Calendar'[Work Day] ),
        'Calendar'[Date] >= StartDate,
        'Calendar'[Date] <= EndDate
    )

This stores the Start Date and End Date values as variables that are only calculated once, allowing you to remove the FILTER() argument (since you can check a column against a scalar value using just CALCULATE()). Since you're only summing up 1 column, SUM() can replace SUMX().
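
For readers who want to sanity-check the numbers outside the model, the logic of this measure for a single DataInput row can be mimicked in Python. This is a sketch, not a translation of the engine's behaviour: it assumes a Sat/Sun weekend, and the function name is invented.

```python
from datetime import date, timedelta


def number_of_working_days(start, end, weekend=(5, 6)):
    """Count days in the inclusive range [start, end] that are not weekend days.

    Uses date.weekday(), where Monday = 0 and Sunday = 6, so the default
    weekend is Saturday/Sunday (an assumption for this example).
    """
    span = (end - start).days + 1
    return sum(
        1
        for offset in range(max(span, 0))
        if (start + timedelta(days=offset)).weekday() not in weekend
    )


# July 2018 has 31 days, of which 4 are Saturdays and 5 are Sundays
july_2018 = number_of_working_days(date(2018, 7, 1), date(2018, 7, 31))  # 22
```

As in the measure, the start and end dates are resolved once and then used as plain scalar bounds on the calendar.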

I would leave the [Total Working Days] measure as is.

Your third measure could be optimized as follows:

Labor Cost :=
CALCULATE (
    SUMX ( DataInput, DataInput[Hourly Cost] * 8 * [Number of Working Days] )
)

Since the [Total Working Days] measure just sums up all working days for each row of the DataInput table, we can just call that [Number of Working Days] measure inside of the same row context and we'll get the same result, without having to nest SUMX()'s.
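
In plain terms, that single SUMX() computes hourly cost * 8 hours * working days for each row and adds the results up. A small Python sketch of the same arithmetic follows; the two DataInput rows are hypothetical, and a Sat/Sun weekend is assumed.

```python
from datetime import date, timedelta


def working_days(start, end):
    """Monday-Friday days in the inclusive range [start, end]."""
    span = (end - start).days + 1
    return sum(
        (start + timedelta(days=i)).weekday() < 5 for i in range(max(span, 0))
    )


# Hypothetical DataInput rows: (hourly cost, start date, end date)
data_input = [
    (50.0, date(2018, 7, 2), date(2018, 7, 6)),    # 5 working days
    (80.0, date(2018, 7, 2), date(2018, 7, 13)),   # 10 working days
]

HOURS_PER_DAY = 8  # the constant 8 in the measure

# One pass over the rows, evaluating working days per row
labor_cost = sum(
    cost * HOURS_PER_DAY * working_days(start, end)
    for cost, start, end in data_input
)
```

Here the first row contributes 50 * 8 * 5 = 2,000 and the second 80 * 8 * 10 = 6,400.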

I'm interested to hear some more details of how you used this for forecasting in models! Maybe a followup post with more details on some of your use cases?

Cheers,

~ Chris

April 3, 2018 at 9:23 am

Thanks for this comment, Chris–very helpful. I’m trying to educate myself on the Vertipaq engine so I can more quickly recognize efficient vs. costly approaches. Agreed on X functions–I always use them with trepidation. Marco Russo provided some commentary on my first post that helped me personally understand them better.

April 3, 2018 at 9:28 am

Also, Chris, thanks for the feedback on VAR specifically. Not using VAR is an old habit that I’ve been trying to break (successfully even). I can’t remember if I left it out for simplicity’s sake or because the workbook I used as a template was developed before the habit was broken. At any rate, thanks for bringing it up!

April 11, 2018 at 11:08 pm

No worries Matt, just want to make sure your awesome models run smoothly! Excited for more posts from you.

March 28, 2018 at 7:42 am

Hi Matthew. Thanks for putting this post together. Some great stuff to help a comparative newcomer get to grips with DAX (and M).

I just had one query about the ‘Number of working days’ measure. Should the first filter expression use MIN instead of MAX? Otherwise you’re counting days from the latest start date, which is OK when the overall filter context returns a single row, but not when multiple rows are returned.

Regards, Ian

April 3, 2018 at 9:30 am

Thanks for the comment, Ian. Did you notice an issue in the result?

April 3, 2018 at 9:49 am

On a second look at your comment and the model, I see your point. If you create a pivot table and drop in [Number of Working Days], I think one sees the result. This measure is meant to be a stepping stone for [Total Working Days], which “fixes” the issue by iterating [Number of Working Days] over the DataInput table (line by line).
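
A quick Python mock-up of the total-line behaviour, for anyone following along. The two date ranges are hypothetical and a Sat/Sun weekend is assumed; the point is only that a single MAX-based (or MIN-based) range over several rows is not the same as summing row by row.

```python
from datetime import date, timedelta


def working_days(start, end):
    """Monday-Friday days in the inclusive range [start, end]."""
    span = (end - start).days + 1
    return sum(
        (start + timedelta(days=i)).weekday() < 5 for i in range(max(span, 0))
    )


# Hypothetical DataInput rows: (start date, end date)
rows = [
    (date(2018, 7, 2), date(2018, 7, 6)),     # 5 working days
    (date(2018, 7, 16), date(2018, 7, 27)),   # 10 working days
]

starts = [s for s, _ in rows]
ends = [e for _, e in rows]

# Total line with MAX start / MAX end: only the latest range is counted
max_start_total = working_days(max(starts), max(ends))   # 10

# Total line with MIN start / MAX end: the gap between projects is counted too
min_start_total = working_days(min(starts), max(ends))   # 20

# Iterating row by row, as [Total Working Days] does
per_row_total = sum(working_days(s, e) for s, e in rows)  # 15
```

With a single row in context all three agree, which is why the difference only shows up on the total line.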

March 28, 2018 at 3:57 pm

Can you explain why the total working days measure is needed? It seems as though the first measure, number of working days is returning the same result.

April 3, 2018 at 9:51 am

If you create a pivot table, drop in [Number of Working Days], and compare it to [Total Working Days], there is a different result on the total line. The [Number of Working Days] measure is meant to be a stepping stone for [Total Working Days], which “fixes” the issue by iterating [Number of Working Days] over the DataInput table (line by line).

April 23, 2018 at 4:42 am

Why use SUMX instead of simple SUM? The Filter function won’t work with simple SUM?

July 17, 2018 at 7:56 pm

Hi, Fantastic. Thank you so much. We have “Estimated hours” instead of “Hourly costs”. We intend to report on billability percentage per consultant, or workload per consultant per week, based on start dates and end dates. The assumption is that 30 hours per week is 100% billable. Can I use your suggestion for this too?

May 22, 2019 at 6:49 pm

Hi Matt
How did you get the calendar relationship to work so that the pivot table displayed all the month workdays? I made the relationship with the start date and it only displays July and the total work days. It does not display August through to June 🙁
Regards James
