Nested SUMX or DAX Query?

August 6, 2015 at 4:44 pm

That’s a very nicely laid out blog post, Matt. I particularly like the way you home in on the right solution step by step using DAX Studio. That makes it a lot easier to follow and understand both the process and, not least, the logic behind the final solution.

I believe that DAX Studio is a much overlooked tool for learning DAX and debugging DAX formulas. That especially goes for people who work exclusively inside Excel (as opposed to working with SSAS Tabular and its client tools).

Very fine blog post.

August 6, 2015 at 5:06 pm

Thanks for your comments and support. I agree with your comments about DAX Studio, and also about writing DAX. I think it is good for everyone learning DAX to realise that there is a process for getting to the right answer, and that very few brainiacs can just pull the right answer out the first time.

November 16, 2015 at 11:01 am

Does anyone know a source for learning a little more about how to use DAX Studio? This is about the best blog I’ve found with an example of someone using the software, and the documentation at https://daxstudio.codeplex.com/documentation really wasn’t very helpful. Ferrari has a good video on optimizing DAX queries, but it uses SQL Profiler and SSMS for the analysis, which is still way beyond my Excel brain’s comprehension. https://www.youtube.com/watch?v=1v0xUX8Bve8

November 16, 2015 at 12:25 pm

I think the issue with the lack of content and documentation is that DAX Studio was built for SQL Server pros who are using Power Pivot. These people already know how to use the profiling tools (and the query language), and hence they don’t need the documentation. The rest of us mere mortal Excel users have to hunt, scrounge and teach ourselves. This sounds like a good thing to add to the list of potential blog posts.
I know there is a post at TinyLizard https://tinylizard.com/dax-studio/

August 6, 2015 at 6:13 pm

Thanks for sharing your path to DAX efficiency.

October 14, 2015 at 9:34 am

Hello, I have the following problem:

My table has Prices, Products, Dates and Stores. Something like a price history.

What I need is to find the latest price for every store (provided the price falls within the given Date filter context) and return the average of those latest prices.

LASTDATE and LASTNONBLANK don’t seem to work, because without Store context they always return the latest available date, and Stores where I don’t have a price for that date are skipped.

Thank you

Martin

October 15, 2015 at 12:18 am

Hi Martin

It is difficult to give advice without seeing the data model. I suggest you prepare a sample workbook that has data in the structure you are using, and then post the question along with the sample workbook on https://powerpivotforum.com.au

October 15, 2015 at 5:18 am

Hi Matt, I did as you said, thank you.

August 1, 2017 at 9:19 pm

Hi Matt, I’ll give it a shot and put into practice what I have learned from you, the Italians and Rob.
Hi Martin, because you need the latest price for every store and the average price for all stores, the following DAX query computes only the values you would see in a pivot table with Store on Rows and AveragePrice on Values (the LastPrice for every Store), plus the AveragePrice for all Stores in the Total row. I have not taken the Products column into account, given that you don’t need average prices by product. You can test this DAX query in DAX Studio and use it to create the measure “AveragePrice”. If you use a Slicer to filter the pivot table by Dates, it will show you the last price for every store and the average across all stores that had sales on that date. If you don’t filter by dates, you will get the last date, and the average price on that last date, for each store that had sales. Hope this is what you requested.

EVALUATE
ROW (
    "AveragePrice",
    AVERAGEX (
        ADDCOLUMNS (
            SUMMARIZE ( Data, Data[Store] ),
            "LastDate", LASTDATE ( Data[Dates] ),
            "LastPrice", LASTNONBLANK ( Data[Prices], 1 )
        ),
        [LastPrice]
    )
)
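
A possible variant of the query above, offered as a sketch rather than a tested answer. As far as I understand LASTNONBLANK, it walks Data[Prices] in the sort order of the prices themselves, so it may pick each store's highest price rather than its most recent one. The version below first finds each store's latest date and then averages the prices recorded on that date. It assumes the same Data table and columns (Store, Dates, Prices) and a DAX version that supports VAR (Excel 2016 or Power BI). It has not been checked against Martin's actual model.

```dax
EVALUATE
ROW (
    "AveragePrice",
    AVERAGEX (
        VALUES ( Data[Store] ),    -- one row per store visible in the current filter context
        -- CALCULATE triggers context transition, so MAX is evaluated for the current store only
        VAR LastDateForStore = CALCULATE ( MAX ( Data[Dates] ) )
        RETURN
            -- Price(s) of the current store on its own latest date;
            -- several products on that date are averaged together
            CALCULATE (
                AVERAGE ( Data[Prices] ),
                Data[Dates] = LastDateForStore
            )
    )
)
```

To turn this into a measure, the expression inside ROW (everything from AVERAGEX onwards) would be used as the measure formula. A date slicer would then limit which dates count as "latest" for each store.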

November 18, 2015 at 5:34 pm

I followed your pattern on Same Store Sales to create a model that compares CY and LY employee hours only for instances where CY & LY are both 0 for a given month, for data that is already aggregated at a month level. When using the ADDCOLUMNS and SUMMARIZE pattern, I got an error that I doubt you would encounter with store sales. In my employee hours data, sometimes an employee’s hours will drop to zero in the midst of non-zero hours in the months before and after (the employee goes on leave or something similar). I get the non-contiguous dates error in the calculation of the LY measure. I believe this is because non-contiguous dates are passed back to the CALCULATE function by the filter, as the date for the zero-month data is abruptly dropped. I wonder if you get the same error if you change the sales from one store to zero in a period where there are sales before and after (i.e. all of a sudden, sales just drop to zero). I’ve been struggling to find a workaround. I have not tried the other SUMX patterns presented here, but wonder if any simple solutions come to mind.

November 19, 2015 at 6:58 pm

Robert

Hard to help you without the entire picture. I am only aware of non-contiguous date issues when using inbuilt time intelligence functions such as TOTALYTD. If this is the issue, then switching to a GFITW solution (search on Rob’s site) should fix the problem. Otherwise feel free to create a sample workbook and post it at https://powerpivotforum.com.au to get some specific help.
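
For readers who have not met it, the GFITW pattern replaces a time intelligence function with CALCULATE plus a FILTER over ALL of the calendar table. A minimal sketch for a last-year measure on month-level data follows. The table and column names ('Calendar'[Year], 'Calendar'[MonthNumber]) and the base measure [Total Hours] are hypothetical and would need to match the actual model.

```dax
-- Hypothetical names throughout; adjust to the real model.
Hours LY :=
CALCULATE (
    [Total Hours],
    FILTER (
        ALL ( 'Calendar' ),    -- remove the current date filters, then rebuild them by hand
        'Calendar'[Year] = MAX ( 'Calendar'[Year] ) - 1
            && 'Calendar'[MonthNumber] = MAX ( 'Calendar'[MonthNumber] )
    )
)
```

Because the filter is built from plain column comparisons rather than a date range, it should not depend on the dates in the filter context being contiguous. As written it is intended for a single month in the filter context; a total across several months would need a different condition.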

January 13, 2016 at 8:51 am

Thanks for sharing the insight, Matt. I am, however, facing a problem while trying to calculate the same store growth rate (SSG), i.e. the % increase (or decrease) of current year sales over last year.

If I use the nested SUMX method, the table is filtered properly, i.e. only those rows are visible where there were sales in the current year as well as the previous year. However, the row total is the sum of the individual monthly SSG values rather than the growth between the like-for-like sales of the two years.

If I use the ADDCOLUMNS and SUMMARIZE method, the row total problem is solved; however, the SSG column also includes rows (i.e. months) that did not have sales (in the current year, the previous year, or both). The calculated value for those rows is -1 (or -100%).

How to solve this issue to show appropriate %age difference between the 2 columns created above (

January 13, 2016 at 5:12 pm

Hi Amit

How about you create a sample workbook showing what is currently happening and what you want at https://powerpivotforum.com.au

January 14, 2016 at 12:56 am

Hi Matt,

Thanks for the suggestion. I have uploaded a sample file at https://powerpivotforum.com.au/viewtopic.php?f=6&t=270

June 4, 2017 at 7:38 pm

Hi
Just came across this article. I haven’t tried it yet, but I think it will fit very well for a particular need I have. Regarding DAX Studio, I tried to install it, but it requires two pieces of software. It offers to download them, but then it fails (probably the links are old). Even if I install the required software (SQL Server 2016 ADOMD.NET and SQL Server 2012 Analysis Management Objects), it still indicates that SQL Server 2012 Analysis Management Objects is missing.
If somebody knows how to get around that, it would be greatly appreciated.

David

June 5, 2017 at 2:52 pm

Have you tried the support at daxstudio.codeplex.com?

June 5, 2017 at 6:24 pm

Hi Matt

Thanks, yes, I reached out to them, and the required software needs to be the 2016 version. I downloaded it manually and was then able to install DAX Studio.

Regards

David

September 3, 2018 at 4:27 am

Hi Matt,

Thanks for this blog, really easy to follow and the exact issue I had last week (if only I’d read this then!!).

I was hoping you could clarify your final comments with respect to the time taken to execute the three options. Two of the options I understand:
1. Nested SUMX – It will first look at 1 row in the dates file and iterate over the stores x 500. Repeat 730 times for each date = 500 x 730.
3. SUMMARIZE – It’s just logical, the table is small, the iteration will be tiny.

But… I’m confused on 2. SUMX with Filter. From what I understand, the filter will be executed first and the SUMX second. In this instance
1. Filter is executed, running over 730 lines to validate the filter criteria. Let’s assume this leaves 200 lines that match.
2. SUMX iterator is applied. I thought that this SUMX would be calculated for the 200 lines remaining after the filter.

So I would expect that the number of iterations for SUMX with Filter would be 730 + (200 x 500) rather than 730 + 500 as you mentioned. Any advice you can offer on where my logic has gone wrong would be very much appreciated!

Thanks
Kirsty
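
The three iteration counts in the question above can be checked with a throwaway query in DAX Studio. This is only the arithmetic as stated in the comment: the 200 matching dates is Kirsty's own assumption, and the column names are made up for illustration.

```dax
EVALUATE
ROW (
    "NestedSUMX", 500 * 730,                     -- 365,000
    "FilterThenSUMX_Kirsty", 730 + 200 * 500,    -- 100,730
    "FilterThenSUMX_Post", 730 + 500             -- 1,230
)
```

These are conceptual row counts only, not measurements of what the engine actually does.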

September 3, 2018 at 5:41 pm

This is a complex topic and I am constantly learning and deepening my own understanding. The inefficiency, as I understand it, is more about the smaller buckets of execution with SUMX vs the single VertiPaq query with SUMMARIZE. While conceptually you can consider iterators like SUMX and FILTER as working row by row, in reality that is not what happens (provided the formula has not been written inefficiently). Generally these iterators will still use VertiPaq engine processes to complete the tasks more efficiently than the conceptual “row by row” evaluation suggests. There is an article here (that I haven’t read yet) that may be helpful/insightful. https://www.sqlbi.com/articles/optimizing-nested-iterators-in-dax/

July 31, 2019 at 10:08 am

Hi,
I would like to get the sum by Attribute Value (the left column). The measure I’ve created doesn’t do it. Where am I going wrong?
# test_sumx =
CALCULATE (
    SUMX (
        GROUPBY (
            QM1_Fact_MeasuresDetails,
            [Misgeret_Id],
            [Date_ID_Date_Of_Stay],
            "MaxCap", MAXX ( CURRENTGROUP (), [Misgeret_Max_Capacity] )
        ),
        [MaxCap]
    ),
    VALUES ( QM1_Fact_MeasuresDetails_UPVT[Value] )
)
There is a tablix that looks like this:
Value  Value   # test_sumx
A      6100    95
B      161005  756
C      161008  4428
C      161009  3600
D      206100  684

And I would like to achieve:
Value  Value   # test_sumx
A      6100    95
B      161005  756
C      161008  8028
C      161009  8028
D      206100  684

QM1_Fact_MeasuresDetails_UPVT and QM1_Fact_MeasuresDetails_UPVT_Series are duplicate tables with bidirectional relationships to QM1_Group_Ids and to QM1_Fact_MeasuresDetails, because of many-to-many relationships.

If there is a way to attach screenshots, that would make this easier.

P.S.: The following returns the correct answer, but its response time is very slow, and it is only workable on a filtered population.
# test_sumx2 =
CALCULATE (
    SUMX (
        SUMMARIZE (
            QM1_Fact_MeasuresDetails,
            [Misgeret_Id],
            [Date_ID_Date_Of_Stay],
            "MaxCap", MAX ( QM1_Fact_MeasuresDetails[Misgeret_Max_Capacity] )
        ),
        [MaxCap]
    ),
    VALUES ( QM1_Fact_MeasuresDetails_UPVT[Value] ),
    CALCULATETABLE ( ALLSELECTED ( QM1_Fact_MeasuresDetails_UPVT_Series ) )
)
Please help me!!!


