Use Time intelligence functions to do a running sum of the last 6 months with PowerPivot

July 6, 2010

By Kasper de Jonge

A while ago I did a PoC using PowerPivot, both to show Self Service BI with PowerPivot for SharePoint but also as a datasource for SSRS reports. I used DAX to solve all difficult request, mostly making use of the Time Intelligence functions.

One of the requests was if we could show a running sum of the last 6 months. I had not done that before, so today we take a look at how to do this with DAX. It appeared to be rather easy :)

Again we use the almighty Calculate function to change the context of the row we are in. We want to do a sum of all the rows of the last 6 months of data. We use the DATESINPERIOD function to get the last 6 months of dates.

The function DATESINPERIOD has 4 parameters, first is the column containing the datarange, second is the start date. We can get it by getting the last date in context, we use LASTDATE to get this. As last two parameter we can give intervals to subtract or add from the start date. In our scenario we want to subtract 6 months from the last date.

This gives the following formule:

=IF(ISBLANK(sum(FactInventory[DaysInStock])),BLANK(),
CALCULATE(sum(FactInventory[DaysInStock]),
DATESINPERIOD(DimDate[DateKey],
LASTDATE(DimDate[Datekey]),-6,MONTH)))

Again not too hard :)

I put in a ISBLANK to check if we have values for the current month, we are not interested in the future :)

This gives us the following result:

I decided to check out the DATESBETWEEN as well, the DAX function below gives the same result:

=IF(ISBLANK(sum(FactInventory[DaysInStock])),BLANK(),
CALCULATE(sum(FactInventory[DaysInStock]),
DATESBETWEEN(DimDate[Datekey],
FIRSTDATE(DATEADD(DimDate[Datekey],-5,MONTH)),
LASTDATE(DimDate[Datekey]))))

The DATESBETWEEN function can be used to return a table of dates between a start and a end date.
As end date we need to get the last date that is available in the current context. We can determine this by doing LASTDATE(DimDate[Datekey]). The first date is a little more confusing. We need to get the date 6 months ago, of course we can use the DATEADD function. This will return us a a set of dates of the current context, in our sample we get a set of all dates in the month. Using firstdate we get the first date of this set to start our datesbetween function.


Use slicer values in a calculation with PowerPivot DAX

June 17, 2010

By Kasper de Jonge

A while ago i had a question on my ask a question page, Sasha wanted to do the following:

How to define the date I want to see as “open items”.

E.g. Open Items by 25.05.2010
Item 1: Posting Date: 20.05.2010 Clearingdate: 28.05.2010 Value 100 EUR
Item 2: Posting Date: 22.05.2010 Clearingdate: 27.05.2010 Value 200 EUR
Item 3: Posting Date: 23.05.2010 Clearingdate: 24.05.2010 Value 300 EUR

Result will be 300 EUR (Item 1 + 2) Postingdate 25.05.2010 OR Clearingdate = 00.00.0000).

But how to select the 25.05.2010 for this calculation. I have not a date like 25.05.2010 in my Pivottable?

We have two challenges here:

  1. Create a between like measure using dax
  2. Get values from a slicer to be used inside this calculation, the data from this slicer mustn’t effect the data inside the pivottable

I have created the following simplified scenario to recreate sasha’s question. I started with a dataset:

I want to get the sum of amount where my slicer is between values a and b. The first thing i want to solve is how to get a value from a slicer inside my calculation.

To do this i created a new dataset with the values i wanted to use in my slicer, since i might want to use a value that is not inside my powerpivot data. I loaded this into PowerPivot:

This made sure of two things: i have the data i want inside a slicer and because i didn’t create a relationship between this data and the fact table nothing will happen when i select data from the slicer.

Because the data from the slicer isn’t connected to my main fact table doesn’t mean we cannot get data from it. The slicer will make sure the table i just loaded will be “sliced” to the value we want.

When we create the pivottable we seen the following:

Because we didn’t create a relationship PowerPivot keeps reminding us a relationship might be needed, because this is pretty annoying we can turn this off by clicking on the Detection button:

So we are good to go, to get the value of the slicer we can now do sum(aantal[aantal]) in our measure. A measure that uses the value from the slicer would be:

Slicer measure=if(COUNTROWS(aantal) = 1, sum(aantal[aantal]), BLANK())

This will check if we have only one value selected, and if that is the case return the sum of the column. This results in the single value because the table has one row (it is sliced to one row). Otherwise return BLANK(). This will look like:

We now have the value of the slicer in our pivottable and we can use it to create a measure that will give us the sum of amount where the value of our slicer is between a and b.

What we are going to do use sumx to sum Table1[amount] over a filtered table, this table will filter the values where the value of slicer aantal between column a and b. This will look like:

=IF(COUNTROWS(aantal) = 1 ,
			SUMX(FILTER(Table1,
				Table1[a] <= sum(aantal[aantal])
				&& sum(aantal[aantal]) <= Table1[b]
			)
			,Table1[amount])
	, BLANK())

step by step:

  • Check if we have selected only one value in our slicer, if more return blank
  • Do a sum of Table1[amount] over a filtered table, more on sumx at this blog post of Rob Collie.
  • Use filter to return a dataset of specific values from a table, in this case return all rows where value of column a <= value of the slicer and value of the slicer <= value of column b

This will look like this in our Pivottable, as we can see we only have the rows where our slicer measure is between a and b:

When we remove a and b from the pivottable we see the total sum:

This again shows you the amazing power of DAX, a lot is possible. But a word of caution is at his place, SUMX and FILTER are two of the most CPU consuming functions in PowerPivot, it creates a new dynamic in memory table for every cell in the pivot where FILTER is used. I tried a similar function at a Pivottable based on Contoso and it took me a lot of CPU. Check out this blog post of Rob Collie on the use of filter. SUMX and FILTER are amazing giving you all kinds of possibility’s but you need to think where to use what as you can read in Rob’s blog post.


PowerPivot time intelligent functions: why use ALL() and how to work around it

June 3, 2010

Posted by: Kasper de Jonge

Last week I got a most excellent question from Sasha at my question page. Using my timeintelligent function screencast Sasha created a workbook using  YTD  where he used one table with facts and dates in it. The problem he had was when created his time intelligent function he wasn’t able to use data from other columns than the date columns. In this blog post i´ll try to describe how the time intelligent functions work, what pitfalls are and how to solve them.

To be able to use a time intelligent function in DAX you use the Calculate function to group measures by a filter. With a time intelligent function you want to filter your values over a period of time (like YTD or previous month).  Most of the time you want the use these functions inside a pivottable where you use dates on the x or y-axis, the values inside the pivottable would show values per the current period context. This would logically result in the following DAX formula:

Dates YTD = CALCULATE(sum(Tablix1[nroforders]),DATESYTD(Tablix1[Date]))

This writes out to: you want the sum of Tablix1[nroforders] from the first Tablix1[Date] value of the year to the Tablix1[Date] belonging to the Tablix1[nroforders] in pivottable context. The YTD of the Tablix1[nroforders] from march 2009 would mean we need to take the sum of Tablix1[nroforders] from all rows from the start of year to march 2009, in the image below you can see a sample of the values that will be summed:

But when we add the formula to the measures and use it in a pivottable we see something strange:

As you can see the nroforders and the YTD formula result in the same values … this is not what we expected. The reason is because the time intelligent function requires an additional parameter, in a blog post at PowerPivotpro.com the PowerPivot product team gave the following answer to my question to why the result is not as expected:

When using Time intelligence functions like LASTDATE or DATESMTD, it is necessary to add another argument which is ALL (TimeTable). Without this only the selected dates are considered, and you can’t find the last month unless you are in the last month.

As you can see in the screenshot this is indeed what happens, the function only uses only the current date context in the sum.

Ok so we need to use ALL to get the results we want, this will result in the following syntax:

DatesYTD w All =CALCULATE(sum(Tablix1[nroforders]),DATESYTD(Tablix1[Date]),all(Tablix1))

This is indeed the result we expected, but having to use the ALL() function has a huge downside.  The ALL() function according to BOL:

Returns all the rows in a table, or all the values in a column, ignoring any filters that might have been applied. This function is useful for clearing filters and creating calculations on all the rows in a table.

This means when you use a filter or slicer this is being ignored by the time intelligent function that uses ALL(). As you can see below the results of the YTD are the same as unsliced:

There are two methods we can use to work around this problem:

  1. When you know what slicers and filters you want to work with you can use an ALL() alternative: ALLEXCEPT(). With ALLEXCEPT you can pass through “a list of columns for which context filters must be preserved”.
    In our case we would like to be able to slice on country. This would look like:
    DatesYTD w AllExcept =CALCULATE(sum(Tablix1[nroforders]),DATESYTD(Tablix1[Date]),allexcept(Tablix1,Tablix1[country]))
    I would say use this option if you already know what you want to slice on and don’t have much time to solve it properly.
  2. The proper and most flexible is option number 2. To be able to slice / filter on all the columns you can think of you should create a separate time table. This isn’t very user friendly and your end users will have a hard time grasping this. A few options i can think of to create this time table:
    1. Import the fact table with distinct on date columns (i hope your table isn’t too big ..)
    2. use excel to copy the date rows, remove duplicates and create a linked table (new dates won’t be added)
    3. Supply your users with a default time table in SharePoint/SSRS, they can import this by using the data feed option, use your DWH datetime table as source.

you should create a relationship on the datetime column between the fact table and the the imported time table (make sure your datetime columns have identical granularity, like year, month, day, otherwise the join wont find results). With this relationship in place you now are able to use ALL over the datetime table. When you use ALL() over the separate time table it no longer ignores filters / slicers over your  fact table.
The function would look like:
DatesYTD w All TT =CALCULATE(sum(Tablix1[nroforders]),DATESYTD(DateTable[Date]),all(DateTable))
You now are able to filter or slice all the columns from the fact table you want. In the RTM version of PowerPivot the ALL() in the DAX function is no longer required, the separate time table unfortunately still is.

The final workbook now looks like:

My conclusion is that Time intelligent functions still are a very powerful feature but i am really disappointed on its user-friendliness, while these workarounds are easy for IT/BI personnel to grasp and implement, end users will have a harder time implementing this. Maybe MS can implement a “add date time table” button in PowerPivot to automatically create a date time table to our PowerPivot tables to make it a little easier for end users to implement time intelligent functions.


Building a cash flow statement in PowerPivot using dynamic measures in DAX

June 1, 2010

Posted by Kasper de Jonge

I got an excelent question at my “Ask a question page” on my blog. Greg asked me if i knew of a sample for a cash flow statement in PowerPivot. I did not know of one, and to be honest i didn’t even know what a cash flow statement was :).

Searching for a sample i found this picture of a cash flow statement:

This made things more clear. We want to see income and expense in two different tables. And then in a new table the cash flow statement where we can see the starting cash at the start of a month, income and expense in the month and the ending cash with the values at the end of the month.

I got really excited to solve this using PowerPivot, so I decided to build a sample myself. First i had to create a fact table that contains the values of the income and expense, i used the picture from above to create sample data:

As you can see we have income and expense for days in three months. To determine if a specific activity is income or expense, i created a secondary table that groups my activities:

We can create a relationship between the two tables so we can put them in a table and calculate a measure with income – expense.

Next i created a separate related time table to use in time intel. functions:

We create relationships between the date from the facttable and the new date table

Now we can create the first table putting income and expense in one table per month using only the relationships between the tables and no DAX:

Now for the interesting part, how do we create the cash flow table.

What we really need is a way to create 4 different calculations per row for each month on column, as we can see in our sample we need to have the following measures:

  1. Starting cash= the total ytd of (sum of disbursements – sum of receipts) until the previous month
  2. Receipts = sum of Receipts at the current month
  3. Disbursements = sum of Disbursements at the current month
  4. Ending cash = the total ytd of (sum of disbursements – sum of receipts) until the current month

These measure will translate to dax as the following:

  1. TOTALYTD(Activities[Sum of Value](‘Group’[group] = “Receipts”) – Activities[Sum of Value](‘Group’[group] = “Disbursements”),DATEADD(‘Date’[Date],-1,MONTH))
    subtract values from group disbursements from values of the group receipts for the totalytd until the previous month
  2. Activities[Sum of Value](‘Group’[group] = “Receipts”)
    values from group  receipts in the current month (current column context)
  3. Activities[Sum of Value](‘Group’[group] = “Disbursements”)
    values from group disbursements in the current month (current column context)
  4. TOTALYTD(Activities[Sum of Value](‘Group’[group] = “Receipts”) – Activities[Sum of Value](‘Group’[group] = “Disbursements”),’Date’[Date])
    subtract values from group disbursements from group  receipts for the totalytd until the end of the current month

But how can we do this in PowerPivot ? We need to determine what to calculate per row .. to do this we can use dynamic measures in DAX, as in this excellent post from MSFT Howie Dickerman.

First we need to be able to put values on the rows on which we can base our measures, to do this i created a new table in excel with rows and loaded it into PowerPivot:

We now can put these row labels against months in a new pivottable:

Now we can create a measure which uses the current row context to determine what to calculate at the measure. When we use the VALUES function we can see what the current row context for our Cashflow[CashFlow]  is and use this in an IF  statement in the measure to calculate a different measure for each row.

The formula will look like:

= IF(COUNTROWS(VALUES( Cashflow[CashFlow])) =1,
IF( VALUES(Cashflow[CashFlow]) = "1 Starting Cash",  TOTALYTD(Activities[Sum of Value]('Group'[group] = "Receipts") - Activities[Sum of Value]('Group'[group] = "Disbursements"),DATEADD('Date'[Date],-1,MONTH)) ,
IF( VALUES(Cashflow[CashFlow]) = "2 Receipts",  Activities[Sum of Value]('Group'[group] = "Receipts"),
IF( VALUES(Cashflow[CashFlow]) = "3 Disbursements",  Activities[Sum of Value]('Group'[group] = "Disbursements"),
IF( VALUES(Cashflow[CashFlow]) = "4 Ending Cash", TOTALYTD(Activities[Sum of Value]('Group'[group] = "Receipts") - Activities[Sum of Value]('Group'[group] = "Disbursements"),'Date'[Date]) ,
BLANK())))), Activities[Sum of Value])

This will create one measure that depending on the row context shows different calculations.
Resulting the actual cashflow pivottable:
You can see that the values are calculated per cashflow type per month showing a different calculation. I hope this was what Greg meant :)
This example shows again the great power of DAX, a lot of things are possible. I still get really excited about PowerPivot.
The sample file can be downloaded from my skydrive.

New Docs from Redmond!

April 27, 2010

Dax WhitepaperCouple quick things today.

First, Howie Dickerman, one of the SuperHeroes of DAX, has written a new version of the DAX whitepaper.

I had a chance to discuss some of it in person with Howie when I was in Redmond.  It opened my eyes to a number of things I had been missing.  I sadly have not had time to review it in detail yet – there are other things (good things) afoot at PivotStream these days…  like, um, I dunno, our first industrial-strength PowerPivot farm in the sky…  but I plan to get back to DAX shortly.

In the meantime, I have posted the updated whitepaper in the Samples Gallery out on the FAQ Site.

PowerPivot Architecture Poster

 
Also, Denny Lee shared an excellent new poster that displays all the components of the PowerPivot architecture.  I’ve also uploaded that to the Samples Gallery

Note that the poster is HUGE, meant to truly be a poster, so it’s not really something you view in a web page.  Hence the PDF format.  It’s also available in other formats.

 


Six Months With PowerPivot, part one

April 23, 2010

simply_red1

   
“If you don’t know me by now…  you will never never never know me…”

In honor of PowerPivot v1’s public release, I thought I’d offer up my thoughts on my last six months working with the product. 

Remember, I physically left Redmond in August and then officially left Microsoft in February, but have been putting PowerPivot through its paces that whole time, with a wide variety of data sets and business scenarios.

So in some sense, you can’t really get a closer view of PowerPivot than I have, while still maintaining any sense of neutrality about the product.  So here it is, the honest truth…

It passed the Great Football Project Challenge

When I started the Great Football Project back in October, I really did not know what to expect.  Honestly, I wasn’t sure that it was a great idea – was I going to get 5 days in and discover that PowerPivot simply wasn’t suited to that problem?  There was genuine potential for embarrassment.  But I needed something to blog about, and I was anxious to get started, so I just dove in.

After a few weeks, I was still holding my breath a bit.  I was past the basics but hadn’t really pushed the envelope at all.

At some point though I just stopped wondering.  It wasn’t until recently that I looked back and realized that I was building features, like the Rank measure, that were never in the original project at all. 

I can say now with confidence that PowerPivot can absolutely handle the business logic phase of the football project, which the consultant back in 2006 described as one of the most complex cubes he’d ever seen.

I call that a success.  Note how I specified the “business logic phase?”  That brings me to the next topic…

It is NOT a data cleaner/shaper

OK, this is perhaps obvious since the PowerPivot tables cannot be edited, and there are no features/functions that let you reshape rows.  Yes, you can add calc columns and that often does the trick, but there are other cases (like for instance, removing dupe rows) that PowerPivot can’t do at all.

The whole “shape it in normal Excel and then paste it over to PowerPivot window” workaround is ok for one-off work, or for tables that you don’t expect to change very often.  But like many of you, I am striving to gain the benefits of automatic nightly data refresh that the server provides, and the Excel workaround doesn’t translate to that system at all.

So, even more than ever, you need a clean and properly-shaped data source to start with.  So the first half of the work that the consultant did for the football project in 2006, where the text files were imported into SQL and turned into a decent schema, is still very much required.

For a production system, I don’t think this is a bad thing at all.  It forces you to cooperate with IT (or whoever owns your databases) to give you what you need.  And it forces them to listen to you more clearly, as long as they care about “taming the Excel beast” that is always their favorite thing to complain about :)

For a production system, that cooperation is essential for robust results.  And if it’s not a production system, then yeah, the Excel shaping workaround is great.

“No, it can’t do that.  Oh, wait. Nevermind.  Yeah, it CAN do that.”

Having worked on the product from the beginning, I’m more closely familiar with its limitations than most people.  In some ways that’s an advantage of course – I don’t dive into projects only to find out later that it can’t be done.

Oddly though, so far, knowing the limitations has largely just been a hindrance.  Every time my initial answer was “no, not in PowerPivot v1,” I’ve turned out to be wrong.

I have succeeded using PowerPivot to solve every single real-life problem I’ve encountered in my first six months of using it.

Market basket analysis?  Ranking measures?  Standard deviation?  Many to many relationship problems?  Godawful horrible data sources?  Measures that calc according to different formulas at different levels of the pivot?  Iterating over variables that aren’t even in the view?  PowerPivot has defeated them all.  Well, more accurately, I have defeated them all with PowerPivot.  It’s not like I sit back and watch PowerPivot do its thing.  It is not always easy.  Which brings me to the next point:

Challenging and Rewarding

You know those rare occasions where you suddenly find yourself in the fast lane?  When your brain is forced to expand?  When you are truly challenged, in a good way?

I’m talking about a specific kind of challenge, the good kind.  Not the kind like when you take a new job and are overwhelmed by all the new rules and bureaucracy you have to learn.  And not even the kind when you’re learning most new technologies (HTML and XML come to mind).

The best examples of this “positive challenge” vibe from my life are the ones where I’ve been pushed by a mentor.  A couple of teachers come to mind.  Some specific coworkers as well – Zeke Koch, David Gainer, and Amir Netz most prominently.  Being around those guys everyday basically was a wakeup call – “Rob, you’ve been asleep.  Wake up, it’s time to grow, to be excellent.”

PowerPivot, oddly, has felt like that.  My brain has been expanding again, after a period of stagnation.  More specifically, PowerPivot combined with the problems I’ve been tackling has done this.

And it flows over to other areas too.  Example:  years ago when I needed to estimate the incoming query load for the football stats project, I asked around for advice, got none, and then basically just guessed.  It was a very hollow experience.  Non-excellent.

Then recently, I was presented with essentially the same challenge.  But this time, I didn’t guess, I modeled it:  estimated how many reports (and queries per report) each user would exercise at peak, built a spreadsheet, re-taught myself the Poisson and Binomial functions, and voila – a “users per server” estimate I could believe. 

I’m positive that working with PowerPivot is the difference between the “lazy guess” mentality from before and the “it can be done” mentality today.  I love it :)

Carrot, not stick (but sometimes the carrot is too big for one sitting)

I don’t want you to interpret the above as “PowerPivot is a harsh Pai Mei figure, kicking Rob’s ass day in, day out.”

Nope, I worked with PowerPivot for months, and did things I thought were pretty damn cool, without ever stretching my brain the way I have been lately.  If you’re an Excel pro, you will find PowerPivot to be a very welcoming environment.

Excel veterans:  you will never be forced to do anything uncomfortable with PowerPivot.

In fact you will do amazing things in your first week that will actually deliver greater results than standalone Excel, and it will actually seem easier than normal Excel.  Check out the CALCULATE function and you will see what I mean.

But boy, sooner or later, it will TEMPT you to try something bigger.

You’ll be sitting there one day thinking, “Gee, I sure wish I could build an analysis that showed X.”  And then something hits you – “hey wait, I bet THAT’s what that function I looked at the other day does” or “Didn’t I see a blog post last month about something similar?”

Two hours later, you’re still heads-down over your DAX formulas and relationships, feeling like you’re 30 seconds away from a breakthrough that will change your professional life.

You might not even succeed that first day.  You may have to come back tomorrow with a fresh perspective and a clear head.

And you love it.  Every minute of it.

But that’s when you realize that you have left the reservation.  You are not in Kansas anymore.  Time to take off the training wheels.  Pick your analogy.  Make no mistake – the power of DAX in particular can challenge you immensely.  Jon Udell described the examples I’ve covered here on the blog as “magic,” and he’s right – you won’t pick up everything overnight.  You should be prepared for that.

It’s called learning.  And you’ve almost forgotten what that feels like.  It’s what drew you to Excel in the first place, years ago – that fleeting glimpse of capabilities and the results it could deliver, but also frankly because you were hooked, addicted, to mastering it.

That feeling is back :)


Quick Tip: Don’t Over-Use FILTER()

April 7, 2010

The other day I was working on an updated version of my Team Playcalling report, which, by the way, look FABULOUS with Slicers:

NFL PlayCalling in PowerPivot

Only problem with it was that it was taking 90 seconds to refresh.  Unacceptable.

Howie Dickerman and Marius Dumitru, two of the Superheroes of DAX at Microsoft, pointed out that my measures were using the FILTER function in places that were not necessary.

For instance:

  [New Measure] = [Original Measure] ( 
                      FILTER(DataTable[PlayType]=”Pass”)
                                              )

Can be rewritten without the FILTER function:

  [New Measure] = [Original Measure] ( 
                      DataTable[PlayType]=”Pass”
                                              )

When I switched over to using those expressions directly, rather than the FILTER function, my 90 second refresh time dropped to about 3 seconds.

That’s a pretty significant boost in DAX measure performance from a very simple change.

Why so much faster?

The explanation from Howie made a lot of sense.  When the PowerPivot engine is evaluating a measure, it already has to take filter context from the pivot itself – row fields, slicer selections, etc.

And since that’s pretty much the #1 use case for the engine, well, applying filter context to a measure is highly optimized and fast.

So it’s not a big deal for the engine to inject another filter like I have specified in the rewritten example – it gets treated much the same as if that filter came from the pivot, as if a slicer had been set to PlayType=”Pass”

But the FILTER function, on the other hand…  well, it creates a brand-new table in memory.  I know my example just filters by one column, but the FILTER function can do some pretty amazing things, dynamically responding to current context.

Because of that power, the FILTER function must either create or update that dynamic table for every cell in the pivot where FILTER is used.  I have 256 cells in this particular pivot (32 teams times 8 measures), and originally all of them used FILTER.  But given that some of my measures are based on other measures that aren’t displayed in the pivot, my actual cell count was even higher.

Now imagine what would happen in a pivot with 5,000 rows :)


DAX.xml file from Colin

April 5, 2010

Rather than do the email dance forever, Colin has provided a copy of his DAX.xml file (for Notepad++ autocomplete support, as described in the previous post) in the PowerPivotFAQ samples library:

http://powerpivotfaq.com/PowerPivot%20Samples/Forms/AllItems.aspx


Notepad++ and DAX – Guest Post from Colin Banfield

April 5, 2010

 
Force 10 From Navarone

“Excuse me, sir, you don’t need 20 men, you need Miller Banfield. 
He’s an expert with explosives Excel, sir. He’s probably the best in England.”

 

 

That’s right folks, after prowling the comment threads for months now as one of the most insightful contributors, Colin Banfield of BIExcel.com is taking the plunge and sharing a full guest post with us.  This is indeed a very positive development for us all :)

Today, Colin is giving us more info on using Notepad++ to write PowerPivot formulas.  Take it away Colin :)

Using Notepad++ as an editor for creating, reading, and troubleshooting DAX formulas

Rob recently wrote an article about the difficulty he experiences editing complex formulas in PowerPivot, even though he helped to create editor used in the Measure Settings dialog box. You can read Rob’s article here http://powerpivotpro.com/2010/03/27/two-observations/. In the article, Rob noted the following as shortcomings of PowerPivot’s built-in editor:

· Can’t indent formulas
· Can’t save formulas
· No find and replace option
· Can’t change the font size

To address these issues, Rob has been using Notepad as his PowerPivot formula editor. In the above-mentioned article, Rob solicited opinions on alternative text editors. Some folks, including me, suggested Notepad++. Notepad++ is a free text editor that might be viewed as Notepad on steroids. However, the point is not that Notepad++ has more features than Notepad – it’s that you can customize the program to work specifically with DAX formulas. The purpose of this article is to explain how the customization is done, and the many benefits you gain from the effort.

You can download the latest version of Notepad++ from here http://notepad-plus.sourceforge.net/uk/site.htm. During the installation, I strongly suggest that you don’t check the option to bypass using the %APPDATA% folder for storing Notepad++ settings (the option is unchecked by default is used mainly for portable installations). If you have already installed the program and checked the option during installation, I recommend that you completely uninstall the program (including everything under the Notepad++ installation folder) and start over.

Figure 1 shows the Notepad++ main window. The file manager window on the left is a plug-in that you can install through the Plugins menu. The plug-in is called Explorer. The built-in file manager is useful when you want to quickly retrieve saved formula files.

clip_image002

Figure 1 – Notepad++ main window

The first step in customizing Notepad++ is to create a custom language for DAX. The purpose of a custom language is to provide special syntax highlighting for various elements of the language. Choose ViewàUser -Defined Dialogue…The User-Dialog box appears as shown in figure 2.

clip_image004

Figure 2 – User-Defined dialog box

Click Save As… and enter “DAX” (without the quotes) in the Name text box. After you click OK, you will notice that extension (Ext) box appears at the top right of the User-Defined dialog box. Type an extension in this box (txt, dax, foobar, whatever), and save your formula files with this extension. This ensures that when you subsequently open the file, Notepad++ applies the custom language for editing the file. You can associate more than one extension with a custom language by separating the extension names with a space in the Ext box.

After you enter an extension, you can explore the various tabs in the dialog box. My personal setup is as follows (it has changed over time):

· Folder and Default tab – As shown in figure 3.

clip_image006

Figure 3 – Folder& Default tab

The folder open and close keywords are used expand/collapse a block of code. This might be useful for delineating parts of a DAX formula that can be used as intermediate measures in a PivotTable (see figure 7).

· Keyword Lists tab – See figure 4.

clip_image008

Figure 4 – Keyword Lists tab

In the “1st Group” list box, I pasted an alphabetized list of DAX functions.

· Comment and Number tab – See figure 5.

clip_image010

Figure 5 – Comment & Number tab

· Operators tab – See figure 6.

clip_image012

Figure 6 – Operators tab

Naturally, each user will personalize these dialogs to suit his or her taste. Notepad++ saves the configuration after you exit the dialog box. To use your custom language, you must choose it from the Language menu. Custom languages appear in a separate section at the bottom of the menu. Currently (version 5.6.8 at the time of this writing), there is no way to make your custom language the default language. You must select the custom language every time you open Notepad++ and every time you open a new file. This is a very annoying and stupid limitation. If you don’t assign an associated file extension for the custom language, the syntax highlighting for the default built-in language will be applied after you open the file. However, there are many user requests to make a custom language the default language, so it’s possible that the feature will appear in a future version of the product (hopefully, very soon).

Figure 7 shows an example of using your new DAX language to edit a DAX formula. Notice the vast improvements in readability over editing the formula in the PowerPivot Measure Settings dialog box, or in Notepad. The bottom left of the windows displays the language in use.

clip_image014

Figure 7 – Example DAX formula with syntax highlighting, bookmarks (blue blobs) & folding blocks

Pretty cool, no? However, we’re just getting started. The next step in customizing the language is adding function auto-complete, function parameter tooltips and function description tooltips. You heard that right. It takes some time to create the text for the auto-complete XML file, which must have the same name as the language (is this case the name of the auto-complete file would be DAX.xml). You must place DAX.xml in the Notepad++\plugins\APIs folder, and the turn on the appropriate Auto-Completion options in the Preferences dialog box. These options are shown in figure 8.

clip_image016

Figure 8 – Auto-Completion options

I’ve created a DAX.xml file, and anyone interested in using it can download it here:
http://powerpivotfaq.com/PowerPivot%20Samples/Forms/AllItems.aspx

The next couple of figures illustrate how function auto-complete and tooltips work.

clip_image018

Figure 9 – Function auto-complete

clip_image020

Figure 10 – Function parameter and description tooltip

The only functionality you lose from PowerPivot’s built-in editor is the very useful table and field names auto-complete. You can use the PowerPivot editor to auto-complete table and field names that you paste into Notepad++. In many cases, like that shown in figure 7, you end up reusing the same fields in different parts of the formula – which is then a simple cut and paste operation.

I encourage you to explore the many other features of Notepad++ not covered in this article. Other features I find useful include:

· FileàLoad Session & File Save Session. Because Notepad++ uses a multiple document interface (each document appears in a separate tab), you can save all the formula files associated with a project as a session.

· Bookmarks. You can toggle a bookmark on and off in several ways, for example: 1) click the right-side of the line number in the margin, 2) select a line and enter Ctrl+F2. You can also bookmark all lines containing specific text. Bookmarks are useful if you want to include folding keywords (as shown in figure 7), or comments inside the DAX formula. After you bookmark the folding keyword lines and comment lines, you can delete these lines by choosing SearchàDelete Bookmarked Lines. After you delete the bookmarked lines, you can paste the formula into PowerPivot.

As a final note, I’d like to say that Notepad++ has helped tremendously in managing my calculated measures. With complex formulas, I tend to create many intermediate formulas so that I can see the calculated results in the PivotTable. I don’t want to expose these intermediate calculations to the end user, so before deploying the PowerPivot model, I have to amalgamate the intermediate formulas into a single formula. It’s very unfortunate that PowerPivot doesn’t allow you to hide calculated measures.


RANK measure finished

March 21, 2010

 
Angel Eyes Always Sees the Job Through

 

“…I always see the job through.”

-Angel Eyes

Yep, I said I would return and finish off the Rank measure.  Time to deliver.

 
(And this makes three straight Spaghetti Western movie quotes!)

Where we left off – Unknown ranks #1

Last time, I’d written a mostly-working version of the measure that ranks a player by his Rushing Yards value, and that doesn’t return an error when I leave the Unknown player in the pivot:

DAX Rank Measure that Does Not Error 
But the Unknown/Blank player is still messing up my rank because of all the small-time players it lumps together into a huge number of rushing yards – I want Edgerrin James to be first.

Must remove the Unknown player from rank consideration

This turns out to be pretty tricky.  Here’s the formula again:

   COUNTROWS(
     FILTER(
       ALL(Players[FullName]),
       [RushYards](Values(Players[FullName])) < [RushYards]
     )
   ) + 1

As explained last time, that’s counting the number of players that have more rush yards than the current player in the pivot, and then adding one.

The heart of that formula is this comparison test – players only get counted if this evals to true:

   [RushYards](Values(Players[FullName])) < [RushYards]

I want to add a condition to that.  It’s not enough for a player to have more rush yards.  They also must NOT be Unknown/Blank.  So I want it to be something like:

   [RushYards](Values(Players[FullName])) < [RushYards]
      && PlayerIsNonBlank

Filling in that green part with a real formula is the trick.

Why not just remove the + 1 and be done with it?

Indeed, I could remove the “+ 1” at the end of the formula and voila! – all is right with the rankings.

But that’s only true right now, without the pivot filtered in any way.  There will definitely be cases where, as I slice the pivot down to narrow cases, no Unknown player recorded any Rush Yards.  And suddenly, the top real player would be ranked as –1.

“PlayerIsNonBlank” – the DAX solution

First I had to define a new measure, [NonBlank Player Count].  It is defined as follows:

   COUNTROWS(DISTINCT(CleanPlayers[FullName]))

It turns out that DISTINCT does not return Unknown members.  DISTINCT and VALUES are exactly the same function, except that VALUES includes Unknown and DISTINCT does not.  (Colin, sorry I never replied to your comment awhile back, but yes, that is the difference).

So that’s pretty handy.  When I am in the context of a real player, this measure returns 1.  And so I can use that measure inside my Rank measure, to weed out the Unknown player from rank consideration.  Replacing the green pseudoformula from above my condition is now:

   [RushYards](Values(Players[FullName])) < [RushYards]
      && [NonBlank Player Count] = 1

Which makes my overall measure:

   COUNTROWS(
     FILTER(
       ALL(Players[FullName]),
       [RushYards](Values(Players[FullName])) < [RushYards]
         && [NonBlank Player Count] = 1
     )
   ) + 1

Does it work?

Sort of.  Here ya go:

DAX Rank Measure - Fixed Numbers But Unknown Still Shows Up

So… the numbers are right!  Edgerrin James is now #1 as he should be, and everyone follows after.

But that pesky Unknown player.  It won’t go away…  it’s like Rasputin or something.

It’s ugly, and I want my reports to be pretty.

Furthermore, look at the Grand Total:

DAX Rank Measure - Grand Total Shows 1

Again, ugly.  The grand total cell should show nothing.

The fix:  making the measure return BLANK()

To solve this, I took the entire measure above and wrapped it in two nested IF’s:

=IF(COUNTROWS(VALUES(CleanPlayers[FullName]))=1,
   IF(NOT(ISBLANK(Values(CleanPlayers[FullName]))),
      <The entire measure formula from above>,
      BLANK()
   ),
   BLANK()
)

What that means is this: 

  1. If the current pivot context corresponds to exactly one player, AND that player is NOT a Blank player, then go ahead and evaluate the measure expression. 
  2. If the current pivot context corresponds to multiple players (as what happens in a Grand Total, or a Subtotal), return BLANK()
  3. If the current pivot context corresponds to a Blank player, again return BLANK()

This works great!  Check out the results at the top and bottom of the pivot:

DAX Rank Measure - BLANK for Unknown Player Means It No Longer Shows Up DAX Rank Measure - BLANK for Totals Too

When a measure returns BLANK(), that row disappears from the pivot, even the Grand Total.  Yay!

And if I add another measure that is non-blank for those rows, the Blank player comes back, but my Rank measure will display a blank cell rather than a number.

Are we done?  Depends on the intent

I said this was tricky remember?  Look what happens when I add another field to rows, like College Attended:

DAX Rank Measure - Rank Evaluates By Parent Not Overall

Neat!  Now my Rank measure treats each parent value (each College in this case) as a separate world in which to rank players.

Shaun Alexander is indeed the #1 NFL rusher from Alabama in my database.  Perhaps that’s exactly the behavior I want, and in some cases it will be.

At some point I will explore how to NOT do this – how to still display Shaun’s overall rank while nesting him under his college name.  But for now, this is enough to digest I think.