skip to Main Content

Another one from the Mr. XL Forum!

…and the word “forum” gives me an excuse for a blast from the past…

PowerPivot Court

“What you are about to witness is real. The participants are not actors.  They have agreed to dismiss their tech support cases and have their questions settled here, in our forum:
THE POWERPIVOT COURT

(Click here for theme music!)

The much-neglected calculated column

OK, neglected by me, not by everyone else.  At Pivotstream I have the benefit of a great SQL team.  Generally speaking, when I need a calculated column, I ask them to provide it.  The benefits of doing that are documented several places on this site, including here and continued here.

But hey, not everyone has a database rapid-response team at their disposal Sad smile

So questions like this one come up a lot on the forums, reminding me of my blind spot:

You have two tables of data.  In one, you have a list of your clients and their effective discount rate during particular date ranges – marked by a start and an end date.  If the End date is blank, that discount rate is assumed to still be active.

image

Rates Table:  Discount Rate per Client,
Over Distinct Date Ranges (Start and End Effective Date)

Then there’s a second table, listing Clients and the days that you called each of them:

Multiple Clients, and Multiple Calls to Each Client in PowerPivot

Calls Table:  Multiple Clients, and Multiple Calls to Each Client

Desired Result

You want to add a calculated column to that Calls table which contains that client’s effective discount rate on the date of the call:

Calc Column in PowerPivot Looking Up a Value Based on Date Ranges in Another Table

We Want to Add the Highlighted Calc Column – What’s the Formula?

So… what’s the formula?

=CALCULATE(AVERAGE(Rates[Rate]),
   FILTER(Rates,
          Rates[Start]<=Calls[ContactDate] &&
          Rates[EffectiveEnd] >= Calls[ContactDate] &&
          Rates[Client]=Calls[Client]   )
)

Hey, That’s a Lot Like a Measure!

Yeah, it IS a lot like a measure.  It uses CALCULATE, the wonder function.

But in this case, we’re just using CALCULATE to apply filters in our calculated column.

Works basically the same way, though.  The three clauses in the FILTER() function are all AND’d together using the && operator, so that only rows from the Rates table meeting the following three criteria are “kept” and then “fed” to the AVERAGE function:

  1. The Start date in the rates table must be prior to (or equal to) the Contact Date in the Calls table.
  2. The End date in the rates table must be after (or equal to) the Contact Date in the Calls table.
  3. The Client ID in the rates table must match the Client ID in the Calls table

Note for those who desire deeper understanding:  The biggest difference between this calc column formula and a similar formula we’d have to write in a measure is that in a measure, we’d have to use the VALUES() function wrapped around the right hand side of the criteria – VALUES(Calls[ContactDate]) rather than just Calls[ContactDate]).  And we’d have to “protect” those VALUES functions with an IF(COUNTROWS()) since you can’t use VALUES() as part of an “=” test, except when there is only one value returned, or maybe use MAX() instead of VALUES().  The point is that in a calc column, we DO have a current row in the table, so we don’t need the wrapper function like we do in a measure.

What’s that EffectiveEndDate column?

Yeah I sneaked that one in.  It’s another calc column I added to the Rates table:

image

You can see its formula in the formula bar in the image above.  I just wanted to put a non-blank value in whenever End was blank.  You could hard-code a date in there if you wanted, or use TODAY() and a much bigger number than 365.  I just figured that setting a date one year in the future from today is good.  Keep in mind though that TODAY() will only be re-evaluated when you refresh your data or force a re-calc.

Is There a Relationship Between Those Two Tables?

No, there is not.  Neat huh?  And really, there couldn’t be anyway – each Client ID appears more than once in each table, and PowerPivot doesn’t handle relationships on “many to many” like that.  The date columns are even less “relationship friendly” since they don’t match up at all, and are based on ranges in one table and single dates in the other.

Download the Workbook

I’m trying to do this more often, especially with the forum questions.

Download the workbook here

Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

This Post Has 17 Comments
  1. This is actually something I’m doing now in ‘regular’ Excel but with a large dataset it takes ages to calculate. I thought of using PowerPivot for the task but haven’t been able to try it yet, this is exactly what I need to get started. Thanks!

    A question though; why do you use AVERAGE in the formula as only one value in the Rates table will match all the criteria?

    1. Oooh! I debated asking others whether they thought this was easier to do in normal Excel but decided the post was running too long. So I am very interested in your results – both the speed of calc as well as the complexity of the formulas. Please share when ready 🙂

      In answer to your question: CALCULATE returns an error if I don’t “wrap” the first argument in a function. I don’t have to use AVERAGE though – I could use MAX, MIN, etc. I find myself in this situation a lot actually – where I know there is only one row returned, and AVERAGE, MIN, MAX will all return the same answer, so I pick one – usually MAX. But in this case, if I have lots of rows in the Rates table, even I would be reluctant to “trust” that I never got more than one row from the folks maintaining the Rates table, so I picked AVERAGE rather than MAX, to “blend” results in that case.

      Also, given that Leah switched from AVERAGE to MAX, it’s clear that she *does* expect to return more than one row in her usage cases.

  2. Instead of creating a calculated column that replaces BLANKS by a date in the future you could just check for BLANK in the actual formular:
    =CALCULATE(AVERAGE(Rates[Rate]),
    FILTER(Rates,
    Rates[Start]= Calls[ContactDate] || ISBLANK(Rates[EffectiveEnd])&&
    Rates[Client]=Calls[Client]
    )
    )

    1. something got messed up with the greater-than and smaller-than signs in my previous comment so I will describe the approach in words 🙂

      the check on Rates[End] can be extended by ISBLANK combined by OR
      [ContactDate] has to be greater/equal to [Start] AND ([ContactDate] is smaller/equal to [End] OR [END] is Blank)

      gerhard

  3. I don’t understand the use of the average() function. Whea if I wanted to return an exact value. Like a range lookup in excel. My buckets are as follows:

    Min Max Grouup Name
    0 5,000 1) $0 to $5,000
    5,000 10,000 2) $5,000 to $10,000
    10,000 25,000 3) $10,000 to $25,000
    20,555 50,000 4) $20,555 to $50,000
    50,000 1.0E+06 5) $50,000 to $1,000,000

    I want to append the group name based on the columns value

  4. Well you need to use *some* aggregation function, as you cannot directly reference the GroupName column as the first argument in a CALCULATE.

    Since in this case I know in advance I will only get one row matching my FILTER, I could have used MIN(), MAX(), or AVERAGE(). All would return the same result.

    In cases where you want to return a text string, you can use the VALUES() function. But you need to be certain that you will never match more than one row in the FILTER – if you do, you will get an error.

  5. This is brilliant if it works (no doubt it will). I deal with some very large extracts from SAP which are date driven and was hoping to be able to chuck the tables into Power Pivot and lookups across tables and this has explained brilliantly how to do it. Muchos thanks 🙂

    1. Try FIRSTNONBLANK(your text column, 1) or LASTNONBLANK.

      Those functions aren’t “meant” for this purpose but by using 1 as the second input, we “trick” them into being min/max for text.

  6. Hi Rob, in the same example I want the effective end date to be picked in the Calls table. Please help me.

  7. I have used this example for calculating the exchange rate conversion and it pretty works well. But, why not solve this with a measure? I was learned that using calculated columns is a bad practice that should be prevented.

  8. Fantastic solution. Thx. Quick question also, You used this to fetch the Rate from other table, can same be applied for text. Not something numeric. Instead of average what will I have to use?

  9. Having some trouble with this. Formula seems to be written correctly, but my column returns a bunch of blanks. Any help would be appreciated!

Leave a Comment or Question