image

Paper or Plastic:  Measure, or Calc Column?

“What?  I’ve Never Written This Post?”
<pause while I search the archives>  “No???  I Haven’t???  Sheesh.”

It’s such a fundamental question.  When you need to write a PowerPivot formula, which kind should you write – a calculated column or a measure (now renamed “calculated field” in 2013)?

Well, there is no perfect 100% answer to this question.  But there are a handful of crucial differences and guidelines that make your choice clear 95% of the time.  And the other 5% generally don’t matter.

So let’s get to those guidelines and differences, shall we?

Guideline #1 – When a desired number is a fixed property of a single row, calc columns are usually appropriate.

Here’s a common example.  You have a Sales table, in which every row is a separate transaction.  The table has [ProductCost] and [SalesAmount] columns and you want to know how much Margin (Profit) is made on each transaction.

It is perfectly reasonable and desirable to add a calc column that calculates this for you:

image

Calculating Margin as a Calc Column is a Good Thing

What do I mean by “a fixed property of the row?” Well, each transaction only made so much margin. And that margin amount is NOT something that is subject to change. It’s an immutable “truth” about that transaction – unless you discover that the original cost or sales price was mis-recorded in the first place.

Another example would be in a Calendar table (where each row is a single day), adding a column like IsWeekday or, in the example below, YearMonth:

image

Each Day in the Calendar Can Only Be Part of One YearMonth, and for a Given Day,
That YearMonth is NOT Subject to Change – July 31, 2003 Will Be
Part of “2003-07” Under ALL Circumstances

Which brings me to my next rule.

#2 – If you want to put it on rows, columns, slicers, or report filters,
it MUST be a column. 

Measures (aka Calculated Fields) Can Never Be Placed in Zones Other than Values

You can’t put a measure in any of these five zones.
So use a calc column (or a regular column of course).

This is a simple rule and it’s airtight.  There is no way to put a measure anywhere in a pivot except the Values area.

YearMonth, above, is a classic example of something you’d create in order to put on Rows for instance.  That’s a calc column, zero doubt.

“Hey, wait a second.  Your Margin example…  THAT’s a calc column but I’m going to put in in Values!”

image

The Reverse is NOT True – a Calc Column CAN be Added to the Values Area
(And Very Often WILL Be!)

Returning to the first example (Margin), I clearly am not “done” once I have created the calc column.  I need to use it on a pivot before I get any insight from it.  And a number like Margin is not something I would typically put on Rows or Columns – it’s something I want to sum up.

So am I breaking the rules if I create a calc column and then add it to Values?  No, not at all.

More accurately, I think the question people struggle with the most is this:

The Real Question:  Skip the Calc Column and JUST Write a Measure?

Deep down I think this is the question that is most unsettling to Excel people – “am I missing the point?  Should I have written a Margin measure and SKIPPED the calc column?” 

No, you are doing it right.

And that’s what the first guideline – the one about being a property of a single row – is all about.  In those cases, using a calc column (and then adding to Values, or even writing a Measure that references the column) is perfectly OK.

I think that matches the instincts of anyone who has worked with traditional pivots.  What I am telling you is that you need NOT feel guilty about trusting those instincts!  Guideline #1 – the “property of a row” guideline – is one you always intrinsically knew with normal pivots, but never put it into words.

OK, So When DO I Write Measures???

Let’s say you use PowerPivot all the time.  And you write lots of calc columns but never write measures.  Are you doing it wrong?

No, the things you are doing, you are probably doing correctly.  It’s just that you’re just missing out on OTHER things you could be doing.  And those things are some of the most magical things in PowerPivot.

In short, you write measures to do things that no calc column CAN do.

It’s hard to list all of the things measures can do for you.  Most of the book is spent on that in fact.  But here’s a quick summary:

  1. Aggregate formulas – things that don’t deal with a single row.  Your average selling price across all transactions for instance.
  2. Numbers that need to change as you rearrange the pivot – technically speaking, you CAN write aggregate formulas in calc columns.  But those numbers don’t “respond” to the pivot – they are fixed.

Here’s an example of that.  I have a SalesTerritory table that is related to my Sales table.  I can certainly write a calc column, in the SalesTerritory table, that determines the average SalesAmt per transaction for each territory:

image

Yes, You Can Perform Aggregate Calcs in Calc Columns.  But Typically You Shouldn’t.

When I put that calc column on my pivot I get the same numbers as in the table above:

image

But put Year on the pivot, and the numbers don’t change:

image

Northwest reports $405.86 for ALL Years.  Which is Incorrect.

Calc Column Results are Forever.  Until the Next Data Refresh Anyway.

So that’s a critical thing to understand about calculated columns:  their numbers never change in response to how you lay out your pivot.

Specifically, I mean that for each row, a calc column value is “stamped,” permanently and statically, onto that row:

image

That $405.86 Number NEVER Responds to What You Do on the Pivot
(Yes You CAN Sum it etc. in a Pivot to Form Other Numbers, But the
$405.86 Itself Can Never Be Subdivided, Sliced by Year, etc.)

Again, this is really nothing new.  Traditional pivots work the same way.

More comparison to traditional pivots

1) If, when using a traditional pivot, you’d have written a calc column and then fed that calc column into a pivot, you’re pretty safe writing a PowerPivot calc column and feeding that into a PowerPivot pivot.

2) If in the past, you would have made more than one pivot, and then referenced into those pivots with formulas, many of those situations can be accomplished in PowerPivot with a single pivot and a measure.

image

If in the past you would have created multiple pivots and then referenced into them
with formulas, that’s a “hint” that a measure might be a better approach today

And as a bonus, the measure formula is portable whereas the multi-pivot approach is anything but.

3) If in the past, there was an analysis that you didn’t think could be done in Excel, or was one that was so much work that you decided not to attempt it, measures might make that problem simple.

Wrapping Up

So, we use calc columns in PowerPivot just like we did in traditional pivots.

We use measures to calculate aggregate expressions, especially when we want those to “react” to different pivot layouts.

We use measures to do magical things that we could not have done in traditional pivots, like Average Sales per Day, or Sales vs. Budget %, or Change vs. Prior Year, or “pick a topic of almost any blog post on this site or most of the chapters in the book.”

Advanced Scenarios

David Churchward, the Italians, Chris Webb, and other “DAX monsters” might be reading this post and disagreeing vehemently.

To those DAX monsters, I say:  this post is not for you guys Smile

Yes, there are some advanced techniques where writing a measure is actually more efficient and faster than writing a calc column and then adding it to the pivot.  But those still largely go over my head, and might remain that way for some time.

The problem I’ve recently re-discovered is that measures are still so new to most Excel people that they are questioning which traditional techniques they can keep versus which ones they need to replace with measures.  So let’s keep it simple – for their sake and mine.

And yes, if you can calculate a column in the source database rather than in PowerPivot, there are benefits that I highly recommend.  But that doesn’t invalidate anything I’ve said above, since they are still just calc columns (calculated somewhere else).

Implicit Measures

Readers of the book may be wondering if I am backtracking on a recommendation I made, which is that I NEVER create “implicit measures.”

If I want to see the Sum of Margin in the Values area of a pivot, I always create a measure and write the formula myself =SUM(Sales[Margin]) rather than just dragging Margin to values.

Nothing in this post changes that for me. I still highly recommend always using explicit measures (write the formula, even if it’s simple) rather than implicit (dragging to Values area).

  Subscribe to PowerPivotPro!
X

Subscribe

Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology. 

This Post Has 13 Comments

  1. I often use calculated columns to create a dimension. Your temperature mashup was a perfect example. You converted a continuous temp variable to 3 discrete values. Also, I’ve been doing a lot of distribution alanysis and the only I know how is to add a calculated colum to count “stuff” and then use that calc column as my X-Axis to sum “other stuff”

    BTW – I’t be great if you could have lookup related tables (like using the lookup paramater “1” in the match function

  2. Your article says that calculated columns cannot be changed based on anything you do in Excel. So there’s no way to create a Slicer and have it change what field appears in the Row Labels of a pivot table? Just as I use disconnected slicers to change what metric is visible, I would also like to change the row field to group by, say, color rather than location. Is there no way to do this at this time?

    I was hoping to build a calculated column like:
    =SWITCH( , “Location”, [Location Name], “Color”, [Color Name], BLANK())

    Then use this column as Row Label in my pivot table. My pivot table could then change from:

    Row Labels Unit Count
    US 1,253
    Canada 815

    to:

    Row Labels Unit Count
    Green 502
    Red 810
    Blue 756

    simply by selecting the Slicer. Why this is important is that my report has 30 pivot tables on it and I would like a single Slicer to make the change across all tables with one user action. Thanks!

    1. I agree.
      I am trying to use the sumx() function in PBI and experiencing weird summations when references columns in a formula between to tables. The sumx() are summing lineitems and the other two columns are taking header figures. the SUMX() seem to be doing all rows in the sub table.

      Here is a sample PRInvoiceTotal = (SUMX(PBIvwInvoicePriceModel,PBIvwInvoicePriceModel[OtherHdrCharges]) + [PRServiceCharge] + [PRLinenMaintenance] +
      SUMX(PBIvwInvoicePriceModel,PBIvwInvoicePriceModel[PRLineTotal]))

      If I try to add sumx(filter()) I don’t see any restriction going on:

      PRInvoiceTotal = (SUMX( Filter(PBIvwInvoicePriceModel, RELATED(PBIvwDeliveryPriceModel[Customer#])), PBIvwInvoicePriceModel[OtherHdrCharges])
      + [PRServiceCharge] + [PRLinenMaintenance] +
      SUMX(Filter(PBIvwInvoicePriceModel, Related(PBIvwDeliveryPriceModel[Customer#])),PBIvwInvoicePriceModel[PRLineTotal])
      )

      Any suggestions?

  3. Hi guys,

    Thanks for the awesome post. So there is still no way to use measures in a pivot table and have them refreshed while using slicers ?

    Thanks in advance

    Cheers,
    Arun

    1. Hmm, I was about to say “hey be nice!” but I think you’re right. This is actually a terrible example. Today, I’d just do Margin as a measure most of the time, unless I want to use Margin on Rows/Columns of a pivot (or a grouped version of the column, like Low/Med/High or something).

  4. Hi,

    I have a sheet which gives me hourly deliveries as a cumulative sum starting from 7AM to 11PM. Every hour will have 4000+ rows with different agent names. I need to show their hourly delivered numbers.

    Ex:

    Hour Agentname Deliveries
    7 XYZ 10
    7 WXY 12
    8 XYZ 21
    8 WXY 19

    I need this to be:

    7 8
    XYZ 10 11
    WXY 12 7

    Please suggest.

  5. I have a sales table in the following format:

    Customer, Sales_Amount, Date
    Walmart, $10, 1/1/2016
    Target, $20, 1/2/2016
    Walmart, $15, 1/1/2017
    Target, $30, 1/2/2017

    and a commissions table (percent commission to be paid to the sales rep for the customer) in the following format:

    Customer, Commission_%, Year
    Walmart, 2.5%, 2016
    Target, 3.0%, 2016
    Walmart, 1.5%, 2017
    Target, 1.0%, 2017

    So the commissions are each sales multiplied by the appropriate commission percent for that customer for that year.

    I could add a calculated column for the sales commission for each sales in the sales table using Lookup_Value (or something similar):

    [Sales_Amount] * lookupvalue(Commissions[Commission_%], Commissions[Customer], Sales[Customer], Commissions[Year], year(Customer[Date]))

    or write a measure using sumx on the sales table and the filter on the commissions table to get the commissions:

    sumx(Sales, [Sales_Measure] * calculate([Commissions_%_Measure], filter(Commissions, Commissions[CUSTOMER] = Sales[Customer] && Commissions[YEAR] = year(Sales[Date]))))

    from purely a performance standpoint, which approach is better?

    My model is getting large and I want to make it overall more efficient.

    Thanks for any help.

Leave a Comment or Question