OK, I’ve spent a lot of digital ink lately on the wonders of cube formulas, an alternative to pivots. And [link removed due to 404] Denny’s recent series of posts may suggest to you that you can’t really use Pivots effectively in Top N scenarios without employing MDX and named sets.
So are pivots a weak technology?
No, they most certainly are not, and can handle > 90% of the scenarios you throw at them (maybe more as you discover more of their feature set). So I thought I’d take a quick break from cube formulas to point out a few strengths.
OK, here I’ve got a PowerPivot workbook with some tables loaded from AdventureWorks. I’ve got a simple PivotTable defined, where unique customer email addresses (all 18,000+ of them) appear on rows, and Sum of Sales Amount on values. The pivot is sorted descending by sales.
All right, I want to see the top 10. Easy, if you use the Top 10 value filter feature (activate this popup UI using the highlighted dropdown on Row Labels):
So far, I’m doing exactly what Denny did in his first post. This yields:
OK, no problems so far. Time to clear up a potential misconception, though.
Misconception #1: PowerPivot-backed Pivots Do NOT Filter Locally
When you have a pivot based on PowerPivot data, when you select Top N (or any other label or value filter, or even just the checkboxes), Excel does NOT fetch all of the values and then perform the filtering itself.
Instead, it constructs a query (it’s an MDX query, but you never have to see it), and sends that query over to the PowerPivot engine. The PowerPivot engine performs the filtering and then only returns the matching values.
So no matter how many customers you have in the db, Excel only gets 10 back from the PowerPivot engine. Performance is therefore still good, and you don’t run into Excel’s 1 million row limit.
…but when you first add the field to the pivot, before you’ve had a chance to define the top N filter… ok, there’s a place where it’s gonna be slow (and potentially truncated).
A Misconception About the Misconception 🙂 – Sorting IS Local
Clearly, if Excel lets the PowerPivot engine do the filtering, it must also let PowerPivot do the sorting, right?
Actually, no. Excel DOES do the sorting itself. Go figure. (It makes sense, but it’s a long story… and I happen to have forgotten the answer. But it made sense to me when I remembered.)
Problem: Top 10 Filtering By a Different Column
In Denny’s example, though, he didn’t just need to show Sales, and the top 10 customers by Sales. He actually wanted to base his top 10 set of customers off of one column, and then for those customers, see what the values were like in a second column.
Misconception #2: When I Want That, I’m Screwed
OK, let’s say that, for the top 10 customers in overall sales, I want to see the Quantities those customers ordered in each Product Category. Add a couple fields to the pivot and it looks like:
Good news: it’s still the same set of customers, and the information I want about quantity per category is there.
Bad news: there’s a lot of extra information there that is not needed. I don’t really care about the dollar amount purchased by these customers – I just care that they are my top 10 by dollars, and then I only want to see the quantity information.
I want to get rid of Sales. But don’t I have to display Sales, since I am “top 10-ing” by Sales?
Turns out that pivots have another trick up their sleeve. Let’s go back to that Top 10 dialog. Check out the highlighted dropdown:
Interesting, eh? (See that Denny, I used your trademark. Nothing is sacred when ‘Softies play the feud!)
Look what I find in that dropdown – a measure that’s not currently used in my pivot!
Now THAT is even MORE interesting. That tends to suggest that I can remove Sales from my pivot and yet preserve the “top 10 by Sales” filter.
(Those “_Count” measures are a necessary implementation evil in PowerPivot v1. You can ignore them, which also means you HAVE to ignore them, but you get the idea.)
Yup, I can indeed remove Sales and the filter is preserved. Check it out:
OK, I lost the sort order – it used to be sorted by Sales, now it appears to be alphabetical. But it IS the same ten customers. So we kept the filter.
Kinda makes sense that we lost the sort, given that Excel sorts locally. That might be a problem in some cases, in other cases not. They haven’t figured out the MDX to order the set in the named set approach either, so I’m gonna give myself a pass on this one 🙂
Advanced Topics – 2 Quick Questions
1) Does that Top N respect other filtering?
Why yes. Yes it does. So if I use a slicer, for instance, to filter to a year prior to Adriana19 even becoming a customer, my top 10 will change to be the top 10 for that year only.
2) If I want a top N filter on Sales, can I define that top N based on just sales of Bikes?
YES! You can do that by creating a “Bike Sales” measure:
…and then do my Top N filter based on that!
(Exclamation point because value filters based on anything other than a grand total of a measure have always been a bit of a holy grail for, well, me. And Allan Folting of the Excel team. Maybe a few others.)
The key takeaways from this post:
- Pivots have a Top N filtering capability built-in
- That filtering takes place in the PowerPivot engine for excellent performance
- You can filter on measures that aren’t displayed in the pivot (!)
- The top N DOES respect other filters on the pivot
- When combined with CALCULATE and custom measures, you can do just about anything
- Excel DOES perform pivot sorting itself (in contrast to filtering)