Where’s Waldo Contest – Two Hints!

July 28, 2010

No one has sent in a correct answer yet.  Two things I think are worth pointing out:

1) The logins are visible in the UI, and don’t require you to go digging through obscure files on the hard drive to find

2) They appear in PowerPivot for SharePoint, NOT in PowerPivot for Excel

Happy hunting.


Win an MSDN Subscription – Free Software!

July 27, 2010

UPDATED:  Two hints!

No one has sent in a correct answer yet.  Two things I think are worth pointing out:

1) The logins are visible in the UI, and don’t require you to go digging through obscure files on the hard drive to find

2) They appear in PowerPivot for SharePoint, NOT in PowerPivot for Excel

image001An Unusual MVP

I recently was awarded the title of Microsoft MVP for SQL Server, as a result of my community work around PowerPivot.  Good stuff – I get access to some newsgroups, a little advance info every now and then, get to attend the MVP Summit in February, etc… oh, and I get to put the neat little MVP symbol in my email signature, displayed here at the right.

A brave new world we live in, where a guy like me who knows hardly anything about databases lands in a SQL Server MVP program.  PowerPivot is a bit of a black swan, straddling the fence between hardcore db’s and the world of information workers.  Makes just as much sense for me to be in the Excel MVP program really, so hopefully I will still get plugged into those feedback loops as well.

MSDN Subscriptions for PowerPivot ContestsFabulous Prizes

OK, that’s great for me and all, but how does it benefit you?  Well, for starters, as part of the MVP packet, I got three MSDN subscription cards to give out!

These subscriptions must be activated within the next few months, and are good for a full year once activated.

I’m not an expert on all the particulars, but what it basically gives you is full access to the MS software catalog, for your own personal use and experimentation.  Office, Windows, SQL, SharePoint, Exchange, Visual Studio, etc. – a Golden Ticket of sorts.

Contest #1:  Where’s Waldo, PowerPivot Edition!

I recently discovered that some of the PowerPivot development team’s Windows Domain logon names were accidentally shipped in the product.  Pretty funny.

To win the first MSDN card, all you have to do is find where that occurs, and email me: rob@pivotstream.com

Simple huh?  Hee hee.  Whoever finds this is indeed worthy.

Rules Schmules!

If I receive multiple correct entries within the first 24 hours after this post goes live, I will randomly choose one of those entries as the winner.  This will give everyone a fair chance, regardless of where you live.

After the first 24 hours, first correct entry wins.

If I have reason to believe that you already have access to MSDN, I doubt I will award you the card, so let’s leave this for people who don’t have it already ok?

Future Contests

I have not yet decided how to award the other two, but will figure it out soon.  Also, I have a few other prizes that are a bit on the silly side – stuff from Microsoft with various logos on it that certify the wearer/user as nerd royalty – that I will also be giving away because my wife commands it :)

Watch this space for announcements.


Two Excellent PowerPivot Books

July 26, 2010

PowerPivot Books

Awhile back you may recall David Coe winning our XL Monkey Design Contest, the prizes for which were three unreleased (at the time) books:  two on PowePivot specifically, and one on Pivots in general.  Autographed by the authors of each:  Bill Jelen (Mr. Excel) for two of them, and Denny Lee, Ron Pihlgren, and Siva Harinath for the other.

Well, those books are all released now, and Bill/Denny have sent me the signed copies for delivery to David. 

Even better (for me), they each graciously included signed copies for a guy named Rob Collie.  So, PowerPivot books have supplanted Angry Birds as my pre-sleep nighttime routine for the past few days.

Humorous Aside:  Flattery will get you everywhere!

Shrewd promoters that they are, Denny and company had the wisdom to list me in the acknowledgement section of the book, even going so far as to list me first. 
 
PowerPivot Yoda 
PowerPivot Yoda says:
  “Wise is the author who prominently thanks those with the capacity to promote.”

 
Mr. Excel takes this even further, with the first two words in the book (after About the Author) being “Rob Collie.”  He even thanks my wife Jocelyn!

All future PowerPivot authors, take note of this.  (Actually, all authors take note of this, regardless of topic, heh heh).

Back to Serious:  Reviewing the Books

All of that fun stuff aside, I think I’ll briefly review these books here on the blog.

Since Bill’s book (the green one) arrived first, I’ve had time to read it already, so I’ll review that one first.

Excel People, Start PowerPivot Here

The arrival of Bill’s book is conveniently timed, since my last post was from an Excel power user who wanted content more tailored to his viewpoint and history.

My biggest overall conclusion after reading Bill’s book is that Excel users will be hard-pressed to find a better place to start their PowerPivot journey.  Bill is not a SQL guy and he is not an MS employee – he has been building spreadsheets in the wild since before Pivots even existed.  And for many years now he has made his living simply teaching others to get the most out of Excel.

That history and perspective shows through in the book.  Reading it is VERY different from reading any of the MS documentation on PowerPivot for instance – that MS content is excellent at describing PowerPivot and how to use it, it just isn’t written by a multi-decade Excel maestro, so it doesn’t tell Excel users, in detail, what will be familiar to them and what will be new.

Example:  the book contains a table listing all the pros and cons of PowerPivot-style pivots versus traditional Excel pivots.  I wouldn’t have come up with half of these differences despite my Excel pedigree, and I consider it the definitive list on the topic:

PowerPivot versus Traditional Pivots

Like a true Excel nerd, Bill even has a numerical Rating column, listing each pro/con as a positive/negative value, and then adds them all up at the bottom to generate +181 as the overall rating.  I wonder if Bill is like this at breakfast, comparing waffles to flapjacks using AutoSum?

(And yeah, I’m intentionally leaving the resolution poor – you’ll have to get the book, as I am not in the habit of republishing other people’s work like that).

Continues Throughout, Covers Every Aspect of PowerPivot

That perspective and experience is maintained cover to cover.  “Here ya go Excel pro, this is why you should care about feature X, when you should apply it instead of traditional Excel feature Y, and when you should stick with the traditional approaches.”

And it goes end-to-end through PowerPivot with this perspective, from data import, editing/cleaning, table relationships, DAX formulas of all types, pivot and slicer layout, formatting, workarounds galore, and touches on SharePoint at the end.  As I said, if you are coming to this from the Excel world, I think this is a great book for you.  It’s a quick, informative, and personable read.  Well worth the $23 at Amazon.

=IF(MOD([PageNum],3)=0,”Rip MS a New One”,”Wait til next page”)

Part of the personable thing:  Bill doesn’t spare MS when he dislikes something.  “Insane,” “crazy,” “hate” – these are a few of his favorite words.  In a few places he rips into decisions that were personally made by me, or by teams I led back at MS.  For instance, he hates the new Compact pivot layout introduced in Excel 2007.  Bill, I’m ready to duel over THAT one.  (Look for my upcoming blog post, The PowerPivotPro Went Down to Akron).

‘Pivotpro drove down to Akron
His fingers tightly grippin’ the wheel
He was looking to find
An Excel author unkind
To pivots’ excellent look and feel

What the book is NOT

Clocking in at 294 pages, this book doesn’t try to do everything, which I think is wise.  I don’t think any Excel pro wants to pick up, as a starting point, a 1200 page bible.  This book is an excellent intro and you will hit the ground running fast, but at some point later, you will eventually go looking for:

  1. An in-depth guide to high-powered DAX measures
  2. An in-depth guide to the implications of various table structures and relationships
  3. Performance-tuning reference
  4. A how-to reference for deploying PowerPivot for SharePoint
  5. List of best practices, tips and tricks, workarounds for Excel Services on SharePoint

Like I said, as an Excel pro, you are MUCH better off NOT trying to tackle those up front.  You can get incredible mileage out of PowerPivot without once touching those topics.  You will want to someday, but you don’t NEED to, so I highly recommend Excel pros pick up this book as their starting point.


An XL Pro’s Plea for More XL Pro-Focused Content

July 22, 2010

 
Just Don't Call Our Minds Simple, OK?

“Don’t you, forget about me.”

-Excel Pros Everywhere

From the beginning I’ve been talking about three different kinds of PowerPivot professionals:  Excel pros, SharePoint pros, and Database pros, and how they will all need to cooperate in order to get the most out of the system.

As far as I can tell, though, the majority of visitors to this site are from the Database camp (I lump Business Intelligence pros into this bucket).  Not surprising perhaps, since PowerPivot is a product of the SQL team at MS, and marketed heavily at conferences that db pros attend.

But for every db pro in an organization, there may be more than 100 Excel pros.  So, I think it’s clear that the PowerPivot message (and messaging) have a long way to go yet (it WAS just released, after all).

John Constant is one of the Excel pros who are “early to the party,” and has been sifting through all of the available PowerPivot materials for many months now, both the official MS stuff and the community offerings.

He raises some very valid points, things that I lose sight of at times.  It’s SO much simpler to grab a sample db from the SQL team and use it for blog examples for instance.  Even the Great Football Project starts with such a data source.

But I firmly believe that an organization’s success with PowerPivot rests in part on very competent training for the Excel pros…  and part of “competent” is “tailored to the Excel pro’s existing knowledge and viewpoints.”  In fact, I’d be out providing that kind of training this month if it weren’t for being very busy, the good kind of busy, applying PowerPivot for Pivotstream’s customers.

Without further preamble, I give you the words of John Constant:

PowerPivot vs the Excel Power user

Many moons ago, I can’t even remember where, I saw a comment…  Microsoft is coming out with a new business intelligence service… code named .. Gemini.  I was intrigued.  I had recently been tasked of coming up with some sort of data table /system  / process / spreadsheet / Excel voodoo to help put our business (and the competition) in perspective.  Hmm… a new tool you say.  So I delved into any links or articles I could find.  I saw the Donald Farmer video of Gemini sorting through millions of rows of video rentals – updating the charts/tables on the fly and .. sigh.. it was like love at first site.  I signed up for the beta at the first opportunity – archaic business software be damned! .  I was fortunate enough back in early fall 2009 to get my initial private beta Office 2010 release.  I was having a field day with the sparklines and the soon to be essential slicers, waiting with baited breath for the initial Gemini release.

The Microsoft talk was great.  It COULD be a standalone product.  It was meant for Excel power users and for IT and Data Administrators and Managers and so much more… but I’ve heard the talk before.  Don’t even get me started on the failings of Mappoint….   But low and behold – Gemini, with a few hiccups worked.  Sure if you linked in or brought in new data, it may corrupt all your work and you would have to start from scratch, but it’s beta.  Silly, silly beta.  It will grow up.  Sure there’s a new language that looks something like native excel formulas and there’s stuff that you take for granted working with Excel pivot data that you can’t do with Gemini… but it’s beta.. silly silly beta. It’s still growing.. be patient.  Help is on the way.  And help arrived.  There’s PowerPivotPro and PowerPivot.com  and Kasper and a host of other dedicated people who truly believe in the product.  And they do things to help build the community like having SQL workshops and handing out lovely diagrams (http://sqlcat.com/blogs/technicalnotes/image_4AFAE1C3.png) as a prize.

Errr… whoa..  hold on.  Did I mention the initial Microsoft talk?  Did I mention how Power users could use this product to help themselves and their company.  I understand the importance of SQL.  I understand the concept of Cubes; I’ve heard of OLAP (once or twice) but .. did you see that picture?!!  That’s a prize?!  Are you trying to scare away users?!   Let’s go back to Powerpivot.com, the ‘home’ of powerpivot (the grown up name of Gemini).   What’s that first video – the first public introduction of Powerpivot?  It shows a poweruser and PowerPivot for Excel.    Sure, it mentions Sharepoint but the debate continues – who is PowerPivot for? What about the power user?   – the grunts in the field who want to make the most of this product?  I know (reading some postings) there isn’t a large target population, but for some reason I feel like a Who in Whoville.. “We’re here!  We’re Here…. WE’RE HERE!!!!”  And what do I see …  honourable intentions of many esteemed PowerPivot bloggers aimed at….  SQL, OLAP, Business Intelligence users, like speaking to the already converted.

Don’t believe me?  Read your forums, your FAQ’s.  How are the samples set up?  Well you have FactInventory, FactSales, DimDate, DimThis, Dimthat – Dim Witted!   These examples and their structures aren’t what the normal excel user uses – or understands.  I’ve got Sales, I’ve got product, I’ve got territories, I’ve got stuff labeled poorly because they’ve been set up by someone who knew some programming about a decade ago… in other words, I have real word – DIM-LESS – data and structure.  It’s taken a few brick walls but I’ve hit my head enough times that the examples are sinking in but what about all those people that the ‘talk’ is supposed to reach?  Those who really have no experience with SQL or know of Cubes or data structure tables?  What about those people who can’t even consider getting Sharepoint because they are with a small business that doesn’t have the resources (financial or otherwise?).

I know why Microsoft is “selling” PowerPivot. I don’t fault them for that and I know the majority of users will have IT, Data Admin backgrounds, who will be setting up PivotViewers, and templates and services logs- all those lovely bells and whistles and the examples that will cater to those who deal with FactTables and DimData.  And then there were the Excel power users…  seeing a potentially great product aimed at the institutions, the data centres, the IT gods…   sigh…. 

I do not fault the Farmer’s the Collie’s, the Jonge’s, the Russo’s and the many many other dedicated PowerPivot supporters in the world – I thank them for all their hard work.  PowerPivot is a great product with great potential for many people, but just remember the little guy in the equation, the ones that someone deemed somewhere should have access to this power.  Bring it down a level from time to time… walk the walk and talk our talk and remember the Who’s… “we’re here!  We’re HERE… WE’RE HERE!!!”.

Powerpivot user – johncon aka Mongo41 on Twitter


Use Time intelligence functions to do a running sum of the last 6 months with PowerPivot

July 6, 2010

By Kasper de Jonge

A while ago I did a PoC using PowerPivot, both to show Self Service BI with PowerPivot for SharePoint but also as a datasource for SSRS reports. I used DAX to solve all difficult request, mostly making use of the Time Intelligence functions.

One of the requests was if we could show a running sum of the last 6 months. I had not done that before, so today we take a look at how to do this with DAX. It appeared to be rather easy :)

Again we use the almighty Calculate function to change the context of the row we are in. We want to do a sum of all the rows of the last 6 months of data. We use the DATESINPERIOD function to get the last 6 months of dates.

The function DATESINPERIOD has 4 parameters, first is the column containing the datarange, second is the start date. We can get it by getting the last date in context, we use LASTDATE to get this. As last two parameter we can give intervals to subtract or add from the start date. In our scenario we want to subtract 6 months from the last date.

This gives the following formule:

=IF(ISBLANK(sum(FactInventory[DaysInStock])),BLANK(),
CALCULATE(sum(FactInventory[DaysInStock]),
DATESINPERIOD(DimDate[DateKey],
LASTDATE(DimDate[Datekey]),-6,MONTH)))

Again not too hard :)

I put in a ISBLANK to check if we have values for the current month, we are not interested in the future :)

This gives us the following result:

I decided to check out the DATESBETWEEN as well, the DAX function below gives the same result:

=IF(ISBLANK(sum(FactInventory[DaysInStock])),BLANK(),
CALCULATE(sum(FactInventory[DaysInStock]),
DATESBETWEEN(DimDate[Datekey],
FIRSTDATE(DATEADD(DimDate[Datekey],-5,MONTH)),
LASTDATE(DimDate[Datekey]))))

The DATESBETWEEN function can be used to return a table of dates between a start and a end date.
As end date we need to get the last date that is available in the current context. We can determine this by doing LASTDATE(DimDate[Datekey]). The first date is a little more confusing. We need to get the date 6 months ago, of course we can use the DATEADD function. This will return us a a set of dates of the current context, in our sample we get a set of all dates in the month. Using firstdate we get the first date of this set to start our datesbetween function.


Slicers and pivot update performance

July 2, 2010

-Posted by Rob Collie

 
CHI0000402_P

OK, I was sitting at the 2nd-day keynote down in New Orleans last month, watching Amir Netz do a demo on some PowerPivot goodies we can expect in the next release.

He was showing off a 2 Billion row model, and its amazing performance, and while that was eye-opening, something he said in passing caught my attention in a big way.

 

In an effort to explain that what we were seeing (split-second pivot performance against said 2 Billion rows), he said the following:

“Now look at this report!  It has 4 pivotcharts and 6 slicers, each of which issues 2 queries whenever I click a slicer, meaning this report is actually querying the 2 Billion rows twenty times!”

Each slicer issues 2 queries against the data source????  It was revelation time.  I kinda missed the next couple minutes of what Amir was saying, as I digested the implications for our work at Pivotstream.

And, I was kicking myself for forgetting this, because I had once known most of this in Redmond: 

Slicers can, if used improperly in a report, end up slowing a report down by a factor of 5, 10, or more.

Tip #0:  Don’t overthink this!

Hey, if your report is fast, don’t obsess.  Move on to something else.  It is not worth monkeying around with your set of slicers to trim half a second.  Remember, slicers are the difference between report consumers loving your work and dismissing it as just more nerdy junk.

When a client recently told us “Reports are dead, now we have Pivotstream!” there was no way we would have received that reaction without slicers.

But every now and then you will find yourself with a report that doesn’t operate quickly enough for your purposes, and slow response times can drain the value and utilization out of a report quickly (well, slowly I guess).

In those cases, one of the first places you should look is your usage of slicers.

Why do slicers issues 2 queries each?

Well, I won’t go into great detail here, because  1) I don’t know every detail   and 2) Vidas covered the tech details quite well here.

What I will give you, though, is an intuitive sense of what a slicer has to do.

First of all, it has to populate itself.  I don’t know how often it checks to see if something has changed in the underlying model – I’ll have to run some tests.  But it’s something to think about:

Tip #1:  Are you using a field from your measure table as a slicer?

Say you have a 2 million row table, and one of the columns is Date.  When you park that Date column on a slicer, you are demanding that the slicer populate itself from all of the distinct values of the Date column.  That doesn’t happen for free – the PowerPivot engine is gonna have to work pretty hard just to generate that distinct list, and it might have to do that every time you interact with the report (click a slicer, change a page filter, etc.)

So, this is yet another reason to consider using a separate, smaller table that just contains all unique dates, and then relating that back to your large table.  Then you can use the field from the smaller table in your slicer, and PowerPivot won’t have to burn so much time doing something unnecessary.

Cross-filtering:  the incredibly useful but sometimes crippling feature

You know that cool feature of slicers where, as you make selections, other slicers update to show which tiles are valid (clickable) in those conditions, and which tiles are not?

For instance, consider this example:

PowerPivot Slicers and Cross Filtering

OK, in week 7 (of all NFL seasons combined in this data set), there were 8 receiving touchdowns scored by players weighing 179 pounds or less.

Now check out the FullName slicer – I have not clicked anything in it myself, but all player names have been disabled except four.  That’s because those four players are the only ones under 179 pounds who caught TD’s in week 7.  Cool!

Of course, the identification of those four players doesn’t come for free.  To drive that point home, I’m just gonna add a second measure to the pivot:

PowerPivot Slicers and Cross Filtering Depends on Measures

Hey look!  Now there are eight players enabled.  That’s because there were four players who weighed less than 179 pounds and recorded receiving yards in Week 7.

Stated more generally, any player who has data for ANY of the measures in the pivot, in the conditions dictated by the other slicers, will be enabled.

What does this mean to us?  It means quite a bit, actually, once you understand how it works.

Under the hood:  what Excel does w/ slicers during a pivot update

I’m sure I will miss a detail or two but that’s not the point.  The point is to give you an intuitive understanding.

  1. Excel takes all of your slicer selections, plus the layout of your pivot (rows, columns, measures), bundles all of that up into a query, and sends it off to the PowerPivot engine
  2. The data comes back and Excel can populate the pivot with numbers at this point
  3. But now Excel still needs to determine, for each slicer, which tiles to enable.
  4. So for every single slicer, Excel does the following:
    1. Takes all of the selections from every other slicer, bundles those up into a query with the measures from the pivot…
    2. …And then adds the field of the current slicer to the query, unfiltered, as if it were on the Row Labels axis of a pivot
    3. Sends that whole query off to PowerPivot, waits for a reply
    4. When the data comes back, only the values of that slicer field for which at least one measure has data, will be in the result
    5. Excel ignores the measure values returned and uses the list of returned slicer field values to enable/disable tiles
    6. Excel then moves onto the next slicer and repeats steps 1-5

That’s a lot huh?

Turning that knowledge into action

There are many tips we can derive from that understanding.

Tip #2:  Reduce the number of slicers you use on a pivot

From the above, you can tell that the query being sent for a given slicer is absolutely on par with the query that is used to populate the body of the pivot with data.  In fact in some cases, that slicer query can be MORE time-consuming that the pivot itself!

So, if you remove a slicer from a report that your consumers rarely use, you can take a HUGE chunk off of the time it takes for the report to respond to user interaction.

Furthermore, every time you add a slicer to a report, you also make the queries issued for the other slicers more complex, so removing a slicer might make the other slicers’ queries speed up as well.

Tip #3:  Watch out for slicers with lots of tiles

I have not confirmed this but feel confident enough to share it anyway:  I’m pretty sure that slicers with long lists of tiles are more expensive to update than those with a small number of tiles.  Customer Email Address, for instance, is probably much more impactful on performance than Gender.

Tip #4:  Reduce number and complexity of measures

Just something to think about.  A measure that uses the FILTER function, for instance, is going to be a lot more time-consuming to crunch than a straight SUM.  And you might already be 100% cognizant of that.

The point here is that when you add one such complex measure to a pivot, you are not running that measure once per report interaction.  You are running it once for the pivot AND once for each slicer, so it adds up faster than you think.

Tip #5:  Use a report filter instead of a slicer

Traditional report filters don’t have this cross-filtering behavior, so they don’t impact performance in the ways outlined above.  You can use report filters without fear of slowing down your report.

But report filters are ugly, clumsy, and they scream Windows 3.1 – if report filters were so great, we never would have built slicers in the first place.

So I have a better suggestion…

Tip #6:  Disable cross-filtering for slicers that don’t need it

Here ya go.  Right click a slicer and click Slicer Settings.

In the dialog, you can uncheck the highlighted checkbox:

Slicer Settings Disable Cross Filtering Items with no Data

…and now you won’t get cross-filtering anymore.  All players will be enabled for example:

PowerPivot Slicers and Cross Filtering Disabled

Well, for a list like Players, that may not be the best idea.  There are thousands of them, and cross-filtering is REALLY useful for helping me narrow down the list and find the players I want.

Then again, having a field with thousands of values in it as a slicer runs contrary to Tip #3, so maybe it’s not such a good idea to have such a slicer in the first place.  Can you live without it?  It’s worth asking yourself.

Tip #7 Consider using other, shorter fields for slicers

Sometimes I like to use the same field in a slicer and in the pivot itself:

Same Field on Slicer and on Row Labels PowerPivot

This is just a lot more convenient for limiting the set of players (sometimes) than using the Row Labels filters.

But again, this can incur performance cost, especially for long lists, which is precisely where you may be tempted to “double park” a field like above.

The alternative that’s worth thinking about:  maybe a field that’s just the first letter of the player’s last name would be sufficient.  Or the team they play for.  Or whatever.  The idea is to reduce the number of tiles, per tip #3, and sometimes you can get what you want by using a different field.


PowerPivot field list and Excel field list at same time!

June 26, 2010

-Posted by Rob Collie

OK, ignore my last post, that was just silly.

Yesterday one of my fine colleagues at Pivtostream mentioned that he was seeing both field lists simultaneously.  Then followed an exchange where I said “BS,” he said “bet me some cash, oh PowerPivot boy” and, well, he was right.

One should be careful never to wager against one’s own boss.

Anyway, in beta releases of the addin, it was IMPOSSIBLE to bring up the Excel field list without disabling the addin.  Apparently, the team decided at the last minute that allowing the user to bring up the Excel field list could compensate for the lack of certain features.

It’s a brilliant move on their part.  So here it is, both field lists, living together in harmony:

Built in Field List and PowerPivot Field List

You don’t have to disable anything.  Just go to the Options tab on the ribbon when you have a pivot selected, and click the field list button there:

Excel Field List Button

Implications Galore!

OK, so…  you can do the defer layout update trick which is excellent for long-running queries, without turning off the addin, that’s huge.

But wait, there’s more :)

Can Move Measures to Rows or Up and Down Relative to Other FieldsYou can also adjust the layout of your measures.  Try dragging the little Sigma “Values” capsule around the Excel field list (shown here at right) and you will see what I mean – move it to Row Labels for instance.

This also comes in handy when you have other fields on Column Labels and want to change whether Measures are nested inside of those fields or vice versa.

But the last trick that this side by side field list stuff enables is truly my favorite…

 

 

Batch Creation of Measures Without Refresh!

Follow along, this is fantastic.

Step one:  turn on defer layout update:

Defer Layout Update

Two, go create a measure:

Create New Measure Without Refresh

Three, ignore this error you get when you enter the formula and click OK:

Ignore This Formula Is Invalid Error

Four, observe that the measure was created anyway:

Measure Was Created Anyway With No Pivot Refresh

This changes my life in rather significant ways.


Quick tip: disable addin for rapid layout

June 23, 2010

-Posted by Rob Collie

UPDATE:  It turns out that all of this is unnecessary, there is no longer a need to disable the addin to get these benefits.  (I was still stuck on what was possible in the Betas).  See the latest post on this topic for details.

If you’re like me and you work with large data sets and/or complex measures all the time, you may be annoyed by the time it takes to layout 6-10 fields on a pivot – a handful of measures, a field or two on rows, a couple of slicers… even with the performance of PowerPivot, you can be burning a reasonable amount of time in the click-and-wait game.

I have a trick for you.

First, disable the PowerPivot addin.  Yep, you heard right.  On the File tab, go to Options:

Excel Options

Then select Addins.  At the bottom of that screen, open the dropdown for Manage, and pick COM Addins from the list:

Manage Addins

And then in the dialog that comes up, uncheck the PowerPivot addin:

Disable PowerPivot Addin

Now go to the Options tab on the PivotTable ribbon and enable the Field List:

Field List Enable Button

OK, now you see the native, built-in Excel field list.  If you are familiar with normal Excel pivottables, this will appear foreign to you, because this is the “OLAP” field list – designed strictly for use with OLAP db’s like Analysis Services, which PowerPivot happens to be under the hood.

Anyway, at the bottom of that field list, you will see the “Defer Layout Update” checkbox.  Check that, then you can make a bunch of changes to the layout of the pivot WITHOUT them refreshing the pivot.  Click Update and all of your changes will run in a single batch while you go get a fresh cup of coffee:

Fast Pivot Layout With Defer Layout Update in PowerPivot

Cool huh?  A few things to note:

  1. You can’t create measures without the addin enabled.
  2. You also don’t get the fancy slicer auto-layout.
  3. The field checkboxes are organized very differently, split into measures and non-measures AND by table – you will see what I mean
  4. There are a few “dummy” measures visible in this field list that are hidden in the addin

OK…  I should also mention that we might have had this feature already built-in to the PowerPivot v1 field list if it weren’t for me :)  It kept coming up during development – “should we build that defer layout feature in?”  And I kept saying “no, not yet.”

Why would I do such a thing, you ask?  Two good reasons.  One, we were running WAY low on time and resources.  By NOT doing that feature, the team was able to do several other important things that today we’d be complaining about instead. 

Two, I knew that if we added that feature, as we worked on PowerPivot internally, we would all turn it on whenever we hit something that was slow.  As a result, we’d not feel the pain of performance issues as acutely as we should, and not as many things would be identified and fixed.  So, like Cortez burning his ships, I wanted to deprive us internally of this crutch.

Of course, now that I no longer work at MS, and I find myself in a job where I crunch hundreds of millions of rows daily, irony has come home to roost :)  Just kidding really – I still think it was a good v1 decision.  I’m positive that engine query performance is much better for it, and it will be easy for the PowerPivot team to add this feature in v2.


Using Excel CUBE Functions with PowerPivot

June 21, 2010

By Dick Moffat
Personal Logic Associates Inc.

Busy, busy with exciting Access Services applications the last few weeks and my PowerPivot writings have fallen behind…. Sorry.

But today I am going to give you a quick and dirty example of what I think is one of the key features of PowerPivot that will give it a much broader initial and on-going impact for experienced power spreadsheet developers.

This is where existing spreadsheets can get the value-add of having PowerPivot data available to them in a way that is not only understandable for the traditional spreadsheet junkie (as opposed to the typical BI one) but that also that will add major value while integrating into existing models.

Excel CUBE Functions

A few weeks ago my friend Dany Hoter wrote a piece here about the use of the CUBE functions in Excel with PowerPivot data.  Dany’s article was a great intro to the capability of this unsung capability of Excel available since Excel XP.  I am going to go deeper into this issue here.

The Cube functions require access to an Analysis Services On-Line Analytical Process cube (OLAP) which has to be provided to the user from SQL Server’s Analysis Services application.

While there are companies around the world that are capable of taking advantage of this capability, this is by no means a large percentage of overall Excel users and from my experience it is generally an unknown and untried feature.

In Excel 2007 (and now 2010) the CUBE functions became native to Excel (as opposed to available through an Add-In) and were integrated with the new “Connection” object within the program.  Once again though, if one wanted to use an OLAP Cube in a Connection it required availability of an Analysis Services Cube of data.

But in Excel 2010, thanks to PowerPivot, users can now create their own “Cubes” inside PowerPivot and they automatically present themselves as an available Connection inside the Excel parent file.  This is a big thing and I hope to show you another reason why.

A Cube Automatically?

I’m sure I don’t have to explain here how one creates a PowerPivot data source consisting of multiple Relational data sources.  But one fact that may not be obvious is that the data set that is created by these PowerPivot objects, and by their relationships, is in fact a “Virtual” Cube in itself.  In the purest sense it is a ROLAP Cube – created at run-time from Relational data.

This default “Virtual” Cube is named “PowerPivot Data” and is exposed when you click the Connections Button on the Data Tab:

This is inherently an OLAP Cube conceptually and so is an acceptable source for Excel’s native CUBE Functions.

So what does this “Cube” Do For me?

OLAP “Cubes” were designed for two major reasons:

  1. To allow analysis of data in a hierarchical fashion.  In the business world that usually means Financial or Sales or other data organized by the hierarchy of business units (Region/Country/Zone/State/County/City) or by dates (Year/Quarter/Month/Week/Day) or by Product lines (Product Category/Product/SKU), where a number or numbers (i.e. Account balances or sales quantities or values) are stored at the lowest  level of detail (i.e. by Day or by SKU) and the total value or quantity can then be “Rolled Up” along any Hierarchy or Dimension and across dimensions.  So for example you want to see total sales for March 2010 of a particular Product and its SKUS in a particular Country….  Then you want to be able to “Drill up” or Drill Down” on any value in any dimension down to the lowest level or up to the highest easily and automatically.  This is where a Pivot Table attached to one of these Cubes is a natural presentation and analysis mechanism.
  2. To take disparate data sources and aggregate them automatically using the hierarchies in 1 above.  In the original implementations of OLAP Cubes the hardware available had limited RAM and slow processors and so many OLAP Cubes had to be created over-night and written to disk in the classic De-Normalized format that allowed for relatively quick queries to be made against data that otherwise would simply not be possible in a Normalized format.  But this is the 21st Century with incredibly cheap and sizable RAM memory on every PC and with processors so fast that it’s hard to believe.  This is the work that PowerPivot is designed for – so it can reconstitute your OLAP “Virtual” Cube at run-time inside your Excel file itself (and with compression to boot).

So if you bring in your sales data (like in the example Bike data available with PowerPivot samples) and set it up in a PowerPivot “Connection” you can refer to that info from your spreadsheet layer using not just Pivot Tables, but also using the Cube functions.  But the Cube functions give the traditional spreadsheet maker (and his/her boss) the flexibility to present this data in the classic free-form spreadsheet style as Income statements with analysis or to group them on the sheet in any way they want.

The Example

The data being used in this example is available in the “AW_CompanySales.accdb” database supplied as an example for use with PowerPivot. Keep in mind that there are many “Memo” field-types in this example database and it would be worth your while to change most of those fields to a Text format and add a few Indexes to improve the performance of the database when importing into PowerPivot.

This is an example of what the key tables look like when imported into PowerPivot :

When you create Relationships between these tables within PowerPivot, the result is a Virtual Cube with the ability to “Roll-Up” the core values (“Facts”) along any Dimension of the data, which actually are fields in the data sources of the table being used for Hierarchical relationships.  So Product Categories in the “ProductCategory” table have children in the “ProductSubCategory” table, which have children in the “Product” table.  This is the fundamental design of a “Snow-Flake” OLAP design (read about it!) and it should be fairly intuitive to anyone who knows the data sources and how they are related conceptually.

In my example, however, I have pulled together this same data into a single large De-Normalized PowerPivot table by creating an Access Query.   The result is exactly as PowerPivot does virtually within its “PowerPivot Data” Connection and hopefully will help me explain this functionality better.

This is how that PowerPivot window looks (in an unfortunately stiched together image I’m afraid):

Do you notice how there are multiple instances of Country and State across records (?)  This is in effect a result set from an Access query or SQL View or SQL Statement and is truly de-normalized.  There will be certain scenarios where this might work to your advantage over bringing in individual tables and joining them inside PowerPivot, but that’s an issue for another day. There is little or no impact on file size for the two ways of importing the source data however.  It is meant to help us see our data easier in this example.

Keep in mind that there are a million daily records in this data set and is a recipient of the exceptional compression algorithms of PowerPivot.  It is also worth noting that the single table de-normalized version is 1/10th the size of the file with multiple tables (3 megabytes compared to 30 megabytes).

The “Measures”

Once you have the data in place within PowerPivot you have to take the next step and create the Measures you will be working with.  Measures are the core of OLAP Cubes and are also the core of PowerPivot.  Measures are the aggregations that you will be analyzing based on the data in columns you wish to SUM or COUNT or AVERAGE (or whatever of the standard Excel Mathematical and Statistical functions you want to use).

The most basic example of a Measure is ours called “[Sum of SalesAmount]’, where it simply sums the [SalesAmount] column at every conceivable level of the Virtual OLAP Cube inside Excel,  created by PowerPivot.  In this case [SalesAmount] is simply a basic data column provided directly from the data source, but there is no reason why you can’t use DAX to create new Calculated Fields of calculated data and then create “Measures” based on those fields.

To create the aggregations you want in your spreadsheet you need to have created the “Measure” inside a Pivot Table in the Excel environment either within the “drag and drop interface or using the menus.  I created our Measure using the Pivot Table interface like this:

I started to create a Pivot Table from inside my PowerPivot environment (or in the Excel environment from the PowerPivot Tab), and then I dropped the [SalesAmount] field from my Field List into the Values section below.  By default the Pivot Table assigned this as a “SUM” function and it created the Measure [Sum of SalesAmount] automatically.  It is now available to the Workbook, not only to any Pivot Tables you might create, but also to any CUBE functions that you might want to use.

This does NOT mean you have to have a Pivot Table in your Workbook, BUT you do have to have started the process in order to create the Measure or Measures you want to refer to in your CUBE functions.   You can create the Measures from within the PowerPivot Pivot Table menu as well,  but using the Excel Pivot Table GUI this way strikes me as more intuitive and equally capable.  We can then rely on this Measure to aggregate the total “Sales Amount” at every level across any Dimension in our Virtual OLAP Cube.

The Spreadsheet:

So here’s the spreadsheet I want to feed from PowerPivot:

The CUBEVALUE() Function

In order to drive right to the heart of this functionality I am going to use just one of Excel’s Cube functions, CUBEVALUE in this model (but will show one more function later as well in an enhanced version).

So we are looking for the Sum of the Sales Amount ([SumofSalesAmount]) of Bikes ([Category]) for Fiscal Years 2006 ([FiscalYear]) for Australia ([Country]).  So this is the pseudo-code for this formula:

Sum the Sales Amount where Category = “Bikes” and Fiscal Year = 2006 and Country = “Australia”

Notice that the values are in the PowerPivot tables at the lowest level by Product, by City, by Date of the “Fact” Sales Amount.  Each value is shown as a member of a Week and Fiscal and a Calendar Month and a Fiscal and a Calendar Year.

So when we ask for values from the Fiscal Year, the Category and the Country fields we are inherently asking for aggregated totals ate the intersection of those three fields.  That is what you do when you look for values in an OLAP Cube.

This is the CUBEVALUE() function to get this value for “Bikes” in Australia for Fiscal 2006 :

=CUBEVALUE("PowerPivot Data","[Sum of SalesAmount]","[Country].[Australia]","[Category].[Bikes]","[FiscalYear].[2006]")

Which returns a total of $91,490,280.05.

Therefore to get the equivalent value for “Accessories” in Australia for 2005 the formula would be:

=CUBEVALUE("PowerPivot Data","[Sum of SalesAmount]","[Country].[Australia]","[Category].[Accessories]","[FiscalYear].[2005]")

And the result is $902,316.21.

If you have ever used a SUMIFS Function in Excel (or SUMIF or even IF functions) then this kind of formula is pretty straightforward IMHO.

The “Inherent” Connection

Notice that this function is a DIRECT reference to the “PowerPivot Data” Connection rather than referencing another cell that has a reference to the Connection?  If you were connecting to an external data source you would likely want to make a single cell the one that links to the data source and then you would “borrow” that link for all subsequent references to the Cube.  Otherwise you would have a HUGE performance hit.

But if the Connection is a “Virtual” Connection in PowerPivot inside the spreadsheet itself, then there is no such penalty.  So you can refer to the “PowerPivot Data” inherent connection in every formula. That is much easier to understand and to audit as well.

Adding the Power of Excel to the Mix

As a loooong-time spreadsheet user I have developed a fondness for the “Naming” capability of Excel.  By applying a “Name” to a cell or cells I can then refer to them everywhere in the Workbook without regards to the actual Sheer and Cell reference of the range.  This makes things more auditable and just generally more readable.

In addition, I believe in the use of cell references for Variables in formulae rather than hard-codes words and values.  To an experienced Spreadsheet developer this is simply “Best Practices”, but I still see many instances of “Hard-Coding” in Workbooks that make me cringe.  If the value will (or may) change it must be a cell reference or a named reference.  Of course ALL references could be named either as individual cells or arrays but I tend to use only single cell Variable names as it is likely easier for others to comprehend than using an multi-cell reference as an array in a formula (but there’s no reason nt to do so if you feel so inclined).

So by changing the View to “Formulas” in Excel these are the contents of the key cells and the first column of our little model:

 

Using Cell References and Names

The reference to the “PowerPivot Data” is located in cell B11 on the spreadsheet and is named in a way that would make it easy to reuse and to understand (“strConnection”).   This is the first expression in the CUBEVALUE() function.

The reference to the Field we want is in cell B12 and refers to the Measure we created [Sum of SalesAmount].  This is the second expression in the CUBEVALUE() function.

Referencing the Dimensions

Now that we have defined the Connection to the PowerPivot data and the definition of the Measure that we will be summing, we have to tell the CUBEVALUE() function the values we want to select for across the various Dimensions we want to analyze.

PowerPivot is capable of intuitively determining the Field (or Dimension) that the value you are looking for is from based on the contents and the data types of the fields in the OLAP cube.  This is way cool … but I am not quite ready to use this capability in my spreadsheets.  Instead I simply use a [Fieldname].[Value] syntax to tell the function what I am looking for.

So the criteria in cell B4 are going to be:

"[Country].[Australia]","[Category].[Bikes]","[FiscalYear].[2006]"

But I am using the contents of cells to drive my selections … so the correct syntax is:

"[Country].["&$B$5&"]","[Category].["&$B7&"]","["&$C$5&"].["&C$6&"]"

The result is exactly the same, but is driven by the contents of cells rather than being “hard-coded”.

There is no doubt that there are advanced CUBE functions that will draw the values in each Dimension automatically into the cells based on the contents of the data and also we can drive all of this using Slicers (as I will demonstrate later) but in many, many business scenarios the business is organized in a stable and consistent way. Month over month managers want to view certain values according to an organization that may or may not match their organization or groupings of the data coming in from outside sources.  So in our little example I am most concerned about Bikes, Accessories, Clothing and Components and I want all others to fall into a catch-all category called “Other”, and this total is calculated in cell B7 by taking the total for Australia’s sales for 2005 and subtracting that from the SUM of the three Categories looked at distinctly above:

=B8-SUM(B4:B6)

The ISERROR function is necessary in case there is no data for the combination of dimension values you chose and returns a zero rather than an error.

So what we have here is a simple spreadsheet model that uses the cache of PowerPivot data by drawing on the CUBEVALUE() function to query the OLAP cube inside the PowerPivot cache and to return the totals in the database at the intersect of all the values in the Dimensions my spreadsheet asked for.

This is serious spreadsheeting …..

Adding Slicers to the Mix

So let’s take our same example and add some of the new Slicer functionality of Excel and PowerPivot.

Rather than setting up a separate spreadsheet for every Country the company does business with I have created a Slicer that is pulling a unique list of all the values from the Country Dimension.  This Slicer is then tied to cell B5 and not only shows the name of the Country but also drives the formulas that use the [Country] Dimension of the PowerPivot OLAP cube.

So cell B5 has this formula in it:

=CUBERANKEDMEMBER("PowerPivot Data",Slicer_Country,1)

Which is referencing the value of the Slicer named “Slicer_Country”.

Cell C6 has this formula:

=CUBERANKEDMEMBER("PowerPivot Data",Slicer_CalendarYear,1)

It is of course referencing the slicer that is driving the Calendar year.  Cell D6 is just the contents of C6 minus 1 (which could also be derived using a CUBE Function as well – but this works for me).

In order to get the totals for all countries one need only click on the Funnel in the upper right of the Slicer.

Conclusion

There is no doubt that even in this little example of a spreadsheet there are a lot of moving parts, but if you understand how this model works you will be well on your way to integrating PowerPivot into many, many of even your existing Excel spreadsheets.  Or at least this will definitely change how you design future ones.   I have spent years emulating this functionality using VBA and DAO and ADO and customized functions for which I have sometimes been handsomely paid.  But this capability of PowerPivot moves the bar way forward and will allow you and me to finally start seriously integrating Excel into the world of serious BI analysis.

While there is a lot of talk about DAX and the Filtering capabilities of PowerPivot I think that I will be applying PowerPivot to many of my spreadsheets in the way detailed here.   With an understanding of the Relational nature of your data and with a strong knowledge of the capabilities (and flexibility) of Excel I believe that many of you will be able to bring value to your use of PowerPivot in Excel 2010 in short order using the CUBE functions against the inherent OLAP cube that is a PowerPivot data cache.

Dick Moffat

London, Ontario

June 19, 2010

dick@plogic.ca


TechEd / BI Conference Slides

June 17, 2010

I’ve received requests for the slide deck that I presented with Dave Wickert at Microsoft TechEd / BI Conference last week.

Well, here ya go, the whole slide deck right here:

PowerPivot Best Practices from TechEd

Yeah, that’s right.  One slide.

You see, it was an “Interactive” session, which is basically “we jam people in a room and they pepper us with questions for 90 minutes.”  The ground rules given to us were “1-2 slides to set the tone, and then after that it’s all audience-directed.”

Going in, I had mixed feelings about that format.  I mean, there’s a real danger that people won’t be prepared to ask questions, and that the conversation won’t get going.  In fact, in many other Interactive sessions, the speakers decided to ignore the ground rules altogether and simply present.  But it seemed like a potentially entertaining approach, and hey, Dave and I were so busy going in that this minimal preparation thing sounded pretty damn good.  So we rolled the dice.

Well, it turned out EXTREMELY well.  We started 15 minutes early, ran over the end time, and were answering fantastic, high quality questions non stop.  It was awesome.

The bad news is, I don’t have a good way to share that discussion after the fact.  We basically would need to do it again, and sadly, they did not record the session.

So for now, all I can really do is offer the teaser above.