skip to Main Content

Ok, in the last football post, I had written the following measure:

          PowerPivot DAX Calculate Zoomed

Which basically means, “sum up the [Yards] column but only those rows with [PlayTypeName] = “RUN-run” “

Now we get to find out if it works 🙂

I need some real-world data that I can validate against.  Out on NFL.com I can find historical player stats.  Let’s look at one of my favorite players from recent years, Priest Holmes.  He amassed a lot of Rushing Yards and should be a good sample.  Here are his statistics by Game for 2004:

          Priest Holmes 2004 Rush Yards

OK, so let’s make a PivotTable that just shows Priest’s 2004 Rush Yards by Game, and compare that to the Rush Yards column from above:

         PowerPivot Says Same Numbers as NFL          Priest Holmes 2004 Rush Yards Zoomed

                           My Pivot Table                                      From NFL.com

Bingo!

“You know sometimes I even impress myself?”

-Han Solo

An exact match!  That might seem a bit bland to you folks out there, but to me, it is VERY exciting.  I’ve got 40+ tables, a Plays table that is loaded with crazy complexity, no database training whatsoever, and yet…  I now have a measure that agrees, DOWN TO THE YARD, with NFL.com!

Wahoo!  In all honesty, when I started this project, I had no idea how far I could go.  I intentionally chose something that *might* defeat me.

I’m feeling pretty darn optimistic now, though 🙂

I’m suspicious, however…

I recall that my Plays table contains multiple rows per play, so that, for instance, it can capture data about the *defensive* players involved in a play:

Single Rushing Play

Right there, a single play, three rows.  One for the runner himself, one for the player who tackled him, and another for the player who assisted the tackle.  That will come in handy when I get started looking at defensive stats.

But for now, I suspect that means defensive players are getting credited with rushing yards, too.  When I filter to Antoine Winfield, the Assister from above, my PivotTable confirms my suspicions:

         Antoine Winfield Rushing Yards

What we are seeing there is the total net yards of running plays in which Antoine Walker was involved in tackling the runner.  I don’t want that.

But all I have to do, then, is add another clause to my CALCULATE function.

Calculate Function Fixed

And now the pivot table shows:

Antoine Winfield After Fix    Priest Holmes After Fix

Priest Holmes is unchanged.  Antoine Winfield now has no rushing yards.  Perfect!

CALCULATE is good.  CALCULATE is your friend.

“OK Rob, a whole post just so you can make ONE change??”

Yes.  I spent more time than usual on this.  Here’s why:  Everything above took me less than 10 minutes in real time.  It took far longer to capture the screen shots than simply to blaze through it.

But when I worked with a BI consultant, a few years back, the same exact iteration took about a week.

PowerPivot Compared to Traditional BI Development

How do we account for the difference?  Is it because the consultant produced better results?  Was the Rushing Yards measure more accurate or robust than mine?  Was it somehow more formalized, more robust?  Or was the consultant not very good?

The answers are no, no, no, and an emphatic “no.”  (The consultant was fabulous, akin to godlike.  There’s that word again – “akin.”  Why do I keep saying that?)

The real difference, as I’ve said before, is that with PowerPivot, the “modeler” and the “business user” are the same person.  I’m the one writing the expressions, and the one who knows the most about what I want, because I know the “business” (football, in this case) inside and out.  Iteration in one person’s head is blazingly fast.

  1. Finding the comparison stats on NFL.com – Only the business user knows where the best validation data sets are.
  2. Creating the pivot for comparison purposes – even just choosing my player for comparison reflected business knowledge. 
  3. Realizing that I likely was incorrectly counting defensive players in my measure – because I had access to the source tables and the “business rules” in my head, I spotted this problem before it ever made it into a report.

Seriously, this was like a trip down memory lane, on hyper fast forward.  I’d get a cube from the consultant, build some pivots, see that things were not accurately reflecting football rules, point out the problem, wait for the next version, repeat.

Not anymore 🙂

“Are you saying we don’t need BI Pros anymore?”

No, I am NOT saying that.  Even in this football example, I am cheating.  The original data from STATS arrived as a jumble of text files.  The schema was terrible.  Many of the required attributes (like the score of the game on a particular play) were completely missing.

There was a TON of work, done by the BI professional, to get from that horrible mess to the 40+ tables that I am working with now in PowerPivot.  And there is no way, no way at all, that I could do that myself – not back then and not today.

In terms that an MS BI Pro understands, the Integration Services work remains.  In fact, it becomes even more important, since you need to make the resulting schema not just work for people like you, but also for people like me 🙂

But the Analysis Services work – you can start sharing that with the Excel business users.

Next Football Post:  The Horrors of Bad Data Sources >>

Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

This Post Has 7 Comments
  1. Rob,

    I wish I could remember why it took a week… But still thanks for saving the consultant’s face…

    And I agree that even the analysis of well defined problems like NFL data may take time when need to be analyzed professionally through the eye of the expert of the data domain who lacks the skill to build the infrastructure required such analysis.

    1. Heh heh, it had nothing to do with you Akin – you were superb 🙂

      I think it’s just the nature of the beast. Arcane business football rules are hard to communicate even to myself, much less to the uninitiated 🙂

    2. It took a week because it took that long to extract the information from your head in a form that the analyst could understand and translate into computer form. Monday you realized you had a problem, so you contacted the analyst and set up a meeting for late that day or next morning.
      By lunch Tuesday the analyst had first draft of revised requirements. He went away, chewed on the information for the rest of the day. Then contacted you to meet again on Wednesday.
      Wednesday you ironed out some minor miscommunications and agreed you were on the same page. He went away and started working the required changes.
      Thursday the changes were finished, then he had to create test data to confirm it worked to requirements.
      Late Thursday or early Friday you got to see the fix
      There is your week. Maybe the timeline was a compressed a little to allow an earlier trial run that identified problems that the analyst had to go back and fix for a second round of testing.

      I like to call this the “Dental Extraction” method of analysis and design. It is a long, slow, drawn out process that hurts like heck. No one enjoys it, but it has to be done because it eventually gets the job done …

      Sure when you compare it to you playing both rolls it is a long and drawn out process. Sure you doing it all is the idea situation that MS is aiming for. BUT it is not reality!

      You (even 10 years ago) do not represent the average business user! You have an IT background. You have indepth knowledge of the tool.

      The average business user knows the business! That is what they are paid to do, not learn Excel and PowerPivot to an expert, or at least advanced amateur level. Maybe it will come to be, over the next 10 to 20 years, as more children who grew up with computers in their hands since before grade school make it in to the workforce and management. Think of the situation a hundred years ago. Business experts did not type their own documents. They had secretaries to do it. Why, because their time was better spent creating the content than it was in documenting it. Now “everyone” born AFTER the baby boom knows how to type, so more people do their own emails, documents, and simple spreadsheets. They have been using simple computers most of their lives, so they can now do at least simple tasks that used to require “experts” like secretaries and low level IT people to do.

      My whole IT career has been shadowed by vendors selling programmerless computer programming. And managers who worked under the assumption that IT worker skills were universal. That IT workers are totally interchangeable, that could be moved around like leggos rather than puzzle pieces that may fit in more than one place.

      Another comparison is Website building. 10-20 years ago it took an “expert” to setup and run a website. Now ISP’s are selling DIY website building using templates and simple “drag and drop” tools. Sites that are running on servers hosted by the ISP. The result is simple sites, created by “average” or slightly above average users that may not follow all the “good design rules”, but that are enough to generate revenue for “mom & pop” home businesses.

Leave a Comment or Question