
No more messing around, let’s dive in. 

Rather than boiling the ocean up front, let’s set a modest goal for getting started:  I want to end up with a report that shows passing yards by player.  (PowerPivot lends itself quite well to this approach – no need to build an all-encompassing model up-front like with traditional BI tools.  This is more like…  Excel.)

First, I launch Excel 2010 (with the PowerPivot addin installed), and using the PowerPivot ribbon tab, I open the PowerPivot window:

[Screenshot: 01 PowerPivot Blank Workbook]

(I’m not yet set up to do videos, but will be soon.  For now, let’s make do with pics).

Then, since my football tables live in a SQL db, I use the “From Database” button, type in my server and db name, and end up here:

[Screenshot: 02 PowerPivot Select Tables]

As I showed yesterday, there are 40+ tables in there.  Some of which I probably won’t ever need.  And I certainly don’t need them all in order to deliver my first report.
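
(Side note: if you’d rather eyeball that table list on the SQL side first, the standard INFORMATION_SCHEMA views will hand it to you – the db name below is just a stand-in for whatever yours is called:)

    -- List the user tables in the football db ("FootballStats" is a stand-in name)
    SELECT TABLE_SCHEMA, TABLE_NAME
    FROM FootballStats.INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE'
    ORDER BY TABLE_NAME;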

But it’s just so much easier to grab all of the tables now.  That way I can browse them all, look at the data, and decide which tables I need for the report in a hands-on manner.

So I just click the little “select all” checkbox at the top-left.  Great feature…  that I argued against for a while.  I was wrong.  I use it all the time.  Amir and Olivier, allow me to apologize publicly 🙂

Hit Finish, and I wind up with this:

[Screenshot: 03 PowerPivot Import Done]

A few things to note:

  1. I’ve maximized the PowerPivot window.  Don’t worry, I can switch back and forth to Excel as much as I want, without closing this window.
  2. There is one “sheet” tab per table from the database.  Gives Excel users a comfortable way to browse the data.
  3. I am actually seeing the data, just like in Excel.  No abstract schema browsers here.  We get real data, all of it, to work with.
  4. I have expanded the sheet navigation “overflow” popup.  Given that PowerPivot will be working with a lot more “sheets,” on average, than Excel, it needed a revamped navigation concept.  (We stole this from OneNote – thanks folks!)

All good.  But there’s a problem here, too, that you may have noticed.

I have dirty players!

No, I’m not talking about Hines Ward.  (Leave the guy alone, ok, he just “plays hard.”)  I mean that my players table contains some bad data – players for which Stats Inc. did not record player names, and instead recorded numbers.
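
(For the curious, this is easy to spot on the SQL side too – a query along these lines flags any “name” that is really just a number.  The table and column names are my guesses at the schema, so treat it as a sketch:)

    -- Sketch: find player rows whose recorded "name" is actually numeric
    -- (Players / FirstName / LastName are guesses at the real schema)
    SELECT *
    FROM dbo.Players
    WHERE ISNUMERIC(FirstName) = 1
       OR ISNUMERIC(LastName) = 1;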

What do I do about this?  The PowerPivot window itself is read-only – I can add formulas, which I’ll show later, but I cannot edit or delete rows. 

Why Read-Only?

The reason for this is the remarkable compression employed by PowerPivot – some databases shrink to 5% of their original size.  And that same compression is used, in-memory, to deliver faster query results at report and analysis time.

Jamming individual edits into that compressed structure, while maintaining lightning-fast query speed – well, let’s not go there.  Outstanding compression and query perf in exchange for a read-only window – it’s a good tradeoff, especially given the other methods available to me.

OK, so how *can* I clean the Players table?

I have three options, which I will cover in the next post.
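
(Just to give a flavor of what “cleaning outside the PowerPivot window” can look like, here’s a rough sketch of a SQL view that simply hides the numeric “names” – same guessed-at schema as above, and only one possibility among several:)

    -- Sketch: a view over Players that filters out the rows with numeric "names"
    -- (again, table and column names are placeholders for the real schema)
    CREATE VIEW dbo.vwCleanPlayers AS
    SELECT *
    FROM dbo.Players
    WHERE ISNUMERIC(FirstName) = 0
      AND ISNUMERIC(LastName) = 0;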

Next Football Post


This Post Has 7 Comments
  1. Did I miss something? Aren’t you starting with a data source that was prepped for your OLAP cube? The one that took all that time and money to build that wasn’t supposed to be necessary with PowerPivot? Don’t get me wrong…I love where this is going. But there’s a ton of serious work to be done to get the data to this point, right?

  2. Hi Tom 🙂

    I’m being completely honest when I say that I am repeating the vast majority of the work that we paid for. OK, the data did arrive from Stats as text files rather than as a normalized db. So the consultant did load that into SQL, and added columns indicating the current score on each play, because oddly, Stats hadn’t done that for us.

    But that’s where the real work began. As I proceed, you will see that there is a ton of logical filtering to be done, as well as layer after layer of “business logic” (in this case, football logic) to be applied, before useful data comes out the other end.

    Take a close look at http://powerpivotpro.files.wordpress.com/2009/10/powerpivotfootballfacttables1.jpg, the FactEvent table specifically. There isn’t even a Receiving Yards column. There’s a Player Role column, and a Yards column. That’s how it came from Stats. So even the simplest numerical measures are missing until some interpretive logic is applied.

    Put another way: the consultant represented IT, and I the business unit. The consultant (IT) converted the text files into SQL with little or no guidance from me (the business unit). That is also a reasonable thing to expect in the average PowerPivot environment (except for mashup data, which I’ll mention shortly). Pretty much any end user reporting/analysis solution is going to struggle if the core business data is hard to get ahold of in a reasonable format.

    But at that point, there had been very little discussion of business logic. What followed was a painstaking process of me teaching the consultant all the rules of football, all of the stats-keeping procedures of football, the roles of players and teams on different types of plays, and how certain analytical stats (like Move Net) should be calculated. The consultant was a very bright guy, but the sheer iteration of it chewed up weeks of billable time. We’d talk, he’d go off for half a day, produce something, I’d realize I hadn’t explained completely (or he had not understood completely), so he’d go off for another half day. Rinse, repeat.

    I contend that sort of iteration comprises the majority of a cube project. In fact, that’s true of many software projects. Transferring in-depth knowledge from one expert to another takes time. And it was actually a two-way transfer – I was teaching him football, but he was educating me on the properties of a cube so that I could make informed tradeoffs.

    Now imagine how much faster that can be if all that iteration takes place in one brain. The business user knows the business logic, and they know the programming tool (Excel). Iterating inside your own head often doesn’t even take the form of verbal language – it’s instinctive, decisive, and can happen without you even noticing. I don’t think it’s a stretch to say that one head can be 100x faster than two once the subject matter is as deep as this stuff.

    That’s one of the reasons I’ve enjoyed working on PowerPivot so much. Yeah, it’s technology. But at its core, its value proposition sits squarely on an analysis of human and social factors. It’s tech that was built to fit human strengths and weaknesses, rather than tech that expects the humans to conform.
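
    (To make that FactEvent point concrete: even a stat as basic as receiving yards has to be derived, conceptually along the lines of the query below.  The key column and the role value are guesses at how Stats encodes things, so read it as a sketch rather than the real schema.)

      -- Sketch: deriving "Receiving Yards" from FactEvent's Player Role and Yards columns
      -- (PlayerID and the 'Receiver' role value are placeholders for the real encoding)
      SELECT PlayerID, SUM(Yards) AS ReceivingYards
      FROM dbo.FactEvent
      WHERE PlayerRole = 'Receiver'
      GROUP BY PlayerID;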

  3. Rob, so here I pop up once again! No I’m not a stalker 🙂

    Sheet navigation “overflow” popup? Must be new in CTP3…and a great idea. I’ve always wanted something like this in Excel, but with a hierarchy of workbook, worksheet, names, formulas, etc. Would be great for navigation and auditing…but I digress.

    The questions Tom posted are similar to those I’ve wondered about. I do think that an enormous amount of work has to be done to “prep” the data in a warehouse (or mart). Isn’t data warehousing something like 70-80% of a typical BI implementation project? So PowerPivot now increases this proportion to 95%, ’cause the other 20-30% that would normally be taken up by cube development is now reduced to 5% with PowerPivot!

  4. Ya, I love the overflow popup for sheet nav, but it’s not new in CTP3. Just resize your window small enough that you can’t see all the tables at once, and it should be there on the right.

    I didn’t say this in my other comment, since it was already a book, but here it is as crisply as I can say it:

    1) Core business data (like Sales) very much still should be warehoused in a disciplined manner. PowerPivot does nothing to change that – PowerPivot still benefits quite a bit from warehoused data, but also, it’s not going to be the only app consuming the warehouse. There are many other centralized applications that need the warehouse, and PowerPivot does not replace them.

    2) PowerPivot does, however, enable reporting and analysis that mashes core DW data together with other data sources that realistically would never make it into the DW. SharePoint lists. Excel workbooks. Hand-typed data. Web sites.

    For an example of this, see https://powerpivotpro.com/what-is-power-pivot/ where I showed a mashup of AdventureWorksDW data with weather data I got from the web. Core biz data from the warehouse blended with data that likely never makes it into the DW.

    The weather data itself is available at https://powerpivotpro.com/2009/10/28/us-meteorological-data/

    3) Not all biz logic will ever be captured in the DW. In my football example, very little biz logic exists in the source data – I have to add it. Enabling the biz users to take on adding that logic, and building reports over the top of it, is a very good thing.

    4) IT is now free to do even more DW work, to better serve the needs of the entire organization (for clean, standardized, core business data).

    5) If PowerPivot increases the percentage of time spent on DW by virtue of greatly shrinking another piece of the pie, I am happy to hear that. A good problem for us all to have 🙂

  5. Rob is absolutely right.

    A few points:

    1. PowerPivot is not a replacement for a data warehouse. Just like a cube is not. Organizations will need to continue to invest in producing good, clean integrated datasets and this is what the data warehouse is all about.

    2. PowerPivot is a great way to work with the data that originated in a data warehouse. It is a great way to explore and interact with the DW data. It allows you to easily enrich and add business logic to the data. It also offloads some of the processing load from the DW.

    3. Just as important, PowerPivot allows you to integrate other datasets with the DW data. This is critical as often the user needs to reference external datasets and getting those into the DW is organizationally and technically complicated and often undesirable.

    4. Lastly – using the IT monitoring tools of PowerPivot, the DW administrators can now identify useful datasets that users are often using and make an informed decision about adding those to the DW, closing the loop.

  6. Thank you both for the clarifications.

    A further comment. I can foresee a lot of folks using PowerPivot to avoid building traditional cubes, especially in smaller organizations. For example, it wouldn’t surprise me to see businesses building multiple PowerPivot models in lieu of what would be perspectives in SSAS, even though this approach could result in a lot of duplication of measures across models.

    1. Another way to look at it: A lot of PowerPivot apps will show up in places where “cube” means just a geometric shape.

      This is really the big promise of PowerPivot – letting BI reach everywhere, into places where there was no awareness of BI tech, no budget to do traditional BI, and no time to spin up an official project.

      All of these new PowerPivot apps are green shoots for new corporate BI applications as they grow, evolve, and become more strategic.
