No more messing around, let’s dive in.
Rather than boiling the ocean up front, let’s set a modest goal for getting started: I want to end up with a report that shows passing yards by player. (PowerPivot lends itself quite well to this approach – no need to build an all-encompassing model up-front like with traditional BI tools. This is more like… Excel.)
First, I launch Excel 2010 (with the PowerPivot addin installed), and using the PowerPivot ribbon tab, I open the PowerPivot window:
(I’m not yet set up to do videos, but will be soon. For now, let’s make do with pics).
Then, since my football tables live in a SQL db, I use the “From Database” button, type in my server and db name, and end up here:
As I showed yesterday, there are 40+ tables in there. Some of which I probably won’t ever need. And I certainly don’t need them all in order to deliver my first report.
But it’s just so much easier to grab all of the tables now. That way I can browse them all, look at the data, and decide which tables I need for the report in a hands-on manner.
So I just click the little “select all” checkbox at the top-left. Great feature… that I argued against for awhile. I was wrong. I use it all the time. Amir and Olivier, allow me to apologize publicly 🙂
Hit Finish, and I wind up with this:
A few things to note:
- I’ve maximized the PowerPivot window. Don’t worry, I can switch back and forth to Excel as much as I want, without closing this window.
- There is one “sheet” per tab per table from the database. Gives Excel users a comfortable way to browse the data.
- I am actually seeing the data, just like in Excel. No abstract schema browsers here. We get real data, all of it, to work with.
- I have expanded the sheet navigation “overflow” popup. Given that PowerPivot will be working with a lot more “sheets,” on average, than Excel, it needed a revamped navigation concept. (We stole this from OneNote – thanks folks!)
All good. But there’s a problem here, too, that you may have noticed.
I have dirty players!
No, I’m not talking about Hines Ward. (Leave the guy alone, ok, he just “plays hard.”) I mean that my players table contains some bad data – players for which Stats Inc. did not record player names, and instead recorded numbers.
What do I do about this? The PowerPivot window itself is read-only – I can add formulas, which I’ll show later, but I cannot edit or delete rows.
The reason for this is the remarkable compression employed by PowerPivot – some db’s shrink to 5% original size. And that same compression is used, in-memory, to deliver faster query results at report and analysis time.
Jamming individual edits into that compressed structure, while maintaining lightning-fast query speed – well, let’s not go there. Outstanding compression and query perf. It’s a good tradeoff, especially given the other methods available to me.
OK, so how *can* I clean the Players table?
I have three options, which I will cover in the next post.