What if I told you there was one key skill that has set me head and shoulders above the Excel crowd when working with Power Pivot and Power BI? Would you be interested in learning that skill? Great, because I was about to teach it to you anyway.

Pay It Forward

But first a story. I had discovered Power Pivot (code-named Gemini back then) while working in the Microsoft Commerce group, but I really came into my element at my next gig, the Microsoft Learning team (read about my Power Pivot Journey). There I met my colleague Gregory Weber (known as Greg). It’s hard not to be impressed by Greg. He was a Microsoft certified instructor and taught courses for many years, both within the US and internationally. He has the trainer’s voice and a fluidity with words, along with the ability to deconstruct complex topics and make them easy to understand. He can talk about pretty much any topic under the sun, technical or otherwise. Once you get to know him better, you’ll find out that he enjoys building and programming robots in his spare time (how’s that for a hobby?). Lately he has been pursuing a degree at Georgia Tech, while also helping to teach some of the courses there. Besides all of this, he is a genuinely nice guy.

It was Greg who suggested that we form a book reading club. Our first book was one I had heard about but never read – The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by Ralph Kimball. As a business analyst, I had leaned heavily on Excel, along with a mishmash of other technologies. Data warehousing and data modeling didn’t seem like topics that would be relevant to me – more for an IT/BI team, perhaps. But I figured it couldn’t hurt to learn something new.

Our book club meetings looked more as if class was in session. We brought in our questions, and Greg patiently answered them, helping us realize the importance of the topics and the trade-offs involved in various choices. As things go, our reading club was disbanded before we were even halfway through the book. But the knowledge I had gained helped me grow by leaps and bounds in my Power Pivot and Power BI journey.

Now the truth is, I would never have picked up this book and read it myself (much less understood it). It was only due to a teacher like Greg that I was able to internalize some of the concepts. So this article is my humble attempt to pass on the knowledge (hopefully without mangling it). I’m just paying it forward.

Lost Chapter From Our New Book: Power Pivot and Power BI

The article below was actually submitted as a chapter in our new book. However, under Rob’s keen eye, we realized that most of these topics were already woven in as best practices throughout the other chapters. While it didn’t make it into the book, we still feel it would be of value to our readers – hence this article.

If you were reading the book, this article would perhaps fit best as Chapter 19A (after Chapters 17/18/19 have covered Multiple Data Tables and Performance). You can get our book at Amazon or MrExcel.

Data Modeling – for Scalability and Usability
    Scaling to Millions of Rows
    Scaling to Multiple Data Sets
        Lookup Tables – the Who, What, Where, When, How
        Matrix of Lookup and Data Tables
        Crazy Lookups and Snowflake: Too Much of a Good Thing
    Usability

Data Modeling – for Scalability and Usability

Scaling to Millions of Rows

So you’ve read our advice that

• Wide Flat Tables = Poor Performance
• Narrow Data Tables with separate Lookup Tables (Star Schema) = Improved Performance

But you say to yourself, the performance of my model is just fine. Thank you, thank you very much. (You did say that the Elvis way, right?) That is the case for most new users adopting Power Pivot. They are working with relatively small models, and no matter how those models are structured, performance is reasonably good – especially when compared to all the effort it took (before Power Pivot) to build similar reports.

However, this honeymoon period is not going to last forever. Sooner or later you will end up working on a data model large enough that all these lessons become important. It is easier to start off following the best practices than to undo your work and unlearn bad habits later.

I won’t elaborate too much on this here; we’ve covered this ground already in the Performance chapter (Ch 19). Instead, I want to speak about other reasons why a good Data Model is critical to your success.

Scaling to Multiple Data Sets

So you are a new Power Pivot user and start your journey with a small dataset – say the Sales data that we started with, or something similar. At this point, for a small dataset, it doesn’t matter much whether you keep separate Data and Lookup tables (Star Schema) or a single flat-and-wide table. Since it is a small dataset, performance will not vary noticeably, and you can write pretty much any measure you need in either design.


With a single small dataset, there is not much difference, performance-wise,
between the two designs – Star Schema and Flat-and-Wide
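For example, a simple measure is written the same way in either design. Here is a minimal sketch (the SalesAmt column name is just illustrative, not necessarily what your source data uses):

    Total Sales := SUM ( Sales[SalesAmt] )

Whether Sales is a narrow data table surrounded by lookup tables or a single flat-and-wide table, this measure looks and behaves identically. The differences only start to matter as the model grows.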

It is always best to organize your lookup tables near the top and the data table near the bottom. Consider the above Star-Schema screenshot for illustration only.

Now think about what happens when you bring in another dataset. If we are using the flat design, we end up with two flat data tables – Sales and Service Calls in our case. That works as long as you analyze these two datasets separately.


Two Flat-and-Wide tables: Sales and Service Calls


Can create separate pivots to analyze the datasets separately…

However, often the magic happens when you can show measures from two different data sets side by side in the same pivot, and slice and dice them together. Or even better – write hybrid measures across the two datasets.


Analyzing disparate datasets together is the holy grail for analysts

But if you have two flat tables, how would you ever be able to accomplish this? This is a big moment, so I want you to pause and think about it.

 

<<This page intentionally left blank to give you a moment of cogitation>>

 

The answer is obvious, and we have already implemented this design in our model and exercises earlier. We need a common lookup table.


Common Lookup table would let us connect and analyze the datasets together
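To make that concrete, here is a minimal sketch of the kind of hybrid measure a common lookup table enables. The column names below (and Customer as the shared lookup) are illustrative assumptions rather than the exact names from the book’s sample files:

    Total Sales := SUM ( Sales[SalesAmt] )
    Total Service Calls := COUNTROWS ( 'Service Calls' )
    Service Calls per Sales Dollar :=
        DIVIDE ( [Total Service Calls], [Total Sales] )

Because both data tables relate to the same Customer lookup table, putting any Customer column on rows or on a slicer filters all three measures consistently in a single pivot.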

Lookup Tables – the Who, What, Where, When, How

And not just one – you need a few. Often, when I am working with a dataset and thinking about the lookup tables I might need, I run through the Who, What, Where, When, How. For example, for our Sales data, I might answer those questions as below and attempt to create the Lookup tables accordingly.


Who, What, Where, When, How – would become our Lookup Tables


Both data tables connected to all relevant lookup tables – ready for action!

Diagram View with Multiple Data/Lookup Tables: If you do end up with multiple data and lookup tables, your Diagram View might start to look like a spaghetti chart or a complicated integrated circuit diagram. We would still recommend that you keep your Lookup tables near the top and Data tables down below – to relate to our analogy of “relationships flow downhill”. Some people complain that they cannot see the related tables easily this way. However, understanding your data and lookup tables, and improving your discipline in using them that way, should override that concern.

If you do end up with more tables than you can keep track of, consider creating Perspectives to keep them organized for yourself and for your users. Learn more at http://ppvt.pro/groupTables

Note that you can have multiple “Who”s, multiple “When”s, and so on.

For example, for our sales transactions, we might have:

Who: Who placed the order? Whose credit card was used to pay for the transaction? Who was it shipped to?
When: When was the order placed? When was it shipped? When did it arrive? When was the customer invoiced? When did the customer finally pay?

You can read about how to handle such cases – where there could potentially be multiple relationships between two tables – in our chapter on “Complicated” Relationships (Ch 22).
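As a preview of that pattern, one common approach in Power Pivot is to keep a single active relationship (say, Order Date to the Calendar lookup) and define the other date relationships as inactive, activating them inside specific measures with USERELATIONSHIP. A minimal sketch, with hypothetical column names:

    Total Sales := SUM ( Sales[SalesAmt] )
    Total Sales by Ship Date :=
        CALCULATE ( [Total Sales], USERELATIONSHIP ( Sales[ShipDate], Calendar[Date] ) )

The first measure follows the active Order Date relationship; the second temporarily switches the filter path to the inactive Ship Date relationship.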

Matrix of Lookup and Data Tables

As your model evolves and you pull in additional datasets, these should get plugged into Lookup tables in a matrix-like fashion (the rectangular array, not the movie; well, maybe a bit like the movie 🙂).


Data sets should connect to a common set of Lookup tables

Even if the number of data tables in your model grows at a prolific rate, the lookup tables should remain a fairly stable set. Your lookup tables typically represent the core entities in your business – like Customer, Product, etc. – to which many data sets hook up. On occasion, as you bring in new datasets (which typically represent business processes), you may need to add new lookup tables as well. But that exercise should be fairly infrequent.

Not all data tables will connect to all lookup tables – only to the ones that apply to the specific dataset. That is perfectly all right; in fact, it is to be expected. For instance, in the example above, Budget does not connect to Customer or Employee, since Budget is not set at the Customer or Employee level (at least in this example scenario).

When using measures across two (or more) datasets in the same pivot (or when writing hybrid measures), just be aware that only the common lookup tables will filter both datasets. You may get invalid results if you use a lookup table that is connected to only one of the two data tables. See the example below; this is also covered in detail in Chapter 17, Multiple Data Tables, in the section ‘Multiple Data Table Gotchas’.

 


Budget data table is not connected to Customer Lookup Table,
thus you would get invalid results if you try to use them together
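One defensive trick worth knowing: a measure can check whether an unrelated lookup table is filtering the pivot and return BLANK instead of a misleadingly repeated total. A minimal sketch, with hypothetical column names:

    Total Budget := SUM ( Budget[BudgetAmt] )
    Total Budget Guarded :=
        IF ( ISCROSSFILTERED ( Customer ), BLANK (), [Total Budget] )

Since Budget has no relationship to Customer, any Customer field on rows or slicers would otherwise simply repeat the grand total on every row; the guarded version blanks out instead, making the disconnect obvious.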

Crazy Lookups and Snowflake: Too Much of a Good Thing

While Lookup tables are good, there can be too much of a good thing. I have seen client models where everything is a lookup table and the Diagram View looks like a mutant octopus.


Making every little thing into a lookup table is not good data modeling

Or models where we have cascading Lookup tables (also called a Snowflake schema).


Lookup Tables have Lookup Tables? That may not be ideal

Neither extreme is considered good design, for performance, scalability, and usability reasons (discussed next). Your lookup tables should represent something tangible and easily understood – e.g., Product, Purchase Order.

Consider all of this a guideline, a best practice. We’re not too squeamish about breaking these rules ourselves. Just be aware when you are not following the guidelines, understand why you are not doing so, and account for the possible implications.

Usability

A Star Schema (separate Data and Lookup tables) will typically offer you the best performance and allow you to scale to millions of rows and multiple data sets. But another important consideration is usability. Even for the simplest scenario, where we have only one dataset – which of the field lists below would you rather use day-to-day to build new reports and analysis?


Star Schema would be much more user friendly once you start building reports

More importantly, which field list would you want the other users who connect to your data model to be using? A Star Schema groups items into logical entities that are easier to understand – not just for you, but for all the users who will end up connecting to your data model and creating reports and analysis against it.

Think you are just creating a quick analysis model for yourself? Power Pivot is so powerful that you will often create models that last longer than you ever expect and get used by more people than you ever imagined. The model I built at Microsoft started as a simple model for my own use, but it grew over time, went on to be used by hundreds of users, and is still in operation long after I left Microsoft.

When we parachute into an organization for our typical 2-3 day consulting engagement, I still create each data model like it’s going to be the best thing since sliced bread for them. Dare I say, it often is 🙂

So dream big and aim high. Put some thought into designing your data model. It only takes a little more work and pays huge dividends in the long run.

To learn more Power Pivot and Power BI best practices and tips & tricks, get our bestselling book, Power Pivot and Power BI, at Amazon or MrExcel. A big thanks to Greg Weber at Microsoft for teaching me all I learnt about data modeling.

Power On!


Avi Singh

Avi Singh has personally experienced the transformation and empowerment that Power BI can bring – going from an Excel user to building large-scale Power BI solutions. His mission now is to share that knowledge about Power Pivot and Power BI.

This Post Has 33 Comments

  1. With big data models the Power Pivot / Power BI mapping area can get very crowded. I know there is the zoom in/out feature, but it’s still messy… is there an easier way to handle this?

    1. David, the best way to handle this is to create Perspectives. I have worked with Models approaching 100 tables. Creating different perspectives is the only way to keep yourself sane.

      1. Perspectives… I thought they were more targeted at the end consumer of the data model in their client application, to make their life easier, not at the developer in the Power Pivot back-end data model development area…?
        But I’ll check them out all the same.
        Thanks, David

  2. Nice article, but the missing piece is the detail on how you actually create the lookup tables. Still reading the new book and still reading some blogs, but still hazy on the subject.

    1. Short version: Power Query can be used to create lookup tables, particularly with its Remove Duplicates function. And as you suspected, yes, this is covered in a chapter of the new book that you have not yet reached 🙂

  3. For our organization we can’t assume that all tables, data or lookup, are coming from a source where we can use SQL to design a better star (or snowflake). Power Pivot is valuable to us because we can pull together fragmented bits of operational data from legacy systems and link them with Power Pivot’s chewing gum and baling wire.

    As Rob notes, Power Query is vital to this effort. Unfortunately it’s hard to find the time (and brain cells) to get to be good in both Power Query and Power Pivot. At least in Power Pivot we’re always operating against an Excel paradigm, but for Power Query you’re cobbling together a unique ETL solution with every legacy data source you want to incorporate. I’d turn it over to the data warehouse gurus if they weren’t already busy with their responsibilities! I appreciated the much-improved chapter 22 in the new edition because I can’t avoid having to make filters run “uphill”. Without serious chops in Power Query however I have to traverse multiple uphill and downhill relationships to make connections, usually with a few calculated columns helping out in the intermediate tables. I’m hoping there’s a better way once I get as familiar with the tools as Avi and Rob, but I’m going to need a few more books from them!

    1. GMF, have you gotten around to using Power BI Desktop? That has built-in support for “bi-directional” relationships (downhill and uphill filtering), which could be really handy in your cases. In my experience, cases where I needed that have been somewhat rare, so I haven’t been big on it.

  4. >>Think you are just creating a quick analysis model for yourself? Power Pivot is so powerful that you would often create models that last longer than you ever expect, be used by more people than you ever imagined.

    THIS. +1000.

    I use a certain model that was originally intended to solve one particular business problem, but that has Borg-ishly evolved into a much larger entity that consumes data from a variety of sources to answer a boatload of different business questions daily. In hindsight, I would’ve done things a LOT differently upfront had I even considered it would eventually grow the way it did.

    1. Justin, believe me, that lesson was learnt the hard way – same as you have. My first model started off as a garage project and went on to be used by 600+ users. Unfortunately, some of my early work had things like “Calculated Column1” or “Measure1” which I could not take out since they were being used in reports. Ugh! Nasty. Of course I know better now… even in the training classes we conduct, I don’t let my students leave names like “Calculated Column1”.

  5. Great article Avi. The more detailed explanation of lookup tables with data tables is just what I needed. This section of the book is dog-eared for me. You guys are awesome.

  6. Hi Rob,

    Thank you for an outstanding book. I currently use heavily formatted sheets for reporting purposes, so historically I have used tables for input instead of pivot tables or charts. I am also aware of Power View and Power BI, but for some of our reporting we need more flexibility. I know that I can turn my pivot table into cube formulas to move around, but it seems that these formulas are oftentimes hard to understand intuitively. What do you say to someone who wants the benefit of a data model, but finds referencing pivot tables cumbersome?

      1. Hrm, you don’t like pivots or cube formulas as output eh? What would be ideal, in your eyes? Structured References for PivotTables would be ideal. I posted some pretty robust code at Chandoo’s blog a while back that should fit the bill:
        http://chandoo.org/wp/2014/10/18/introducing-structured-references-for-pivottables/

        I’m in the process of turning this (and a lot more besides) into a commercial add-in that I’ll be launching with my book. The book isn’t specifically targeted at the PowerPivot end of the spectrum, but a lot of the tools I cover are just as helpful on PowerPivot PivotTables as they are on traditional ones. Including a revised implementation of Slicers that will also let you do all sorts of crazy WildCard filtering against Pivots that you simply can’t do out of the box. I even have someone at MS using this on their huge OLAP pivots, because the native filter takes upwards of a minute just to populate on their massive dataset, and half the time disappears as soon as it finally comes up.

        Sneak preview at http://dailydoseofexcel.com/archives/2015/11/17/filtering-pivottables-with-vba-deselect-slicers-first/

          1. Part of the issue is that whatever solution I come up with is to be shared among my colleagues, so for troubleshooting it can’t be too complicated. The Cube formulas are also lacking for pivot tables that are not fixed in size. I’m still trying to think of a solution, and if you feel there is something more appropriate, or feel that my view is short-sighted, then I’d love to hear it. So far, I’ve been using Power Query to pivot some of the data (I know, I know), along with custom formulas in M to steer the transformations, but I’m worried that this will be too slow, and I’ve really had to control the data coming in. One major issue is that all of my data is coming from .csv files instead of a database. Thanks for the feedback.

          1. Basically I use the Structured PivotTable References approach when I need a report that doesn’t look like a PivotTable. I put all the source PivotTables I need in a hidden sheet (and sometimes I make up quite a few PivotTables that contain the various extracts/combinations that I need), and then just reference the Structured PivotTable References that my code autogenerates. These get updated on refresh automatically in the blink of an eye, and this allows me to do pretty much what I want with that PivotTable output. Seems pretty bulletproof to date, just like the inbuilt Structured Table References are. But it all depends on what you are trying to achieve.

            Granted, I use these references to overcome the limitations of the non-OLAP GETPIVOTDATA function, and I understand that you PowerPIvot folks have much more that you can do with CUBE functions. But I also imagine that those CUBE functions might not be as transparent as say a simple view of a PivotTable with a clever dynamic named range pointing at the bits you need.

            Feel free to reach out to me at [email protected] if you need help with my code, or if it doesn’t play nicely with PowerPivot Pivots.

    1. Jeff, I feel your pain too. You convert the OLAP pivot to formulas, then you move the formulas around and you forget which measures apply to which formulas and which filters drive those formulas…

      What I do is create the pivot tables twice and keep a master copy as an original pivot table in a ‘reference’ tab. The other copy I convert to formulas and move around, as you say, with full flexibility, and I insert comments to remind me which original pivot table each formula relates to. This helps when you need to recall which pivots/measures/dimensions the formulas originate from, and which filters were targeted at those formulas, by simply going back to the reference tab to find the original pivot. I find it works.

      Of course, this solution is without opening a chapter on the benefits of Power BI, or even the option of moving the Power Pivot data model into SSAS for flexible SSRS reporting…

      1. Very good points as well. Power BI might work if I could print off a 20-page report like I can in Excel, but then again, I need more flexibility in the design and formatting. As for the other solutions, I unfortunately don’t have the resources for either option. Thanks for commenting.

        1. I would use GETPIVOTDATA referring to parameter cells so everything updates automatically when you adjust the parameters. If the dataset isn’t too big you could also use SUMIFS with structured references to a table. Both approaches usually work fine for me.

  7. Excellent post, guys, and this book sounds like exactly what I’ve been looking for. I’ve had a play (and been on the MS course) with Power Query, Power Pivot, Power View and Power BI, and I am wanting to venture further down this road as it’s the future. Book ordered. Looking forward to its arrival.

    PS One problem with our organisation currently is that we’re still on Excel 2010 (except me and a few others), so I’m assuming a meeting with the Head of IT to get a company-wide rollout of Office 2013, or better still 2016, is on the cards – a must if we are to encourage self-service BI. Agree?

    1. Jon, enjoy the book when it arrives 🙂 Do leave a review on Amazon when you have read enough.
      On Excel 2010/Excel 2013/Excel 2016
      My take is – who knows how long it would take to deploy Excel 2013.
      Power Pivot & Power Query with Excel 2010 can still do serious damage. Check out some of Rob’s early posts (Excel 2010 era) and hear some of his stories. He was saving companies millions of dollars with Excel 2010+Power Pivot.
      So I say, let’s start the engines! Go as hard and as fast as you can with Excel 2010. If you do start hitting roadblocks that can only be resolved with Excel 2013/2016, then by virtue of all the good work you will have done, you should have tremendous support behind you for those moves.
      By all means, have the meeting with the IT head. But don’t hold your breath till IT rolls out 2013; start the good work now with Excel 2010.

      Power On!

      1. I find that the issue isn’t so much which version of Excel (2010/2013/2016) as the fight to get a 64-bit install on your machine. Without fail, you’ll run into the 32-bit limitation as soon as you try using real-world data. (Your time learning in 32-bit with example datasets like AdventureWorks, etc. isn’t wasted, though.)

        PowerBI Desktop has been very useful in this regard, as the 64-bit version will install without caring about the “bit-ness” of your Microsoft Office.
        (The one exception I’ve found is that it can’t read MS-Access .accdb files. There’s still some link that causes the 32-bit driver to be invoked… come on Microsoft! Any work-around suggested to date is kludgy, at best; and I’d never ask someone else to use those approaches on a recurring basis.)

    2. Disagree. It is possible to encourage self-service BI without using any version of Excel. This can be done with just PowerBI.com or with Power BI Desktop and PowerBI.com

  8. When working on very large data models with many tables/sources, I’ve often thought that it would be nice to be able color code the tables based on source (blue=SQL, red=Marketing’s SharePoint list, purple=Bob’s Excel file, etc.). I’ve never been able to figure this out and have even gone so far as to take screen shots of the model and drop them into PowerPoint and add colors there for a quick-reference guide. Anyone have any ideas on how it might be possible to do this natively in the PowerPivot window?

  9. When starting with data that is a “Frankentable” (lookup information and data information in one, massive table), is it best to clean and filter my data first, and then use the “Reference” option to create my lookup tables by branching out from my filtered data table?

    For example: I have “Equipment Manufacturer” as a column on my Frankentable. The report I am creating is only relevant to one manufacturer.

    The “old” technique would be to query my source (.csv) for the data table, filter the manufacturer, delete the lookup columns, and then query the same source for the lookup table, filter the manufacturer, and keep ONLY the lookup columns and then remove duplicates.

    Would my query run faster if I imported my source, filtered the manufacturer, referenced that query for the lookup table query, and then deleted the unnecessary columns in the lookup and data tables?

    Thanks, and let me know if I am being unclear.

  10. Please tell me it is possible to get the book “Power Pivot and Power BI” for free. I need the book without money because I am a student…
