skip to Main Content

So you are getting a data warehouse for Power BI banner

Data Warehouses: What Do I Know?

Well, I’m no expert. I learned Power Pivot and Power BI through Excel. What I know is from helping to build a SQL data warehouse. That is, I’ve worked with a data warehouse pro. So, BI pros or others, please add your observations in the comments!

Data Warehouse: the Promise.

In 2009 on Twitter, Rob Collie noted that the data model in Power BI and Power Pivot: “very much is meant to work in conjunction with, and be fed by, well-designed data warehouses.” (Power Pivot’s Impact on BI Pros). For me, a good data warehouse enables business pros to create reports without needing to think too much about the data model. Another benefit I see is the ability to deliver operational reports from transactional databases (ERP, CRM, financial systems). By design, these systems are for logging events, not for creating reports.  And yes, even Salesforce is for logging tasks. Reports are secondary.

Data Warehouse: the Hype.

I’ve heard some idealistic claims from data consultants and Microsoft TSPs. For instance, data warehouses will soon replace Power Query and data modeling. Really? When I hear these claims, I tend to smile and nod. Why? Because, business needs and data sources evolve and the data warehouse lags behind these needs. Another benefit is having a single source of truth. For me, the real promise here is empowering business pros to get reliable data.

Data Warehouse: a Couple of Things.

  1. Star Schema. Data warehouses use a star schema. In a star schema, data falls into lookup tables and transaction tables. Or, for the data warehouse pro, dimensions and facts. Typically, transaction tables hold time and event-based data. Whereas lookup tables hold the descriptive details for contacts and accounts (attributes). For more info about star schema, see Data Model: Beast to Beauty. One lookup table can contain data that’s in multiple hops in the source database. For example, transaction type code will be one field and its description will be another field– instead of in a separate table.
  2. Historical Analysis. The purpose of a data warehouse is historical analysis, and this shapes design decisions. Since longer text fields won’t aggregate in a simple way, the first urge is to drop them.
  3. Platform Agnostic. The data warehouse doesn’t care how users connect. Users can use Power BI, Excel, Tableau, etc.
  4. Other Abstractions. Data warehouses have other abstractions that may confuse end users. For example, how they handle blank values in data (nulls) or how they handle transactions that don’t connect to lookup tables. And, the data warehouse may use internal IDs not found in the source. While I’m less concerned with these abstractions, they are good to be aware of. A data warehouse will involve a learning curve.

5 Practical Suggestions for aligning a data warehouse with Power BI.

(your mileage may vary)

To begin with, views are critical to Power BI. Views are versions of tables. Typically, these views are for specific purposes. While a good data warehouse person won’t adapt tables to Power BI, they will adapt the views. But you have to know what to ask for.

  • First, depending on how your data warehouse handles nulls, your dates may not be continuous. If your data warehouse replaces nulls with 12/31/9999, the date table will have gaps. The view for the Date lookup should filter out this invalid value. While you’re at it, be sure that all of your date fields have sort columns. A Month Year column formatted Apr 2020 will need a Year Month column like 202004 that will sort.
  • Second, naming conventions. Some platforms (Tableau, for example) can’t recognize spaces in table names and/or column names. Because of this, the data warehouse may use CamelCase for everything. While Gil Raviv has a query for fixing camel case column names, it’s best to fix this in a view. Similarly, prefacing every table name with the word Fact or Dim can make a data source alien to a business pro. Rename in the view. Always prefer clear human readable names for Power BI.
  • Third, long text fields in transaction tables. Long text fields can snag performance in Power BI. But you may need them for operational reports. One solution is to truncate these fields to 256 characters or less. Another approach is to add an index to the transaction table and split the text off into a special lookup table. However, with this second solution, see below.

About Relationships…

Data warehouse: lookup tables connected to a data table
Behold, the lovely star schema
  • Fourth, prevent one-to-one relationships. (Only you can prevent monogamy). With a great data warehouse, Power BI will set up the relationships between transactions and lookups automatically. What can block this automatic setup are tables where both sides have unique IDs. I prefer to add two invalid rows with blank data to the table that needs to be on the many side
  • Fifth, decide what to do about tables that may have multiple lookups to the same table. Your sales table may have a Sales Date and a Ship Date. Because Sales Date is the usual link between the tables, I prefer to rename Sales Date to Date Id (or whatever you Date key is called). This enables Power BI to create the main relationship without the business pro having to think about it. Other cases are when there is a credit account and a debit account. The user can always create a second inactive relationship to the other field.

Note on Relationship Auto-detection in Power BI.

Power BI can match up lookups to data tables. Automatically. Sadly, Excel Power Pivot can’t. Power BI recognizes lookups because they have a column with unique values. It links lookups to transactions on identical column names. Power BI will create one-to-many relationships. It does not attempt to create many-to-many relationships. Thank goodness! It will, however, create one-to-one relationships. Ruh oh. This is because it has no way to determine which side is NOT the lookup side. This problem is what the Fourth suggestion above remedies. What’s so tricky about one-to-one relationships is that it throws off auto-detection of all of the other relationships. And there are other results in the data model from these dreaded one to one relationships.

Microsoft’s platform is the world’s most fluid & powerful data toolset.  Get the most out of it.

No one knows the Power BI ecosystem better than the folks who started the whole thing – us.

Let us guide your organization through the change required to become a data-oriented culture while squeezing every last drop* of value out of your Microsoft platform investment.

* – we reserve the right to substitute Olympic-sized swimming pools of value in place of “drops” at our discretion.

Fred Kaffenberger

Fred Kaffenberger

A teacher at heart, Fred loves that moment when someone struggling to solve a problem makes a breakthrough. Fred spent six years in client services and sales for an online work order system. After this, he used Excel to help streamline commercial real estate operations for a major telecommunications carrier, where he discovered Power Pivot and Power BI. He was thrilled at how these tools helped him work smarter and more systematically. As an English major, Fred knows that a knack for working with data can surface in surprising places.

This Post Has 13 Comments
  1. Gracias Fred, muy interesante, no soy experto y me adentro en este nuevo mundo y me ayudo mucho saber que power bi si puede hacer las relaciones automaticas siempre y cuando estructuremos bien nuestra tablas, muchas gracias. saludos.

  2. I am in the camp, Power BI replaces the need for a data warehouse! Provided that you have good understanding of the underlying transactional data bases and access. That coupled with some SQL skills to write the views necessary to get the data in shape that is Power BI works best with. You are good to go!

    Full disclosure I have been doing reporting and working with my companies data for 25 years without a data warehouse. I enjoyed the article.

    1. The data warehouse is itself another tool that has its own processes and approaches, which are not sufficient in themselves for self-help bi. You need views, and then ideally a semantic layer. A layer which helps users connect with the meaning of the data. So, either Power BI datasets that everyone can connect to. Or, a model in SSAS tabular.

      1. Hi Fred,

        Thanks so much for writing this. I am a BI Developer for a company that is implementing a DW with Azure analysis Services as semantic layer. The hope is that they can provide self-service approach to Power BI for all users in the company.They may leverage perspective as part of this.

        Though there are a few folks who think that we can achieve self-service visualization by avoiding the semantic layer of Azure analysis services and instead creating SQL views in more of a flat file form which users can connect to according to their role.

        Do you have strong opinions on either? I strongly lean toward the first option of using azure and then having BI developers create the models for consumption. Similar to sharing data sets in the app.

  3. Thanks for sharing your insights, Fred. I like your concern for the business users when it comes to naming schemes.
    Self-sufficient user adoption is always a challenge – especially with Sales folks.
    What’s your preferred method for getting data from Salesforce to warehouse?

    1. Thank you, Dory. I believe that SQL Server Integration Services (SSIS) will work for this. Salesforce data, however, is pretty well structured as is, so I typically connect directly to Salesforce and then refresh the data on a schedule in Power BI Service. And when I was in sales, I was always highly interested in seeing how my performance was, so I would refresh my Salesforce dashboard, or before that check the paginated reports.

  4. Great read and awesome to find like minded individuals who still reference the importance of a data model and true data warehouse. Transactional reporting only works to an extent. It does not work when you have a ton of disparate data sources, different granularities and similar but inconsistent versions of the truth.

  5. Data Model: Beast to Beauty changed me forever. Good shout out there. I currently use 1:1s quite a bit in my DWs. That hasn’t been a problem for me, but perhaps it is because our Power BI reports are direct link to SSAS and there is therefore no model (and relationships auto-detected) generated in the report. One benefit of DW that is perhaps overlooked here is that a LOT of data can be pre-calculated. Previously we had PowerPivot and any gnarly DAX calcs just destroyed the time it took to refresh the model. Now I can calculate those at night, weekly, monthly, etc and not put the burden on the report runner. Of course, that brings up a downside of DWs, which is that they can have ‘stale’ data. Good article though. I like these design, best practices type articles.

  6. When I hear about DW I think about SAP BW then the challenge of connecting Power BI and other tools gets interesting. If this is your case, I have some tips for you. Recently a new connector to BW was released and hopefully that works fine but I could not yet successfully test it and I expect to still have limitations to my needs, anyway what my IT department did for me was to extract a bunch of SAP tables into a SQL db with all meanliness thechnical names then create SSIS packages to normalize the data then I can create views to filter and make additional tranformations before load data into a SSAS tabular model. Nowadays I realized that if I maintained the original structure of SAP data then I could leverage MDX query code generated by SAP to replicate report results and bypass the need to normalize the data. Also, to bypass SAP RFC limitations to extract large amounts of data we can leverage Power BI functions to extract data in chuncks.

Leave a Reply

Your email address will not be published. Required fields are marked *