In Part I, we reviewed the new Data Model (which is really just PowerPivot baked into native Excel), discussed the data import process, and other related issues. In this part, we further the discussion of Data Model, and review changes to PowerPivot for Excel 2013.
Creating Table Relationships with the Data Model
After we create a Data Model in a workbook, we can create, view, or modify relationships between tables. When we import multiple tables from the same relational data source, Excel detects primary and foreign key relationships and by default, creates the relationships automatically. We can override the default behavior during the import process. In the Import Data dialog box for Access shown in figure 1, we can clear the check box titled Import relationships between tables. For some other data sources (SQL Server for example), the check box can be found in the Select Table dialog box (the dialog box in which you select multiple tables).
By now, most readers are aware of the new Data Model object in Excel 2013. But what is the Data Model, and how is it created? What can we do with the Model after it is created? Where does PowerPivot fit in? We will address these, and other questions in this two-part post.
The Data Model is a new Excel object that you use to create PivotTables, PivotCharts, and Power View reports. The Data Model uses a built-in version of xVelocity, which is an in-memory, column-based version of engine that powers Analysis Services. xVelocity stores data that you import in-memory (compressed), and calculates implicit measures (created by dragging a a table field to the values area of a PivotTable) and explicit measures (created using DAX functions). We can create the Data Model using one or more tables. Generally, we create relationships between tables in the Data Model, but there is no absolute requirement to do so. Another key characteristic of the Data Model is it’s ability to consume large amounts of data.
Excel 2013 is, in my view, the first version of Excel that that has fully transitioned beyond its spreadsheet roots (while still maintaining these roots) to become a BI tool that can compete with the likes of Tableau, QlikView, SpotFire, etc. Two key additions to the product has facilitated this transition: the Data Model, and the Power View add-in. Power View is a visual analytic tool – an very important component in any serious BI application.
Excel 2013 can act as a central hub for tabular BI development. For example, we can develop simple or complex analytical models that we can store centrally on an Analysis Services server (and extended with partitions and roles), or that we can store centrally on a SharePoint server. Of course, we could already do these things in Excel 2010 and PowerPivot. However, for basic models we create in Excel 2013, we don’t need PowerPivot at all. Also, Excel 2013 allows us build Power View reports that can be stored centrally in SharePoint. Figure 1 shows how the BI components are related in Excel 2010, and figure 2 shows how they are related in Excel 2013 (with changes highlighted in red).
A cycle chart is a chart designed to show business sales cycles in a manner that cannot be expressed using other standard charts. For instance, a cycle charts can show how monthly sales vary over several years, or how daily sales vary over several weeks. For a good introduction to cycle charts, see this excellent article by Naomi Robbins.
Figure 1 shows cycle PivotCharts using 3M+ rows of sales data from the Contoso database.
Figure 1 – Monthly sales over years and daily sales over weeks cycle plots (click figure to see an expanded image)
After completing the Part 3 extension of Rob’s Dynamic TopN Reports via Slicers, Part 2 post, I did not plan on a forth installment. However, I did plan to write about creating dot plot PivotCharts sometime in the future. Later, it occurred to me that the TopN reports model provided the perfect foundation upon which to create dot plot charts.
The dot plot is not a standard chart type in Excel, so most users haven’t heard of, let alone used one. The dot plot is an alternative to a horizontal bar chart, and there are many situations where it is a better fit for analysis. For a good introduction to dot plots, see this excellent article by data visualization expert, Naomi Robbins.
The following summarizes some of the potential advantages of the dot plot over a bar chart:
When there are a lot of category items on a chart, a bar chart can look cluttered. Because a dot plot uses less “ink” to represent the same data, the resulting chart tends to be less cluttered.
The dot plot is often better than a stacked bar chart. The values in a stacked bar chart can be hard to compare because only the bottom bars have a common baseline.
Depending on how the dot plot chart is organized, you can gain better insights than using a clustered bar chart (see example in Naomi’s article).
Since the absolute length of a bar chart encodes its value, the value axis must start at zero. If the values in the chart are all a distance from zero, you can’t make good use of interval values on the value axis. On the other hand, the dot plot values are judged by position along the axis – length is not involved. Therefore, you can have more optimal intervals on the value axis.
FIgure 1 shows dot plot PivotCharts based on the data used in Part 3.
Figure 1 – Dot plot PivotCharts (Click figure for an expanded view)
Guest Post by Colin Banfield [LinkedIn] In the next few posts, I plan on demonstrating techniques for creating various types of custom PivotCharts that use PowerPivot data and DAX measures. Each custom chart will have one or more of the…
After Rob posted Dynamic TopN Reports Using PowerPivot2!, I downloaded the workbook from the provided link to examine how his “tricks” were done. Shortly thereafter, I sent a message to Rob complementing the techniques he used, and mentioning the potential for using the techniques in other scenarios – especially those that include dynamic charts. I didn’t realize that I was setting myself up, because Rob asked if I’d be willing to write a post detailing one of these scenarios. For the sake of continuity, it makes sense to treat the scenario that I will be discussing as an extension of Dynamic TopN Reports via Slicers, Part 2. You should have thoroughly read and understood that post before you continue here.
The additions that I have made to Rob’s TopN reports are as follows:
Created BottomN measures for Customers
Created TopN/BottomN measures for Products
Added a Year-Month slicer
Created TopN/BottomN charts for customers and products that react to slicer selections for TopN, By (selected measure), and Year-Month range.
The final result of these efforts is shown in Figure 1.
Figure 1 – TopNBottomN charts (Click figure for a wider view)
Way back during the period of the first CTP of Gemini (which later became PowerPivot), I was working with a data set that included a column of month numbers. I wanted to create a calculated column with month names, and the only solution that I could think of at the time was using the IF function in DAX:
This a horrendous formula, with eleven nested IF functions and a long tail of closing parentheses. I promptly made a suggestion in the Connect forum for the addition of a “Case” function in DAX, siting the preceding formula as justification for its inclusion. Sometime later, I received a message from Howie Dickerman indicating that some form of Case function was being considered for PowerPivot V2 (the current version). This “Case” function turned out to be the SWITCH() function. “Switch” is a function familiar to Access and VBA users, so it made sense to stick with that name for the DAX implementation.
During the process of building the measures, I did do some testing with duplicates, but the testing was minimal and involved only duplicates at the 25th or 50th percentile mark.
To check out the reported issue, I built a new data set with many duplicate values. Figure 1 shows the results of the percentile calculations in this scenario:
Huh? What’s going on here? The 50th and 75th percentile calculations are correct, but the 25th percentile calculation is totally and utterly incorrect! I created several intermediate measures along the way to the final result, so it was trivial to track down and correct the problem.
For some time, I have been looking around for a fairly complete date table in Excel for use with PowerPivot. If you are working with data derived from a data warehouse, a date table is perhaps the most common dimension table that exists in the warehouse. However, not every scenario involves working with a data warehouse directly, and I simply wanted a “portable” date table. I found very little online, the best perhaps being this Excel table offered by the Kimball group (the table has been expanded since I originally downloaded it). I could have modified the Kimball table for my particular needs, but I decided to create one from scratch. Late last year, Rob posted an article titled the Ultimate Date Table, which is available from the Azure Marketplace. I considered using this table instead of the one I was building in Excel, but the “Ultimate Table” lacks fiscal periods. Much of the analysis work I do includes fiscal periods.
(Rob's note: Apologies to Colin, he put this up here in draft form weeks ago and even though I promised to flip it to live two weeks back, I forgot. So Colin... a thousand pardons. This is awesome!) From Colin:…