skip to Main Content

By Avi Singh [Twitter]

This post is based on a query that I got in our monthly Q&A session held for our Online Class attendees.

Input = Multiple quotes for different Products from different Vendors
Desired Output = For each Product show the top three quotes, both price and the Vendor name


Go from a list of Price Quotes to showing the lowest Vendor Quotes for each Product

A bit more on the Q&A session before we dive in. All our Online Class attendees are invited to a monthly Q&A session, in order to support them in their Power BI journey. Often what you learn in class, you would only apply sometime later. With the Q&A session, if you run into issues or have any questions, you have the opportunity to bring it up and discuss with your instructor. You can sign up for our upcoming Online Live Class on August 3-4.

Step 1: Structuring the Tables

We would clean things up and import the data into separate data and lookup tables. This may seem superfluous for the sample data set, but a real data set could have a lot more rows in the data table and a lot more columns (attributes) for the lookup table. Hence separating the data table and lookup tables is always a good approach.


Our Vendor Quote data loaded as separate Data and Lookup Tables

Step 2: Brainstorm Approach to Writing Measure

When I first heard the problem I cycled through the following options:-

a) Simple Excel Pivot: I dismissed it almost as soon as it crossed my mind. If we simply drag Product and Vendor on rows and columns in a Pivot table, we get a picture but it’s not close to our desired output. I am sure there are ways to do this in Excel, but we need the magic of DAX Measures = “Define Once and Use Everywhere”. Where, we define them once and can use them in any pivot of any shape.


A simple pivot is unlikely to get us the desired output

b) Minimum Quote: My first thought was to determine the minimum quote

MinQuote:=MIN(VendorItemQuote[Price Quote])

And then write another measure find the next minimum quote and so on. But this felt like too much work. What if we had to show the top 10 Vendor quotes and not just top 3?

c) TOPN: Whenever I hear “top 3”, “top 10” etc. I gravitate towards the TOPN function. But this didn’t seem right either. TOPN would give me a set of top-ranked items from which I would still be left to determine the 1st, 2nd and 3rd.

d) RANKX: Next the RANKX crossed my mind and it felt like a jigsaw puzzle piece slid into place. So on we go with RANKX.

Step 3: Write RANKX Measure

Here is the anatomy off the RANKX function:-

RANKX(<table>, <expression>[, <value>[, <order>[, <ties>]]][, <expression>[, <value>[, <order>[, <ties>]]]]…)

Ascending/Descending Order: Here <order> can be
0 Descending Order (default)
1 Ascending Order

Typically, I skip the <order> parameter since the default (descending order) is what I’m looking for. So when determining my top customer, top product, top sales representative etc. the #1 rank would be with the highest value (e.g. the highest Sales).

But in this case, the lowest price quote is the best hence should be ranked #1. Therefore I would choose an ascending order (<order> = 1) for RANKX.

Attempt #1: Here is my first attempt, and there is a problem right away.

Rankx_Try1:=RANKX(ALL(Vendor), [MinQuote],,1)


Even Vendors with no submitted quotes are showing up in RANKX result

All the vendors are getting ranked, even the ones which never submitted a quote for that product. Fine we will suppress the vendors without quotes in attempt #2.

Attempt #2: Suppress RANKX where no quotes.

Rankx_Try2:=IF(ISBLANK([MinQuote])
, BLANK()
, RANKX(ALL(Vendor), [MinQuote],,1)
)


Oops! Ranking starts from #21 instead of #1 ??

This isn’t quite right either, since my ranking is not starting from #1, 2, 3 and so on. The Vendors which we just suppressed are still being factored in when RANKX is running; the suppression is after the fact.

I need to filter these out in the table RANKX is iterating over so that these are not considered to begin with. Remember RANKX is an iterator similar to other X functions (SUMX, AVERAGEX…). And we do so with this formula

Attempt #3: Only run RANKX on valid Vendor rows

VendorRankByQuote:=IF(ISBLANK([MinQuote])
, BLANK()
, RANKX(
FILTER(ALL(Vendor), NOT(ISBLANK([MinQuote])))
        , [MinQuote] ,
,1
)
)

The key difference is highlighted in bold. Now, instead of ALL(vendor) I am using FILTER to select only the Vendor rows with a valid quote. Here are the results:



RANKX is finally looking good!

Step 4: Writing the Final Measures for the Desired Output

Future Proofing Your Model: I often try to go beyond what is asked of me when building Power Pivot models. I try to think of not just what the customer has asked, but also what they might ask in the future. And I attempt to design my data model to be elegant and flexible enough to be able to answer new questions thrown its way. Over time I expect to tweak things a bit, write few new measures, but the core structure of the data model should remain stable. The only big changes being when we add new data sets – e.g. we start with the Sales data in your data model and then you bring in let’s say your Budget or Service Calls.

Back to our problem…In this case even though the user asked only for the top 3 quotes I chose to implement this using a disconnected table. This would allow me to easily extend this to top 5 or even top 10 quotes.


Adding a disconnected table to easily show Top N Vendor Quotes in a Pivot

We write a harvester measure to get the Selected Rank from our disconnected table, and then a new measure to get the corresponding Vendor Quote (based on SelectedRank).

SelectedRank:=IF(HASONEVALUE(RankTable[RankNum])
, VALUES(RankTable[RankNum])
, BLANK()
)

RankVendorQuote:=CALCULATE([MinQuote] , FILTER(
Vendor
, [VendorRankByQuote] = [SelectedRank] )
)

[SelectedRank] essentially returns 1, 2 or 3. Then we use FILTER to find the Vendor with the selected rank. Here are the results:-


The Top 1, 2, 3 Quotes shown in our Pivot

To be able to display the actual Vendor name the first write a helper measure and then our final measure on the same pattern as RankVendorQuote above.

SelectedVendorName:=IF(HASONEVALUE(Vendor[VendorKey])
, VALUES(Vendor[Vendor Company])
, BLANK()
)

RankVendorName:=CALCULATE([SelectedVendorName] , FILTER(
Vendor
, [VendorRankByQuote] = [SelectedRank] )
)


The Top 1, 2, 3 Vendors with the lowest quotes

Here is the pivot with both measures, getting us to the desired output:-


Our desired output with both measures on the pivot

Epilogue: Interestingly the person asking the question was asking me whether they should transform the data table itself using Power Query to get the desired output. This is becoming a very frequent question in our training classes,  Q&A sessions and even our consulting work – which tool should I use? Should I use Power Pivot for Power Query for my task? I’ll save that one for my next blog post.

Power On!
-Avi Singh

Avi Singh

Avi Singh has personally experienced the transformation and empowerment that Power BI can bring - going from an Excel user to building large scale Power BI solutions. His mission now is to share the knowledge about Power Pivot and Power BI.

This Post Has 9 Comments
  1. Great post, can’t wait to give it a go. The disconnected table has saved me more times than once when doing a comparative analysis on time elements. I’ve set up MAX() equations on the time elements and calculate comparing the value table to the disconnected table.

  2. Avi,

    Thoroughly enjoyed following your deep dive into both advanced DAX and advanced design principles.

    A subtlety I think worth mentioning is the post’s reference to “the suppression is after the fact” could be reversed to say the “inner ranking is done before the outer IF function”. I think this “reversed” perspective is valuable to come to terms with how the formula engine executes at least some (if not many) nested calculations: the formula engine starts from the inside and works outwardly.

    Often reading complex, “multi-level” formulas from the inside-to-outside has helped me understand them where reading a formula like a sentence or paragraph in English (from left to right, from top to bottom) has not.

  3. Hello Avi,
    I was just working on a similar issue yesterday, but my requirement is for data validation. My people would like to see any outlier values in the data set so I am looking to do the reverse. I want to display the highest value, the product, vendor, and date entered. Since my data set has about 1.5M records, I don’t believe the disconnected RankTable is the way to go, unless I am missing something, which would not be a first time.
    What would be the best way of accomplishing this?

  4. Avi,

    Be careful when using the RANKX function to pick the first, second, third element etc. The reason why has to do with the possibility of ties in the dataset.

    You show the syntax of the RANKX function as this:
    RANKX(, [, [, [, ]]][, [, [, [, ]]]]…)

    The correct syntax is this:
    RANKX(, [, [, [, ]]])

    This means that you can’t rank the dataset using more than one criterion. So in a case of a tie you would end up with either the same rank for some vendors and/or some ranks will be missing altogether. (Depending on which value you give the ‘ties’ parameter). The mistake in the syntax formulation goes back to an error in the documentation from Microsoft. I have even seen this error copied into books from otherwise well renowned authors. 😉

    I fell in this trap myself when I was hired to setup a power pivot model with the aim to analyze voters shifts from leftwing parties to rightwing parties and reverse on a regional basis by the then recent Danish elections. I had a pretty good idea as to how I would solve this (more or less the same way you do in your blog post). So I checked the documentation and saw that the RANKX could rank with more than one criterion and informed my customer that this would be an easy one to pull (not in those words though 😉 ). Not so. RANKX could only rank with one criterion and the polling stations on the small Danish islands had a terrible tendency to tie – two parties in between.

    It took me some hours and a lot of coffee to figure out how to do it right with 2 TOPN’s and a tiebreak rule. Since the budget wasn’t that big in the first place I wasn’t a particularly happy DAXer.

    The mistake in the documentation has been corrected now (at least in most places).

    I haven’t checked your example fully, however since the expression …

    FILTER(
    Vendor
    , [VendorRankByQuote] = [SelectedRank]
    )

    … could return 2 or more or none vendors, the measure ‘SelectedVendorName’ could return blank in the case of a tie between 2 or more quotes for the same product.

    Btw: the new DAX function TOPNSKIP() in the next version of DAX (DAX 2016?) will make this problem easy to solve.

Leave a Comment or Question