Guest Post by Elin Ramsey
PowerPivotPro Intro: I love real world examples. They really show how the tool is being used and the challenges you may encounter in a real world situation (like not having access to all the data you need, read how Elin tackles that).
Note how Elin slowly builds this example and keeps layering business logic in the DAX formulas. This comes naturally to the Business User who is intimately familiar with the workings and what is actually needed from the data. Throw in a third party and suddenly the process becomes much more burdensome as two parties need to communicate back and forth. In fact majority of the cost of any project (including BI projects) is Communication, read more here.
I will go out on a limb and say “Business Intelligence belongs to the Business User”. The BI team is necessary in building the plumbing, but Business User should firmly be in charge of building the faucets. And build faucets before the plumbing. That is one of the stops in Rob’s amazing story (read or watch) that he recounted at PASS BA Conference. Here’s to building faucets! Elin, take it away…
My name is Elin Ramsey and I work as the Analytics Manager in a call center. Click here to see more about me. I’ll be reviewing a real world application of SUMX on an IF statement, as recently described here.
One thing I need to be able to estimate is the number of active customer support representatives we have on any given day. I can’t use HR or timecard data as I don’t have access to it. Another challenge is that some representatives will sometimes be acting as trainers most of a day and we don’t want to count them then.
I’m going to walk through what my definition of active is, how we set up the data, calculating whether a representative is active on any given day, using SUMX to calculate the number of active reps on any given day, and then a couple of examples of what we can do after we have that formula.
I’ve attached a sample data file with some not very realistic toy data. It ranges for a month and a half and has 10 representatives.
The first step is defining a definition for “active.” For this example I’m going to define the following conditions:
- They must close more than 2 cases
- They must hit either over 3 hours on the phone or over 5 emails sent
Two Dimension Tables:
- Owners: Owner ID and Name
- Dates: a small standard dates table
Three Fact Tables:
- Phone: has Owner ID, Date, and the Talk Time
- Case: Owner ID, Date, Case ID, and one property – Origin
- Email: Sent by ID, Date Sent, Case ID
Email is an odd table because Sent by ID is usually, but not always, the Owner ID on the parent Case. Most of the time for other analysis, I use a relationship to Case for counting emails. For this situation, though, I have created inactive relationships directly to the Owners and Dates tables. See here for more help on inactive relationships and the USERELATIONSHIP() formula.
Building the measure
First I have the base formulas to find the base productivity measures.
Time spent on the phone:
[Total Talk Time (hr)] = SUM(Phone[Phone Talk Time (hr)])
Count of Emails
[Email Count] = COUNTROWS(‘Email’)
Emails sent using the inactive relationships
[Email Count by Sent By]
USERELATIONSHIP(Email[Sent By],Owners[Owner ID]),
[Cases Closed] = COUNTROWS(‘Case’)
Next is the IF statement that will be the base of the SUMX. This returns a 1 if the criteria defined at the top are met and 0 if they are not.
= IF (
,OR( [Total Talk Time (hr)] > 3, [Email Count] > 5)
Is Active calculates correctly for Monique per business rules
however Grand Total needs some SUMX love
We can see that Monique isn’t considered active on 1/2/2014. Because we know the criteria, we know it is because she isn’t above either 3 hours on phone or 2 emails sent.
If we look at the total, we see that Is Active is returning 1. This is calculating correctly as 54 cases is bigger than 2 and Total Talk Time and Email Count by Sent By are both over their criteria. However, we want the number of representatives who are active on this 1/2/2014 – which is 6. This is where SUMX comes in.
[Active Reps] = SUMX(VALUES(Owners[Owner ID]),[Is Active])
VALUES(Owners[Owner ID]) says to return a list of all the Owners. SUMX then takes that list, evaluates for every value in that list, and sums it all up.
We can see that the correct value, 6 , is returned on 1/2/2014.
Active Reps uses SUMX to calculate the correct total
We also see that there are 0 active reps on 1/4/2014 and 1/5/2014. This is because my toy data is built to have very little activity on the weekends, so no reps hit the activity marker. There is also a lot of fluctuation that is occurring in the number of Active reps on the weekdays. This is purely due to the setup of my toy data, which is not realistic.
Active Reps is a great measure to build off of.
Want to know the average number of active reps? AVERAGEX the SUMX
[Average Active Reps] = AVERAGEX(VALUES(Dates[Date]),[Active Reps])
Want to know the total number of active days? SUMX the SUMX
Right now the formula doesn’t give good grand totals. It is calculating the total number of reps who, if you added up all their activity over the time period, would be considered active. In this case that’s everyone, though this might be useful in cases when the workforce is expanding or decreasing.
To get around this, add another layer of SUMX around.
[Active Rep Days] = SUMX(VALUES(Dates[Date]),[Active Reps])
I also use this to calculate the number of cases per rep. A first pass at this question would yield
[Cases per Rep] = AVERAGEX(VALUES(Owners[Owner ID]),[Cases Closed])
This runs into problems, though, where we count people who closed one case and they bring down the overall average. Our call center also counts people like trainers and leads as overhead who improve efficiency. We want to be able to say with the help of everyone, “how many cases get closed with who we have on the floor?”
The solution is to use the [Active Rep Days] formula above.
[Cases per Active Rep] = DIVIDE([Cases Closed],[Active Rep Days])
This has the extra benefit of having already done the work of making the Grand Total make sense.
SUMX on an IF statement is a very versatile structure that likely has a wide array of applications where you need to count the number of things that are in a given state.
I’ve included the sample file along with this post: sample data file.