Friends at a company play pick-up basketball during their lunch hour. Since there are no established teams, players can be randomly matched up. But these folks happen to be engineers/data-nerds, so they keep detailed track of games, teams, players and win/loss. The diagram view of the data is shown further below.
Question: How can we determine which player pairing is the most successful?
Since players are randomly teamed up, are there combinations which when teamed up have an unusually high winning percentage?
Application: This would naturally extend to other sports, but I believe may also apply in many non-sports scenarios, where items are paired up somewhat randomly (or by design) and we want to know how effective those pairings are.
Thanks to Kirill Perian (basketball nickname K-Real), an attendee of one of our past webinars, who sent us the dataset and posed this question. Dataset has been simplified to showcase this scenario and anonymized to protect the identity of the losers 🙂
File can be downloaded here.
First we will address the easier scenario of creating metrics for individual players, using the many-to-many pattern. Next we will take on writing measures to compare performances of pairs of players.
Resolving Many to Many to create Individual Player Metrics
Let’s start with a really simple metric, count of games played.
This works until we attempt to use a field on the other side of the many to many relationship.
Between Player and Game, there is a many to many relationship. One Player can play in many Games and one Game would have many Players. The way to map this in Power Pivot is using a bridge table as we have done here using GameTeamPlayer. See model diagram above.
We are not even going to break a sweat solving this one. Been cracked a long time ago, thanks to thinkers like Jeffrey Wang and Gerhard Brueckl. Here is our new measure, see image below for explanation. Read the linked articles, if you really want to dive-in. For most folks I would recommend understanding this as a pattern using image below and simply applying it. (Never heard the “downhill” “uphill” terminology in relationships, for $10 buy the digital version of our PowerPivot DAX book. Would be the best $10 you ever spent).
You can see, in the Pivot image below, that the new *magic* measure is working and we have also added some basic additional measures to show games won, lost and win percentage.
The perfect Pair of Players
Metrics for individual players were relatively straightforward. But how do we write measures for pair of players? After some head scratching it became clear that I needed another Player table. I left the original Player table alone and added a PlayerA and PlayerB.
I created relationship between GameTeamPlayer and PlayerA/PlayerB, but as you can see by the dotted lines that both relationships are marked as inactive. You can do so from the Manage Relationship window. Why inactive? That will be apparent soon. For now just go with it.
To get us warmed up to the idea of inactive relationship, let us write the GamesPlayed version for the PlayerA and PlayerB tables as below. (Shown for PlayerA; PlayerB works in similar fashion).
PlayerAGamesPlayed :=CALCULATE (
USERELATIONSHIP ( GameTeamPlayer[PlayerID], PlayerA[PlayerID] )
You can see when we put PlayerA on the pivot, while the original GamesPlayed measure does not work (just repeats the same number), PlayerAGamesPlayed shows the correct number. USERELATIONSHIP in effect marks that relationship as Active, during the computation of this measure.
The Big Kahuna
For the final measure, we are going to repeat the *magic* many-to-many pattern above with some changes
Step A: We still need to refer to the “downhill” table: GameTeamPlayer.
But we need to refer to it in two versions
|GameTeamPlayer Version 1: Filtered using relationship with PlayerA|
|GameTeamPlayer Version 2: Filtered using relationship with PlayerB|
Step B: Next we need to combine these two version in an “intersect”, so we only count the games that they played together
Here is the formula (lines numbered to aid explanation):-
- PlayerABGamesTogether:=CALCULATE( [GamesCount]
- , CALCULATETABLE(GameTeamPlayer, USERELATIONSHIP(GameTeamPlayer[PlayerID], PlayerA[PlayerID]))
- , CALCULATETABLE(GameTeamPlayer, USERELATIONSHIP(GameTeamPlayer[PlayerID], PlayerB[PlayerID]))
Lines 2 & 3 accomplish Step A, giving us two filtered versions of the same table (remember Schrodinger’s Cats/Pivots?). Line 1 has the CALCULATE which accepts the two tables as parameters and combines them to get us the intersection. Note that we use our original measure GamesCount, which to remind you, is defined as below:-
So technically the “intersect” happens at the GameTeam table, where the arithmetic is being performed. That also explains why we left the relationships as inactive. Else, if you think about it, there is really no row on GameTeamPlayer where say Player=”Bad News” and Player is also “Band-Aid”. Don’t make that hurt your head too much. If you feel like you understand the overall pattern and can apply it, that’s all you need. I’ll admit my own understanding ebbs and flows.
And here is the resulting pivot:-
Let us quickly validate the result. We will do that for players “Melo” and “Mini Lebron” who have relatively few games thus making it easy for us to check the data.
In the image below, we can see the measure PlayerABGamesTogether, and using a detailed pivot can validate that indeed, there were three Game Teams where these two played together.
K-Real and Butch – burning up the court
So who is the dominant pair? Looking at data just for K-Real, the pairing of K-Real and Butch stands out. Of all the 164 games K-Real played, they were paired for 51 and have a winning percentage of 73%. Which is markedly higher than K-Real’s other pairings!
Butch happens to be O. Tre. Kirill, let me know if that sounds right? 🙂