Something About This Reminds Me of Fraser Crane Running With Scissors
(Click for YouTube)
Remember, I am NOT a statistician. I slept through Statistics in college – and I mean in my dorm room bed, not even in the lecture hall.
I tend to learn via doing, and via teaching, so today’s post is a “forcing function” by which I attempt to stretch my brain and skill-set.
There’s an excellent chance, therefore, that I will do something wrong here. In fact I’m hoping many of you more stats-minded people will chime in here and correct/extend this where appropriate.
What the Heck IS a Linear Regression, Anyway?
A linear trendline in an Excel chart is an example of linear regression
Commonly, people talk about linear regressions as a means of evaluating correlation and impact. For instance, does a better defense make for a better football team? In other words, they are often used in binary contexts – cases where people are looking for a yes/no answer.
But in the simplest case, they are NOT a binary yes/no technique, but instead something that returns a number (or a set of numbers). Not “yes/no,” but “how much,” in other words. It returns a line, like the green one pictured in the chart above. A line doesn’t say yes or no does it? (Well sorta. More on this later).
Furthermore, if you believe you have a good linear regression model, you can use it in a predictive capacity – given one variable, you can get a decent guess at an outcome. In the chart above, for instance, if we had an X value of 6, the linear regression “predicts” that Y would be 20.
A “Big and Beautiful” Example – Pharma Sales
Me (left) with my cousin, aka “Big and Beautiful,” on the right.
One of us has performed a linear regression before today, and it wasn’t me.
I have a cousin who is basically a living, breathing dose of humility. Whenever I start to think of myself as special, all I have to do is think of “Billy” (not his real first name, and his last name is NOT Collie, so don’t bother looking him up), and immediately I am restored to a more “grounded” state.
He’s younger and taller than me, obviously, but that’s just the beginning. Yes he’s much better looking (some of his friends call him BAB – Big and Beautiful). He’s super athletic, too, having played major college football. Super charismatic and friendly. Also incredibly HUMBLE despite all of that. He’s an antidote to inflated self-images everywhere.
So it’s perfectly natural that a pharmaceutical company would hire him as a sales rep, right? Bingo. He has “presence.” It would be hard NOT to notice him walking into a physician’s office. I don’t care what you are selling, people are hardly immune to such things.
He visits dozens of doctors’ offices each week, and the impact is supposed to be, “Talk to Billy. Bask in his glory. Then go prescribe more of Billy’s drugs.”
But here’s where the story takes a twist. Billy is not just a pretty face. I’ll never forget that little bastard, when he was NINE YEARS OLD, absolutely dismantling me in a game of chess… when I was in COLLEGE. Just demolished me. And then proceeded to explain, CORRECTLY, what was flawed in my overall style of play. I have a long memory, Billy, and that left an impression. I’m on to you.
Big and Beautiful Asks: “Do I Matter?”
So, like any former football player, he conducted a linear regression analysis of his own impact. Pretty cliché, huh? Sheesh. Yeah, he’s in the Hugh Millen Zone.
Doable in Normal Excel, Yes. But…
Yes, I know Excel has lots of tools for this linear regression stuff. Chart trendlines are just one example – functions like LINEST() and TREND() let you do similar things in pure formulas, too.
But all of that lacks Power Pivot’s elegant flexibility – you can’t quickly modify an “old school” Excel analysis like you can in Power Pivot. Slice it, filter it, sort it, rearrange it, whatever – in Power Pivot, that’s all effortless, whereas in Excel alone, those all require “re-builds” of your original analysis.
So let’s try something that is uniquely Power Pivot, shall we? (Say yes).
Data Set and “Normal” Power Pivot Analysis
I invented a fake set of data, simulating Billy’s conundrum:
Four tables in our data set
OK, now I add some simple SUM() measures, which show me some mixed results:
Simple Comparison, By Year and By Doctor, of Prescriptions and Visits
Pretty Inconclusive With Just the Eyeball Test
But hey, this is PowerPivot, so it’s easy for us to “condense” this down to “Change vs. Prior Year” and make things easier on our eyeballs. Let’s add those measures, starting with the Prior Year versions:
[Prior Year Visits] =
CALCULATE([Number of Visits],
[Prior Year Prescribed Qty] =
CALCULATE([Prescribed Qty], DATEADD(Calendar[Date],-1,Year))
and then the “% Change versus Prior Year” versions:
[% CHG Prescriptions vs PY] =
DIVIDE([Prescribed Qty]-[Prior Year Prescribed Qty],
[Prior Year Prescribed Qty]
[% CHG Visits vs PY] =
DIVIDE([Number of Visits]-[Prior Year Visits],
[Prior Year Visits])
This is a Little Easier on the Eye, But Still Pretty Noisy
(And we have 20 total doctors in the data set, but only showing 10 above)
Gosh, this is running long…
…and I was sick yesterday to boot, so I don’t want to rush this.
But let me give you a preview of the results. I’ve written a bunch of new measures:
The Slope of the Line is –0.4%, Indicating that Every 1% Increase in Visits
Results in a 0.4% LOSS of Prescriptions.
But, let’s look at our fake Doctors table, and notice that there is an “OrgType” column:
Interesting… I Wonder… Do Visits Impact Different Org Types Differently?
So let’s just slap that sucker on a Slicer, shall we?
WHOA! Doctors who work at HMO’s positively LOVE being visited! A 1% rise in visit frequency results in a nearly 13% rise in prescriptions! (Remember, this is FAKE data, and randomly-generated data at that).
Conversely, Independent Doctors HATE the sight of Billy. He makes them sick (yeah, I feel ya, independent doctors. He is tough to take, all that perfection and all. Just remember he has feelings too OK?)
This Just In: Power Pivot is AWESOME!!!
Check that OUT! Let’s say we had slogged through this analysis in “regular” Excel. We’d come away saying that visits are counter-productive. We would fire Billy.
Would we have had the energy to repeat this analysis broken out by HMO vs. Independent? Heck no, not in regular Excel. Too much work.
And something important – CRUCIAL, actually – would have been missed. Don’t fire Billy… have him focus on HMO’s!
But in Power Pivot, that insight is one click away. Even I am a bit awestruck at the moment. I love my job.
Imagine now if we had a column/table for Sales Reps! We could analyze their individual impacts. Maybe someone else is REALLY GOOD at influencing Independent doctors. And if we compared that person’s approach to Billy’s, we’d discover the magic behaviors that make the difference. We could change the entire business of subtly influencing doctors to peddle our drugs. (Hmm, when we say it THAT way, it sounds sinister. Which, um… no comment.)
Come back tomorrow (Thursday) for an explanation of how I got the slope stuff, and a chance to download the workbook.