Something About This Reminded Me of Fraser Crane Running With Scissors

Something About This Reminds Me of Fraser Crane Running With Scissors
(Click for YouTube)

Stand Back…

Remember, I am NOT a statistician.  I slept through Statistics in college – and I mean in my dorm room bed, not even in the lecture hall.

I tend to learn via doing, and via teaching, so today’s post is a “forcing function” by which I attempt to stretch my brain and skill-set.

There’s an excellent chance, therefore, that I will do something wrong here.  In fact I’m hoping many of you more stats-minded people will chime in here and correct/extend this where appropriate.

What the Heck IS a Linear Regression, Anyway?

Linear Trendline in Excel

A linear trendline in an Excel chart is an example of linear regression

Commonly, people talk about linear regressions as a means of evaluating correlation and impact.  For instance, does a better defense make for a better football team?  In other words, they are often used in binary contexts – cases where people are looking for a yes/no answer.

But in the simplest case, they are NOT a binary yes/no technique, but instead something that returns a number (or a set of numbers).  Not “yes/no,” but “how much,” in other words.  It returns a line, like the green one pictured in the chart above.  A line doesn’t say yes or no does it?  (Well sorta.  More on this later).

Furthermore, if you believe you have a good linear regression model, you can use it in a predictive capacity – given one variable, you can get a decent guess at an outcome.  In the chart above, for instance, if we had an X value of 6, the linear regression “predicts” that Y would be 20.

A “Big and Beautiful” Example – Pharma Sales

Big and Beautiful:  A Man Who Should Not Exist.

Me (left) with my cousin, aka “Big and Beautiful,” on the right.
One of us has performed a linear regression before today, and it wasn’t me.

I have a cousin who is basically a living, breathing dose of humility.  Whenever I start to think of myself as special, all I have to do is think of “Billy” (not his real first name, and his last name is NOT Collie, so don’t bother looking him up), and immediately I am restored to a more “grounded” state.

He’s younger and taller than me, obviously, but that’s just the beginning.  Yes he’s much better looking (some of his friends call him BAB – Big and Beautiful).  He’s super athletic, too, having played major college football.  Super charismatic and friendly.  Also incredibly HUMBLE despite all of that.  He’s an antidote to inflated self-images everywhere.

So it’s perfectly natural that a pharmaceutical company would hire him as a sales rep, right?  Bingo.  He has “presence.”  It would be hard NOT to notice him walking into a physician’s office.  I don’t care what you are selling, people are hardly immune to such things.

He visits dozens of doctors’ offices each week, and the impact is supposed to be, “Talk to Billy.  Bask in his glory.  Then go prescribe more of Billy’s drugs.”

But here’s where the story takes a twist.  Billy is not just a pretty face.  I’ll never forget that little bastard, when he was NINE YEARS OLD, absolutely dismantling me in a game of chess… when I was in COLLEGE.  Just demolished me.  And then proceeded to explain, CORRECTLY, what was flawed in my overall style of play.  I have a long memory, Billy, and that left an impression.  I’m on to you.

Big and Beautiful Asks:  “Do I Matter?

imageNot content to simply be the “presence” he was hired to be, after a year or so, Billy started wondering – “do my visits to all these doctors’ offices really have an impact?”

So, like any former football player, he conducted a linear regression analysis of his own impact.  Pretty cliché, huh?  Sheesh.  Yeah, he’s in the Hugh Millen Zone.

Doable in Normal Excel, Yes.  But…

Yes, I know Excel has lots of tools for this linear regression stuff.  Chart trendlines are just one example – functions like LINEST() and TREND() let you do similar things in pure formulas, too.

But all of that lacks Power Pivot’s elegant flexibility – you can’t quickly modify an “old school” Excel analysis like you can in Power Pivot.  Slice it, filter it, sort it, rearrange it, whatever – in Power Pivot, that’s all effortless, whereas in Excel alone, those all require “re-builds” of your original analysis.

So let’s try something that is uniquely Power Pivot, shall we?  (Say yes).

Data Set and “Normal” Power Pivot Analysis

I invented a fake set of data, simulating Billy’s conundrum:

Our Data Set for Linear Regression in Power Pivot and DAX

Four tables in our data set

OK, now I add some simple SUM() measures, which show me some mixed results:

No Linear Regression Yet, Just Simple SUM Measures

Simple Comparison, By Year and By Doctor, of Prescriptions and Visits
Pretty Inconclusive With Just the Eyeball Test

But hey, this is PowerPivot, so it’s easy for us to “condense” this down to “Change vs. Prior Year” and make things easier on our eyeballs.  Let’s add those measures, starting with the Prior Year versions:

[Prior Year Visits] =

   CALCULATE([Number of Visits],

[Prior Year Prescribed Qty] =

   CALCULATE([Prescribed Qty], DATEADD(Calendar[Date],-1,Year))

and then the “% Change versus Prior Year” versions:

[% CHG Prescriptions vs PY] =

   DIVIDE([Prescribed Qty]-[Prior Year Prescribed Qty],
          [Prior Year Prescribed Qty]         )

[% CHG Visits vs PY] =

   DIVIDE([Number of Visits]-[Prior Year Visits],
          [Prior Year Visits])


Percent Change Measures

This is a Little Easier on the Eye, But Still Pretty Noisy
(And we have 20 total doctors in the data set, but only showing 10 above)

Gosh, this is running long…

…and I was sick yesterday to boot, so I don’t want to rush this.

But let me give you a preview of the results.  I’ve written a bunch of new measures:

Linear Regression Slope - Produced in Power Pivot

The Slope of the Line is –0.4%, Indicating that Every 1% Increase in Visits
Results in a 0.4% LOSS of Prescriptions.

But, let’s look at our fake Doctors table, and notice that there is an “OrgType” column:


Interesting…  I Wonder…  Do Visits Impact Different Org Types Differently?

So let’s just slap that sucker on a Slicer, shall we?

Linear Regression Slope - Produced in Power Pivot and Sliceable Instantly

WHOA!  Doctors who work at HMO’s positively LOVE being visited!  A 1% rise in visit frequency results in a nearly 13% rise in prescriptions!  (Remember, this is FAKE data, and randomly-generated data at that).

Linear Regression Slope - Produced in Power Pivot and Sliceable Instantly

Conversely, Independent Doctors HATE the sight of Billy.  He makes them sick (yeah, I feel ya, independent doctors.  He is tough to take, all that perfection and all.  Just remember he has feelings too OK?)

This Just In:  Power Pivot is AWESOME!!!

Check that OUT!  Let’s say we had slogged through this analysis in “regular” Excel.  We’d come away saying that visits are counter-productive.  We would fire Billy.

Would we have had the energy to repeat this analysis broken out by HMO vs. Independent?  Heck no, not in regular Excel.  Too much work.

And something important – CRUCIAL, actually – would have been missed.  Don’t fire Billy…  have him focus on HMO’s!

But in Power Pivot, that insight is one click away.  Even I am a bit awestruck at the moment.  I love my job. 

Imagine now if we had a column/table for Sales Reps!  We could analyze their individual impacts.  Maybe someone else is REALLY GOOD at influencing Independent doctors.  And if we compared that person’s approach to Billy’s, we’d discover the magic behaviors that make the difference.  We could change the entire business of subtly influencing doctors to peddle our drugs.  (Hmm, when we say it THAT way, it sounds sinister.  Which, um…  no comment.)

Come back tomorrow (Thursday) for an explanation of how I got the slope stuff, and a chance to download the workbook.

Click here for part 2 >>

  Subscribe to PowerPivotPro!


Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology. 

This Post Has 3 Comments

  1. Let me be first to comment on my own post, and remark that I am starting to wonder about the “units” of the Slope measure. It might be that the Slope represents our expected Prescriptions percentage change that we’d get from increasing our Visits by 100% (aka, 1) rather than by 1%. Will have to noodle on this a bit 🙂

  2. Extensible: When I learned about this concept in java programming I was blown away. Power Pivot gives that same shock and awe. When a customer says to my first pass, “Yeah, but, I’d like to see it by this product line instead of channel, and I would like to be able to filter on accounting classification, oh and can we do this by fiscal year?”, I LOVE their reaction when I drag a couple of boxes, add a slicer, and ask, “Like that?” Heck, it’s a great discovery tool for myself, too.

Leave a Comment or Question