Forecasts Anticipate Trends within Large Populations, On Timelines Typically Measured in Months or Years. Predictive Analytics Anticipate the Behavior of a Single Individual, Often in a "Right Now" Timeframe.


We’ve Seen Some Confusion On This – and It’s Understandable

It may sound like silly semantics at first, making a distinction between two synonyms like “forecast” and “predict.”  And silly semantic differentiations ARE one of my personal pet peeves – Chris Rock’s famous distinction between “rich” and “wealthy” is one example that did NOT bother me, because he used it as a device to CREATE an important distinction.  But we’ve all been lectured at one time or another by someone taking two previously-interchangeable words and pretending that there has ALWAYS been a sharp distinction between them… in an effort to make themselves look much smarter than they are (or more importantly, smarter than their audience).

So I’m NOT here to rewrite the English language – “predict” and “forecast” ARE synonyms in the dictionary sense, and will remain so.  In fact their interchangeability in normal parlance is the REASON why companies end up frequently buying the wrong analytical product or service.

But in the analytics world, there ARE two very different kinds of activities hiding behind these synonyms, and it’s important not to buy one when you actually need the other.  You can end up five-to-six figures deep in the wrong thing before you discover the mistake.  I want to help you avoid that, hence this article.

The Picture IS Worth a Thousand Words

Once you’ve digested the illustration at the top of this article, yeah, you’ve kinda already got it.

  • Forecasting is when we anticipate the behavior of “Lots” of people (customers, typically) on “Long” timelines.
  • Predictive Analytics anticipate the behavior of One person (again, typically a customer) on a “Short” timeline.

So…  Macro versus Micro.

But let’s delve just a little bit deeper, in order to “cement” the concepts.  Examples will help, so let’s…

Pretend We Are Facebook!

Like many modern organizations, Facebook benefits from a mixture of both Macro and Micro.  They certainly have need to Forecast the trends in their overall business, as well as the need to Predict the behavior of individual users.  Let’s illustrate with some examples.

Examples of Forecasting Questions at Facebook

  • “How many active users will we have at the end of this year?”
  • “Three years from now, what % of our revenue will come from mobile devices as opposed to PCs?”
  • “How much are we going to spend on server hardware next quarter?”

These are all quite valid and important questions to answer.  They impact strategic planning and guidance provided to investors – about as crucial as you can get, really.

They also all tend to blend human inputs with computational power to produce their answers.  We use data and software as extensions of our brains – very often in a collaborative effort across multiple people (sometimes a handful, sometimes hundreds or more).  In other words, these aren’t the sort of things we hand to machines to handle in a fully-automated manner (at least, not today).

Types of Activities in Forecasting

The simplest form of forecasting is the “lowly” line chart:

Forecasting is Simple When the Trend is Obvious

Line Charts are an EXCELLENT Forecasting Tool – As Long as the Trend is Smooth and Obvious

(Unless Some Big Structural Change is Coming, It’s Not Hard to Guess Where THIS Trend is Headed)

Of course, trends aren’t always so smooth and obvious.  Try this chart instead:

Choppy Trends Are More Common, However, and Forecasting What Comes Next Requires Some Thought

“Choppy” Trend:  Not NEARLY as Obvious What the Next Few Months Will Look Like, So Now What?

When faced with a non-obvious trend (and most trends are NOT as obvious as the first example), here are some progressively-more-sophisticated approaches:

1) Add an automatic trendline to the chart.  This SOUNDS sophisticated, but the algorithms behind automatic trendlines are, by necessity, quite primitive in most cases.  For example, they can’t tell the difference between a one-time exception and a significant new development.  Is August 2016 (the big spike in the red chart above) due to a one-time influx of revenue from a source that won’t ever surface again, or representative of a breakthrough in the core business?  Trendline algorithms aren’t typically sophisticated enough to tell the difference – and that’s a GOOD thing, because if they started making assumptions about such things, they’d be WILDLY wrong at times – without warning.

Here’s another way to say it:  when a trend is obvious, WE can draw the trendline with our eyes.  But traditional computer-generated trendlines aren’t much better.  They do basically the same thing that our eyeballs do – they “split the difference” mathematically amongst the known historical points in order to draw a line (or curve).
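To make that “split the difference” idea concrete, here’s a minimal sketch (in Python, purely for illustration – all numbers are invented) of what a traditional auto-trendline is actually doing under the hood: ordinary least squares over the historical points, then a blind projection forward.

```python
# Illustrative sketch: a traditional computer-generated trendline is just
# ordinary least squares -- it "splits the difference" mathematically among
# the known historical points, much like our eyeballs do.

def linear_trendline(values):
    """Fit y = slope * x + intercept through the historical points."""
    n = len(values)
    mean_x = (n - 1) / 2                      # mean of x = 0..n-1
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def extend_trend(values, periods_ahead):
    """Project the fitted line forward -- blind to whether a big spike was
    a one-time exception or a genuine breakthrough."""
    slope, intercept = linear_trendline(values)
    n = len(values)
    return [slope * x + intercept for x in range(n, n + periods_ahead)]
```

Note that a one-time spike simply gets averaged into the fit along with everything else – which is exactly the limitation described above.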

2) Build a smarter trendline with a combination of biz knowledge and formulas.  Ah, now we’re getting somewhere!  And we do this sort of thing ALL THE TIME with our clients, since DAX (the Excel-style formula language behind Power BI and Power Pivot) is an absolute godsend here.  Let’s factor in seasonality, for instance – take how strong our biz has been this year versus the prior year, then multiply that by the seasonal “weight” of January (calculated from our past years’ historical data), and we’ll arrive at a much more intelligent guess for January than our auto-trendline would have generated.

And, as humans, we may know quite a bit about that August 2016 spike.  Let’s say a bunch of that revenue came from a one-time licensing deal – something that we cannot responsibly expect to happen again.  Well, a quick usage of the CALCULATE function may be all that’s required – we’ll just remove licensing revenue from consideration while building our smarter trendline formula.
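The real thing would be DAX measures in Power BI / Power Pivot, but here’s a rough Python translation of that “smarter trendline” logic – the data shape ({year: list of 12 monthly revenues}) and all numbers are invented for the sketch:

```python
# Hypothetical sketch of the seasonality-aware forecast described above.
# In the DAX version, CALCULATE would first filter one-time items (like the
# hypothetical licensing deal) out of the history; here we just assume the
# `history` dict has already been cleaned.

def seasonal_weight(history, month):
    """Average share of a year's revenue landing in `month` (0 = January),
    computed from past years' history: {year: [12 monthly revenues]}."""
    shares = [history[y][month] / sum(history[y]) for y in history]
    return sum(shares) / len(shares)

def smarter_forecast(history, month, annual_run_rate, yoy_growth):
    """Seasonal weight applied to a growth-adjusted annual run rate --
    a much more intelligent guess than a blind auto-trendline."""
    return seasonal_weight(history, month) * annual_run_rate * yoy_growth
```

The CALCULATE step matters: remove the one-time licensing revenue first, and the seasonal weights stop being distorted by that August 2016 spike.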

3) Decompose the overall trend into sub-trends.  Just like in my article on The Ten Things Power BI Can Do For You (and the live presentation version on YouTube), DECOMPOSITION is an incredibly powerful weapon, and arguably a necessary one.  If you break your overall trend out by Region, for instance, it may be easier and more accurate to forecast each Region individually, and THEN re-combine all of those “sub-trends” to come up with a macro trend forecast.  Again, this is something we have done a lot of with our clients, and the Power BI / Power Pivot toolset is fantastic in this capacity.

4) Incorporate external variables.  We often get caught up in thinking of our businesses as Our Everything, so it’s good to take a step back and remember that the outside world has at LEAST as much influence on “what’s going to happen” as our internally-focused trends and behavior.  In our Facebook example, for instance, knowing whether China is likely to suddenly allow Facebook to operate there would be a big deal.  But in a more mathematical sense, even knowing how many children will be “coming of Internet age” in the next year is an important external variable, since they represent potential new Facebook users.  Similarly, the number of people in developing countries “coming online” as cellular and broadband infrastructure reaches them is also relevant.  You can take important factors like those (as well as expected conversion rates – what % of those new Internet users will become Facebook users?) and mathematically incorporate them into your forecasting models, especially a “sub-trend” model, to arrive at an ever-more-refined estimation of the future.

5) Involve the Humans!  Given the intricacies of the sub-trends approach, it makes sense to “subcontract” each sub-forecast to a specific person or team who knows it best, have them mathematically blend internal trends with external variables, and submit their forecasts back up to you.  This allows the formulas used to be quite different in each sub-segment, to reflect the very-different realities of each segment.

This may sound chaotic, and in larger orgs, it definitely CAN be.  But our friends at Power Planner have an AMAZING system that works in tandem with Power BI when it comes to collaborative forecasting and budgeting.  We’re using it right now to help several of the world’s largest corporations develop an ever-more-accurate picture of the future.

Wrapping Up Forecasting

I think that’s enough about Forecasting for such a high level article.  I want to move on to Predictive, and then a parting observation.

So let us know if you’d like to know more about Forecasting in Power BI – whether you want to perform that in a top-down fashion or bottom-up, we’ve got the experience.

Examples of Predictive Analytics at Facebook

Forecasting vs. Predictive Analytics - Another Illustration of Macro vs. Micro (Cuz Illustrations are Fun)

Forecasting = “How Heavy Will The Traffic Be Next Month?”  Predictive = “Will the Yellow Car Get Off at the Next Exit?”

Here are some Predictive Analytics examples that are critical at Facebook:

  • Of our millions of advertisements, which one should we show to user X right now?
  • When User X is logged on via a Mobile Device vs. a PC, should we change up our ad mix, or is their behavior consistent across platforms?
  • What’s the optimal ratio for “real posts” to ads in User X’s news feed (optimal in terms of ad clickthrough rates)?

The difference in these examples leaps off the page – we’ve “zoomed in” on a single user/customer, and we’re making a decision/adjustment in real-time (or near-real-time).

But before we dive a bit deeper into HOW you do this stuff, let’s address a common question we get at this point…

“OK, but can’t we take a bajillion ‘micro’ Predictions and roll them up into a ‘macro’ Forecast?”

It’s a fair question that smart people ask:  if you get really good at micro-level Predictive, can’t you dispense with macro-level Forecasting altogether?  Well, no.  Let’s drill down on that first example – the “which ad should we show User X right now?” example, to illustrate.

(As we answer this question, we actually start to answer the “How Does One DO Predictive?” question as a bonus!)

In order to predict which ad is most-likely to “succeed” with User X, your Predictive systems need to know some things:

1) Detailed attributes of User X are a Must-Have Input for Predictive.  The obvious stuff like age, gender, and location for starters.  But also…  their history of what they have Liked.  Articles they have clicked through to read more.  Ads they have clicked.  Their Friends Lists, and the Like/Ad behavior of those friends.

2) Historical behavior of similar users is a tremendous help for filling in gaps.  Let’s say User X has only been on Facebook for a week, and we don’t really have sufficient ad-click history on them yet.  But in that week, we’ve already seen quite a bit from them in terms of Like behavior and Article Clickthrough behavior.  Combine that one-week snapshot with their basic demographics, and we can quickly guess that they are quite similar to “Audience XYZ123” – a collection of users with similar Like and Clickthrough behavior – and now we can lean on THEIR longer-term histories of ad click behavior to get a good guess at what sorts of ads User X will click.

3) Oh yeah, we also need the detailed attributes of the Ad in question!  Is it video or static?  Who is the advertiser TRYING to reach?  What are the associated keywords?  In the time we’ve been running this Ad, who has been clicking on it and who has not?

4) Predictive Analytics systems NEED to make mistakes.  Predictive systems are constantly refining their guesses based on failure.  Let’s say we think our “one week old” User X is gonna love a particular Ad based on their striking similarity to Audience XYZ123, so we serve that ad up, and not only does User X not click the ad, they actually click the little X that says “never show me ads like this again.”  Does our Predictive System give up and stop making guesses for User X?  Heck no!  Instead it’s in some sense thrilled to get this feedback, and uses it to start developing a more-accurate picture of User X.  Over time our system may even “discover” that it should “fork” Audience XYZ123 into two splinter factions – one like User X, and one that more closely resembles the original picture of Audience XYZ123.

In short, it’s worth summarizing and emphasizing…

  1. In order to make micro-level Predictions, you require a MASSIVE amount of detailed information, AND…
  2. You need to be able to make, and learn from, mistakes.

Predictive Analytics are Automated – Humans Have Little Role (Once the System is Running)

Humans SET UP Predictive Analytics Systems, and Can TUNE Them, But They Aren’t Part of the Ongoing, Automated Prediction Process

Predictive systems operate without direct human intervention.  This is in direct contrast to Forecasting, which, while highly mathematical and assisted by software, benefits GREATLY from direct human insight and input.  Sure, in a Predictive system, there might be some high-level “knobs” that you can adjust to make the system more or less aggressive, for instance, but each individual prediction happens without a human being consulted.

This is where Machine Learning, AI, and R shine.  Whereas DAX is fantastic at Forecasting, it really has no role to play in Predictive (except as an adviser in turning those high-level knobs mentioned in the prior paragraph).

This is why Microsoft has rapidly expanded its analytics offerings.  Rewind 24 months and Power BI is basically all you were hearing from Mount Redmond.  But in very short order, we now have HDInsight and Azure Machine Learning making some serious waves in the marketplace.

“Hey, you never answered the question: ‘if we master the micro, haven’t we ALSO mastered the macro?’”

Let’s start here:  how many chances do you get to be wrong in a Macro-Level Forecast?  One.  You get One.  “Q4 2017” only happens once, full stop.  Machine Learning systems need to experiment, fail, and improve.  But when it comes to Forecasts, YOU don’t get a chance to learn from mistakes – fail badly enough and you’re out of a job.  Assume that Q4 2017 is gonna look just like all of the Q4’s past at your own peril – but even if we DID do that, we wouldn’t need a sophisticated Predictive system to do so, would we?

Next, do you have all of the micro-level detail required?  Nope, not even close.  For starters, you’re gonna have new customers next quarter.  And you know NOTHING about them yet, not even the basics of their individual demographics.  And remember, I’m not talking about broad trends here, I’m talking about hyper-detailed information on each individual customer – THAT’S what you need for a Predictive system to function.  Oh, and perhaps even MORE importantly, you don’t know what Ads are going to be submitted by your advertisers next quarter (sticking to the Facebook example for a moment).

So, you can’t Predict your way to a Forecast, but also:  Forecasts need to be Explainable.  Imagine telling Wall Street that we expect 15% revenue growth next quarter.  They ask why.  And we say “um, our Machine Learning system said so, but we really don’t understand why.”  That’s not gonna fly – and yet that really is the best a pure Predictive approach can offer.  Predictive systems are far superior to human beings at their task precisely because they are INhuman.  If they could explain their predictions to us and have us understand, we wouldn’t have needed them in the first place.

So here’s an, ahem, prediction from me:  Forecasting will remain a “humans assisted by DAX-style software” activity until we eventually develop General AI.  As in, the kind that eliminates ALL need for humans to work.  The kind we’re scared might just flat-out replace and discard us, or might usher in an era of unimaginable human thriving.  Either way, when we get to that point, our problems of today are no longer relevant.

In the meantime, we can still do AMAZING focused things with Predictive Analytics.  Not every organization has a need for it yet (or is ready for it yet, more accurately), but chances are good that your org DOES.  If you want help getting started with HDInsight and/or Azure Machine Learning and related technologies, let us know and we’ll be glad to help.


Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology. 

This Post Has 11 Comments

  1. LOVE this! You have to read it in bits almost to process it… 🙂 This is such a fun example – this concept could also work for machine behavior and manufacturing trends I think …

    Thanks – this is great!

    1. Thank you Heather! These things always take MUCH longer to write than I expect (you’d think I’d have learned that by now, after 7+ years doing it), and it’s always nice to hear that people appreciate them 🙂

  2. The Data Mining Add-in that launched with Excel 2007 (and works up to Excel 2016) has some very powerful “Forecasting” and “Predictive Analytics” tool sets.

    https://www.microsoft.com/en-us/download/details.aspx?id=35578

    The reason it is not as popular as the Power BI Tools (PQ+PP) is because it requires a connection to a SQL Server DB, and only the “results” are returned back to Excel.

    Microsoft also tried a Cloud version of these tools but later realized that it would not be a very good idea, so stopped it after a few months.

    There are several feature requests (including mine) to integrate the Data Mining tool set into Power BI Desktop – if it happens, it will be a giant leap in the Power BI world, bringing “Forecasting” and “Predictions” together.

    1. I think this giant step you are talking about was already taken by the Power BI team when they incorporated R scripts into Power BI. I’m sure Python will be added in the near future as well, since it will be added in upcoming versions of SQL Server. 🙂

  3. Wonderful post, great explanation of the differences between forecasting and predictive analytics. IMO, the blend of forecasting and predictive analytics is necessary for any business to be successful. I think some legacy companies are behind the curve on this, and that’s why we see “unicorn” startups rising quickly to billion dollar status.

  4. Hi Rob
    First of all great post :)! i like your blog a lot!.

    I’m running a company in Colombia trying to deploy agile BI solutions (using Power BI and Power Pivot), but we’ve experienced some issues because in my country companies haven’t invested in ERP software, and the “data capture process” (the forms, and all the front end for collecting data from users) is still very messy (usually every employee does it his own way, in some Excel file). I have tried to solve this “on-the-go” by making some standard Excel formats, or using VBA to make some user forms, but this is not ideal because it does not allow multiple users and is not scalable or robust. That’s without taking into account that most of my clients are not willing to pay for software, and expect that I solve all of their problems related to data management. Have you experienced these issues too? What would you recommend in such cases?

    Thank U!

    Miguel!

    1. Hi Miguel – quick question as I ponder… are companies in Colombia “scared” of the cloud, or do they embrace it? Because it sounds like a cloud-based system might be your best bet. Low cost and impact on each client, so they get to feel like you HAVE solved all their problems. But before I think about it some more… if they are nervous about submitting data into the cloud, this direction might not be viable.

      1. Hi again Rob

        Some of them are scared, because the “small local businesses” out here are used to collecting data on paper sheets. But this is not the real problem; the problem is that the solution for each customer is so specific that we would have to develop a cloud-based system for each one of them. And this would probably be a whole different business – or even if it were the same business, it would not be “agile”.

        Thanks for your answer.

        Miguel

  5. Another valuable article Rob. Here’s a question that you may know the answer to:

    I recall at the last but one Data Insights Summit a presentation where the notion of democratising machine learning by making it available in Modern Excel was mentioned
    Do you know what the latest is on this?

    I think it could be incredibly valuable to start off at a more accessible / small scale level as Azure Machine learning might not be available and / or could scare some folk off

    In the same way ETL and data modelling have arrived in the hands of citizen data scientists I’m looking forward to seeing this stuff come out of the niche / specialist domain

    1. Anthony, there’s so much going on at MS these days that I don’t keep tabs on that particular question… but I should 🙂

      Separately, this represents a current frontier in my own professional wisdom. In 2006, if you’d asked me “can the development of OLAP models be made accessible to Excel power users,” I’d have said “probably not.” And then even in 2009, after I’d spent a couple years attempting to do precisely that (working on the Power Pivot project at MS), my answer would have been upgraded to “maybe” or “I hope so.” It wasn’t until early 2010 that my answer became a solid YES – and in fact, MORE than yes, because it wasn’t just EQUIVALENT power that we were putting in the hands of the analyst. It turned out to be BETTER than traditional OLAP.

      ETL has also turned out to be yes – although raw M *is* quite a bit more “alien” than DAX, Power Query offers a lot of friendly buttons for common transformations. They still need a lot more buttons IMO, because M will never be as approachable as DAX. So let’s call this one “mostly yes, still in progress.”

      Can ML be similarly brought to the “pivot and vlookup” crowd? Certainly seems promising. My biggest question is whether the “art” of it is too subtle, kinda like Statistics, where we can accidentally or willfully lead ourselves astray. (Of course, if our predictive systems continually fail to predict customer behavior, that will provide us humans an obvious feedback loop, so the danger doesn’t run too terribly deep does it?)

      So I’m cautiously optimistic.

  6. Hi Rob, it’s interesting that you are using bottom-up approaches to reconcile high-level forecasts (in a hierarchy) with lower-level forecasts. You should explore some other methods of hierarchical forecast reconciliation, such as top-down and middle-out approaches. Given that you can use R with Power BI now, you should explore the thief package that helps you implement these approaches: https://robjhyndman.com/hyndsight/thief/. This will surely be a nice tool to add to your toolset.

Leave a Comment or Question