“Go. Go find the balance… Banzai, Daniel-san! Banzai!”
So… Are We Data Scientists or What?
On Tuesday I introduced the notion of a Data Scientist – it’s a hot new field, there’s a huge shortage of qualified people, and maybe PowerPivot gives us a shot at some of the action.
So… are we Data Scientists? Are we allowed in their hip new club?
Being Careful: Things We Are Not
In our quest for broader horizons and appreciation, Excel Pros need to be careful – if we overdo it, we may look silly. So first let’s cover a few things that we don’t do, and make sure we don’t advertise ourselves as doing them – because to many people, “Data Scientist” implies exactly these things.
“Bad” News #1: We are not machine learning specialists
We don’t write software that teaches robots how to clean houses, or that allows a helicopter to fly itself. In an article titled “Why becoming a data scientist might be easier than you think,” the centerpiece is a video by a Stanford professor who is doing precisely those sorts of things:
This Course Sounds Amazing. And Way Over My Head.
“Bad” News #2: We are not “real-time recommender” programmers
Related to the above: the term “data scientist” sprang up out of companies like Amazon, Google, LinkedIn, and Facebook – companies that have a strong interest in analyzing the behavior of individual people and using that analysis, in real time, to provide each user with a different experience.
“Pages most likely to answer your Google search,” “People you may know” and “Other items that may interest you” – good features for sure. And it’s not like a PowerPivot Pro could do that for you.
But if you’re a data scientist at one of those kinds of companies, you are likely to be doing that kind of work. So don’t walk up to Google and say “hire me, I’m a data scientist because I know PowerPivot.”
“Bad” News #3: We are not sentiment analysts
Standard problem for brands: what is Social Media saying about my product?
Imagine scanning millions of tweets and Facebook posts trying to determine if people are saying positive or negative things about my product.
Are these people saying good things or bad things about these products?
Can you write a formula? Me neither.
“Bad” News #4: We are not linear regression / statistics gods
Listen, my college statistics course was at the 8 am time slot in the dead of a particularly cold Nashville winter, and I was an underachieving “who needs the class when you have the book” smart aleck from Florida. How often do you think I attended that class? (Note that I also didn’t like learning from books, but was unaware of the irony in my position. I became more disciplined as the years wore on, but it was a slow process.)
In all seriousness, even if I had been an eager attendee of that course, I still don’t think that would have changed much. For me, the following chart is enough to show me that good weather probably has a negative impact on sales of cold medicine:
Units Sold %Change vs. Prior Year (Top)
Compared to Weather Change vs. Prior Year (Bottom)
This would never, ever be acceptable to a Statistics PhD, nor would it constitute proof in a true scientific context. It falls under the headings of Good Enough and Probably Significant, which is the reality most of us deal with.
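If you ever want a rough number to go with the eyeball test, here is a minimal Python sketch of the same idea: compute each series as a change versus the prior year, then check whether the two move in opposite directions. Everything in it is an assumption for illustration. The numbers are made up (they are not the data behind the chart), the function names are hypothetical, and treating "weather change" as a temperature difference is my own simplification.

```python
import math

def pct_change_vs_prior_year(current, prior):
    """Percent change of each period versus the same period one year earlier."""
    return [(c - p) * 100 / p for c, p in zip(current, prior)]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Made-up illustrative numbers, NOT the data from the chart in this post.
units_this_year = [120, 95, 130, 80, 110, 70]
units_last_year = [100, 100, 100, 100, 100, 100]
temp_this_year = [30, 42, 28, 50, 35, 55]   # hypothetical average temperatures
temp_last_year = [35, 40, 35, 40, 35, 40]

units_change = pct_change_vs_prior_year(units_this_year, units_last_year)
weather_change = [t - p for t, p in zip(temp_this_year, temp_last_year)]

r = pearson(units_change, weather_change)
print(f"correlation between sales change and weather change: {r:.2f}")
```

With these invented numbers the correlation comes out negative, which is the "warmer weather, fewer cold medicine sales" pattern. That is still squarely in Good Enough territory: a single correlation figure is not proof of anything, and a Statistics PhD would want far more than this.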
GOOD News #1: But there is always a spectrum!
Not All Data Needs are Like That!
But enough about the things we are NOT, let’s focus on why I think we do have a legitimate claim on the Data Scientist title, as long as we don’t oversell it.
Let’s return to that table from the prior post:
The term “Data Scientist” is still new enough that most people have never heard it,
but we’re ALREADY in the process of subdividing it!
I don’t think the table above shows that Data Science is a sham, nor do I think it’s silly of the Constellation folks to create it. Quite the opposite, actually – I think it reflects a lot more wisdom than most people possess.
Here’s a diagram I’ve been using for a long time now:
Traditional BI: It Just Isn’t Used as Broadly as Excel, Not by a Longshot
(Don’t think of this as a bell curve, think of it as a mountain,
with BI representing Peak Sophistication)
Well, I think Data Science is going to shape up the same way:
My Prediction for Data Science
(Same Disclaimer: Think “Mountain” and “Peak Sophistication” not “Bell Curve”)
In other words, the bright shining center of Data Science is the Silicon Valley, Google/Facebook/Twitter, Type I stuff. OK.
But don’t you FEEL it? The world is changing beneath our feet! Data is everywhere. Data is HOT. People are waking up to the need to be smart – the easy money is all made, we’re into the “hard” stuff now. Being smart is becoming increasingly critical.
The mere EXISTENCE of a term called Data Science, and that it has gained so much traction so quickly, is HUGE news for us! The fact that all of the articles about it emphasize business acumen, curiosity, and liking data – also HUGE for us! That we are simultaneously being handed a tool set with dramatically expanded capabilities couldn’t be a more fortunate case of timing, too.
A Series of Questions That Illustrate What I Mean
Does every company drowning in data have Facebook-style problems? Heck no! Even mid-size businesses are absolutely awash in data, and they are waking up to how much they are leaving on the table. They are waking up to how much value there is in being smart.
Are there enough PhD data scientists in the world to address all of the world’s data problems? Heck no! And there never will be, just like there were never enough BI pros.
Is the term “Data Science” capturing people’s imaginations because everyone has sentiment analysis or machine learning problems? Heck no! From a business owner’s perspective, I think the “hotness” of the term owes to the intersection of the following psychological factors:
- “I am now realizing the potential value of data, of being smart”
- “Data Science is a term I can understand, that sounds like it can help me, and doesn’t scare me.” (“Business Intelligence” was always too cold and “standoffish” for its own good).
- “It doesn’t sound like yesterday’s stuff. It sounds new and magical. It does not sound like a stuffy and dusty office full of archaic spreadsheets. It sounds like alchemy. It sounds nimble and sharp.”
And PowerPivot very much lives up to that third one.
If a company has a “real” Data Scientist, does that mean PowerPivot Pros have no role to play? Heck no! Can you imagine the beautiful things we can do cooperatively? Example:
- Data Scientist preps the data in ways I cannot.
- I build metrics, reports and analyses at the speed of thought, against that data, and identify theories, hunches and patterns.
- Data Scientist then runs sophisticated, targeted analyses on those hypotheses, validates or rejects them, and then perhaps implements algorithms that allow the business to leverage those findings in real time, to optimize efficiency.
Is it better to be 100% correct and 100x slower? Heck no! Being 90% correct in 1% of the elapsed time will always be very, VERY valuable. And we are very good at that.
Is PowerPivot a static entity? Will we never get new capabilities? Heck no! My former colleagues in Redmond are not ones to sit idle. They know they are onto something big with the PowerPivot, Excel-Pro-Focused approach. And they are going to constantly chip away at the ivory tower on our behalf.
(Don’t worry about the ivory tower though – it’s a renewable resource and always keeps getting taller, even as we chip away at it).