
So How Many Servers Do I Need in My PowerPivot SharePoint Farm?

As people increasingly move up from just dabbling with the add-in, and decide to start leveraging the publish/schedule/share/secure benefits of the PowerPivot for SharePoint infrastructure, I am getting this question more frequently.

How many servers?  How much RAM?  And less often, but just as important…  how many CPUs?

Time for an old joke:  “Ever hear the one about the statistician who drowned in a river that was, on average, only 3 inches deep?”

That’s the whole joke.  And as time goes on, I only find it funnier.  And funnier.  But first, some basics.

The Three Primary Server Roles

For the vast majority of PowerPivot deployments, you will mostly need to concern yourself with three server types:  Excel Services, PowerPivot Engine, and Data Source DB Servers:

PowerPivot Server Roles Summary For RAM and CPU Planning 
(Click for Larger Version)

A few notes on the diagram up front:

  1. I recommend viewing the larger version – there is information on the diagram (in the notes) that I will not repeat below.
  2. Don’t read too much into the 3/4/2 ratio of Excel Services/PowerPivot/SQL – I included different numbers of servers in each tier intentionally, to illustrate that you can scale each tier out independently.  But, for instance, I do NOT expect you will need 3 Excel Services boxes for every PowerPivot box (you will likely need fewer).
  3. Although separated above, server roles CAN be combined onto single servers – for instance, many folks can get by with an “all in one” server where everything above lives on a single box.  Also, even in a multi-server farm, one of the MS-recommended configs is to combine Excel Services and PowerPivot onto a single box, and then deploy as many of those combined boxes as needed.
  4. There are some elements missing from the diagram – for instance, there is a Web Front End (WFE) role, and a SharePoint Configuration Database role, but in my experience, PowerPivot does not put unique strain on those elements (except for config db disk space).  Your mileage may vary of course.

Early Planning Efforts

Back in the Spring, I sat down to the very serious task of “how much hardware will we need on our SharePoint servers in order to handle user loads?”  And naturally, I made a very serious spreadsheet to model it out:

PowerPivot SharePoint Hardware Planning Spreadsheet v1

Wow, what a spreadsheet.  I mean, I even used the =POISSON.DIST() & =BINOM.DIST() functions – clearly, such spreadsheet horsepower indicates accurate results!
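For the curious, the flavor of that modeling can be sketched in a few lines of Python. Every number below is an invented assumption, just as mine were – which is exactly the problem the next paragraph describes:

```python
# A minimal sketch of the Poisson-style concurrency estimate the
# spreadsheet attempted: given an average query arrival rate, how
# likely is it that more than `capacity` queries are in flight at once?
# All inputs here are invented example numbers.
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam ** k * exp(-lam) / factorial(k)

def p_overload(capacity, lam):
    """P(more than `capacity` simultaneous queries)."""
    return 1 - sum(poisson_pmf(k, lam) for k in range(capacity + 1))

# Assumed: 200 users, each with a query in flight ~2% of the time,
# giving an average of 4 concurrent queries.
lam = 200 * 0.02
print(f"P(> 8 concurrent queries) = {p_overload(8, lam):.3f}")
```

The math itself is fine – the garbage-in/garbage-out inputs (user counts, query rates) are where such a model falls apart.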

Turns out, that spreadsheet was nothing more than an amusing theoretical exercise with little bearing on the real world.

There were multiple problems with that spreadsheet:

  1. It relied on outright guesses as to how often/how intensely consumers would utilize the published reports.
  2. I was not yet aware of the CPU-gobbling power of slicers, a power that makes real-world queries many times more CPU-intensive.
  3. Most importantly, it assumed peak usage would occur in the morning, when consumers are most in need of fresh information.

Solution for 1) and 2)

These are, in some sense, the simplest to address: stand up a pilot solution and observe the usage characteristics.  You will learn a lot about usage patterns, as well as what those patterns do to CPU and RAM.

But even better:  you may choose to simply ignore these factors for now and focus on problem 3 instead.  Here is why:

Peak Load is Probably Experienced During Scheduled Refresh

If you’ve looked at the larger version of the diagram above, you have seen a hint as to what I’m about to say here: 

At Pivotstream, our nightly refresh process puts far more strain on our servers than our users do, and is what we now plan our hardware around.

I will explain further, so you can evaluate whether your situation will be similar.  We receive new data on a nightly basis (typically late at night).  That data gets ingested into SQL server, and then our PowerPivot refresh process begins.

Once PowerPivot refresh begins, we have about 6 hours, tops, to get all models and reports refreshed, so that when business opens in the morning, everyone has access to fresh insights.  6 hours sounds like a lot…  until you discover that it isn’t.

A lot happens during scheduled refresh!

Keep in mind that the PowerPivot model (the embedded database living inside your PowerPivot workbooks) must first be refreshed – let’s call this phase “Model Refresh.”  This can put a lot of strain on your SQL servers just in terms of supplying the data, since PowerPivot v1 pulls a fresh copy of all tables (no incremental refresh).

As the data flows into the PowerPivot servers, a LOT of CPU power goes into compressing that data into the storage format.  During the refresh process, RAM usage steadily climbs as well.

As model refresh nears completion, RAM usage spikes upward quite a bit, sometimes by as much as 50-100% of the size of the resulting workbook.  CPU usage also spikes during this phase.
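To make that concrete, here is a tiny back-of-the-envelope RAM budget. The 50-100% spike factor comes from the observation above; the workbook sizes and concurrency are invented example numbers:

```python
# Rough RAM budget for concurrent model refreshes on one PowerPivot box.
# spike_factor=1.0 models the worst case (a spike of 100% of the
# resulting workbook size); 0.5 models the low end.
def peak_ram_gb(workbook_gb, spike_factor=1.0):
    """Peak RAM for one refreshing model: steady-state size plus
    the end-of-refresh spike."""
    return workbook_gb * (1 + spike_factor)

# Assumed: three models of these sizes (GB) refreshing at once.
workbooks = [0.5, 1.2, 2.0]
worst_case = sum(peak_ram_gb(w, spike_factor=1.0) for w in workbooks)
print(f"Worst-case RAM if all three spike together: {worst_case:.1f} GB")
```

The point of the exercise: RAM planning based on workbook sizes alone will undershoot, because the spike roughly doubles the worst-case footprint per model.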

Once the model is done refreshing, PowerPivot for SharePoint then triggers a refresh (via Excel Services) of the Pivots and Cube Formulas in the workbook, so that when thumbnail screenshots are “snapshotted” for the report gallery, those reflect the latest data (let’s call this phase “Pivot Refresh”). 

Yes, that process initiates via Excel Services, but that refresh immediately results in a meaty set of queries sent back to the PowerPivot servers for processing.  So, CPU usage spikes again.  And the model in question is held in RAM while that happens, so that RAM can’t be recycled into the pool to be used for model refresh.

Remember, this all happens for a single workbook.  And if you hang a bunch of report-only workbooks off of a single “hub” model workbook (as we do all the time), the amount of time it takes to finish Pivot Refresh can actually exceed the time for Model Refresh.

If you have to cram all of this into a narrow nightly window, chances are that you will need more hardware to pull it off than you ever will need for normal daily usage!
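A quick way to sanity-check that claim against your own numbers is a sketch like the following. All counts and timings here are invented examples, not measurements – plug in what your pilot tells you:

```python
# Back-of-the-envelope check: does a nightly refresh fit in the window?
# Covers both phases described above: Model Refresh (per model workbook)
# and Pivot Refresh (per report workbook hanging off those models).
def refresh_hours(n_models, model_min, n_reports, pivot_min, parallel_slots):
    """Wall-clock hours to refresh everything, assuming `parallel_slots`
    refreshes run at once and the work divides evenly."""
    total_min = n_models * model_min + n_reports * pivot_min
    return total_min / parallel_slots / 60

hours = refresh_hours(n_models=40, model_min=15, n_reports=120,
                      pivot_min=5, parallel_slots=4)
print(f"Estimated wall-clock: {hours:.1f} h (window is 6 h)")
```

Even with generous parallelism, the report-only workbooks (Pivot Refresh) can contribute as much total time as the models themselves – which is why a 6-hour window stops sounding like a lot.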

Rough Guidelines for Your Own Situation

This post is running a bit long, so I will try to be succinct in closing:

  1. If a significant percentage of your PowerPivot models and reports will be refreshed nightly, refresh is likely to be a larger peak strain than normal interactive usage.
  2. Even if you only refresh weekly (or less often), if the execution window between “data is ready” and “reports must be ready” is still essentially a single night, then refresh is still likely to drive your peak hardware need.
  3. If you think “Peak Load = Refresh Process” is likely the case for your organization, I suggest ignoring interactive usage projections during your hardware planning process.  Instead, develop some prototype workbooks (with rich data sets and multiple report sheets featuring 100% realistic slicer sets, measure complexity, etc.), and then put those workbooks through the refresh process to get a baseline.
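The baseline suggested in point 3 can feed directly into a first-cut estimate of server count. A minimal sketch, assuming you have measured an average refresh time per model from your prototypes and that each PowerPivot box runs a couple of refreshes concurrently (all numbers below are invented):

```python
# First-cut PowerPivot server count from a measured prototype baseline.
# `measured_min_per_model` would come from timing your own prototype
# workbooks through scheduled refresh; everything here is an example.
import math

def servers_needed(n_models, measured_min_per_model, window_hours,
                   refreshes_per_server=2):
    """Boxes required to finish inside the window, assuming each box
    runs `refreshes_per_server` refreshes at once."""
    total_min = n_models * measured_min_per_model
    capacity_min = window_hours * 60 * refreshes_per_server
    return math.ceil(total_min / capacity_min)

print(servers_needed(n_models=60, measured_min_per_model=25,
                     window_hours=6, refreshes_per_server=2))
```

Treat the result as a starting point for a pilot, not an answer – the whole lesson of the spreadsheet above is that unmeasured inputs produce confident-looking nonsense.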

Rob Collie

One of the original engineering leaders behind Power BI and Power Pivot during his 14-year career at Microsoft, Rob Collie founded a consulting company in 2013 that is 100% devoted to “the new way forward” made possible by Power BI and its related technologies. Since 2013, PowerPivotPro has rapidly grown to become the leading firm in the industry, pioneering an agile, results-first methodology never before seen in the Business Intelligence space. A sought-after public speaker and author of the #1-selling Power BI book, Rob and his team would like to help you revolutionize your business and your career.

This Post Has 9 Comments
  1. Rob,

    A week or two ago you posted something on Twitter saying that you felt MS should be focused on nightly refresh. Is this why?

    I did a pretty rudimentary test with PowerPivot for Excel… I’ve never monitored the server-side refresh process, but I would expect the results are similar. Now, I just eyeballed this through Resource Monitor, so there’s lots of room for interpretation, but…

    Kicking off the “Refresh All” from PowerPivot causes the SQL server proc to jump immediately. I saw numbers from 10% to 38% of CPU, with the majority in the 15%-20% range. There was similar activity on my local machine as the data was refreshing. What was really interesting to me though is that the only RAM spike I saw was a VERY brief one on the client.

    I can see, based on this, that IT might be a little concerned if you had lots of power users refreshing models at will against the SQL data store. In those cases it may be preferable to push for scheduled refresh nightly to alleviate the load on the server. On the other hand, if you have a dedicated SQL server for PowerPivot refreshes, or an environment where the CPU usage isn’t routinely intensive, you may be able to get away with infrequent contention. I guess it depends on how big/active your BI infrastructure is, and how much it draws on the SQL servers…

      1. Fair enough. Personally though, if I were going to put my money into developing new features, it would be allowing the Excel pros to set up their scheduled refresh, which they currently can’t do without SharePoint.
        Whether it be a flag to “Refresh on open”, like you can set on PivotTables, or VBA, something is needed on the non-SharePoint side. Personally I think VBA would be a more robust investment, as then we can at least control the refresh to some degree if we need to (or even disallow it during certain time periods), but honestly, both features should be implemented.

        Once I had that nailed down, THEN I’d focus on the efficiency. My reasoning for this is that you can pour huge amounts of labour into the efficiency pool forever, trying to eke out an extra millisecond of performance, and nothing will ever be good enough. Consuming those resources at the expense of a critical feature does not make sense to me.

  2. The only reluctance on the VBA front, I think, is that the client is free.

    MS decided to give away the benefit of PowerPivot for Excel, which is a pretty big thing to give away. But certain features were reserved for the server, which is not free. When Amir first said they were planning a VBA method for local refresh, I was pretty surprised and skeptical.

    They have since gone quiet on that topic. I have not asked anyone at MS about it, but my suspicion is that they have reverted to reserving refresh for the server.

  3. Oh, and by the way, my original tweet was aimed at people deploying PowerPivot, not MS. My point was that people deploying it should focus their own hardware planning around refresh rather than interactive usage, but I left the “who” ambiguous so I see why you interpreted that to be aimed at MS.

  4. Is there a site where I can find, or can you give me, an estimate of the cost for a test environment…?
    I was thinking about an all-in-one box with SharePoint for about 25 users,
    so I can promote the PowerPivot/SharePoint combo to my management…

      1. By test environment I meant a SharePoint installation with 25 users.
        We have all the licences for SQL etc., as we are a big company operating worldwide.
        Using servers outside our own company is out of the question.
        So I need to know how much SharePoint would cost, and what the costs per user are.
