I answered some forum questions recently that were similar in nature and I thought they would make a good blog article. The scenario is this: there is a table that contains information about when an employee was absent from work like shown below.

image

The question is “How many working days were they off work giving consideration to both weekends and public holidays?”.

The trick to solving this problem is to create a good quality calendar table and use the calendar table to validate the days off.  I explain the process below.

Find a Good Starting Calendar

I decided to resurrect an article with a Power Query Calendar table that I wrote 18 months ago here: https://powerpivotpro.com/2015/02/create-a-custom-calendar-in-power-query/

I have simply cut the code from this previous article above and pasted it into a new blank Power Query as the starting point for this post today.  If you want to start from the beginning, go back and read the article above first so you learn the steps to get to my starting point.  Here is a sample of what this calendar looks like as is.  Note the first thing I then did was to apply a filter to remove the year 2015 (which I don’t need).

image

I then wanted to enhance this calendar so it contained a column that indicates “working days”.  In my demo example, a working day is any day Monday to Friday that is not a public holiday.  So that implies I needed to do 2 things. First I needed to create a weekday column and secondly I needed to load up the Public Holidays.

Weekday Column

The weekday column was easy. I simply went to the last step in Power Query, added a new custom column, and added code as shown below (remember PQL code is case sensitive).

image

This added a new column that returns 1 for weekdays and 0 for weekends.  The reason I selected 1 and zero is that it is then easy to simply add up the values in this column to work out how many working days there are.

Grab the Public Holidays

The next step is to add the public holidays.  Now I didn’t want to have to do this manually, so I went to my trusty friend Google and found the information I needed online (for NSW Australia).  Of course you would need to find your own suitable source.

http://www.industrialrelations.nsw.gov.au/oirwww/NSW_public_holidays/NSW_Public_Holidays.page

image

I created a new query\From Web Page, and Power Query gave me the following

image

I promoted the first row to headers, and then I had to merge the date and the year columns so I had a full date column I could use.  Once I had a text representation of the full date, I was able to convert this merged column into a Date format.

image

I then un-pivoted the columns to get this.

image

And finally I selected “Close and Load to”, “only create connection”.

image

The next step was to join this new Public Holiday table to my Calendar table.  This was quite simple – within Power Query I right clicked on the Calendar table query and then selected “Merge”.

image

I merged the 2 queries on the Date column using a Left Outer Join (the default as shown in 1 below).

image

Note that there are only 10 rows that match.  This is because my Public Holidays are for the years 2016 – 2018, yet my calendar table is designed to show up to today’s date only.  Into the future in 2017/18, the query will find additional matches and it will just work.

After merging the 2 queries, I then expanded the new column (shown as 1 below)

image

I made sure that I only selected the holiday name (2 above) and deselected the “use original column name as prefix” (shown as 3 above).

image

As you can see above, my calendar at this point has the names of the holidays as a new column.  All I had to do was write a new custom column that concatenates the weekday information with the public holiday information to give me a “working day” column.

Note how I used spacing, line breaks and brackets in the formula window below to make the code more readable.  It will work without doing this, but good luck trying to write and debug the formulas 🙂

image

As you can see below, this new column returns the 1 from the Weekday column except if there is a public holiday, then it returns 0.

image

The last step was to change the data types of these columns to be “whole number” rather than “Any”.  Once I had loaded these 2 tables into Power Pivot, I had this.

image

Importantly, I have not joined the 2 tables with a relationship.  A relationship can’t work in this scenario because there is a range of dates in the Absence Table (from date, to date) rather than a list of dates.  Given there is no relationship, it is necessary to write a DAX formula that looks at the Absence Table then filters the Calendar table so that only the days between the First Day Off and the Return Date are showing. Then add up the Working Day column to work out how many days off.

Once I placed the Days Away formula in a Pivot Table, it looks like shown below, and it correctly handles the weekends and public holidays.

image

Here is the formula I wrote – it looks complex but let me explain how it works.

image

  • Calculate always executes the filter portion of the formula first (lines 4 through 8 in this  case).
  • The FILTER function is an iterator.  It iterates over the Calendar table (line 5) one row at a time.  There are no filters on the Calendar table, so there is no need for an ALL(Calendar) in this case.  As FILTER iterates over the table it checks each row in the calendar table to see if the date of the current row in the Calendar table is on or after the MAX of the first day off (line 6) and also if the date is before the return date (line 7).  If both of these things are true, then FILTER keeps the date in the filtered copy of the calendar table.
  • After iterating the entire calendar, FILTER returns a filtered copy of the Calendar table that contains just the rows needed for the calculation.  Some of these days will be working days (indicated with a 1) and some will be weekends or public holidays (indicated with a 0).
  • The last step is for CALCULATE to add up this “working days” column (as currently filtered) to work out how many working days the employee was away from work.

Here is my workbook if you would like to download it and take a look (Excel 2016 format). working-days-off-work

  Subscribe to PowerPivotPro!
X

Subscribe

Matt Allington

Matt Allington is a Microsoft MVP specalising in Power Pivot, Power BI and Power Query Consulting and Training based in Sydney Australia. Visit Matt's blog here. 

This Post Has 12 Comments

  1. If you want to connect your calendar table (eg Calendar[Date] and AbsenceTable[Return Date]):

    =CALCULATE(SUM(Calendar[Working Day]);FILTER(ALL(Calendar);Calendar[Date]<=MAX(AbsenceTable[Return Date]));ALL(AbsenceTable[First Day Off]))

    CALCULATE(SUM(Calendar[Working Day]);FILTER(ALL(Calendar);Calendar[Date]<=MAX(AbsenceTable[First Day Off]));ALL(AbsenceTable[Return Date]))

  2. DATESBETWEEN() rather than FILTER() is likely to be a significant performance improvement, though with something this simple I’m unsure if it’s worth it in the formula you’ve posted.

    I’d probably refactor to something like the following (untested, but I’ve used the pattern before):

    SUMX(
    AbsenceTable
    CALCULATE(
    SUM( Calendar[Working Day] )
    ,DATESBETWEEN( Calendar[Date], AbsenceTable[First Day Off], AbsenceTable[ Return Date] )
    )

    This would give you a grand total that behaves appropriately and behave the same for detail rows. It also handles the situation where one employee has two non-consecutive absences.

    1. Hi @Gregory and @Matt,

      I have used the same approach on a project for a research Lab, where weighted averages needed to have some dates removed because of not reaching a certain threshold value. The approach is very performant and scales very well. At the time the mainly historical date was loaded we ended up having some 25K rows of skip-intervals across all measurement IDs/threshold codes/and-a-attribute-I-dont-remember. I tried just for the heck of it to run an average on a mock measurement/trial with unfiltered intervals (this doesn’t really make sense in the real world) and the above DAX measure never missed a beat.

      One thing though. If your calendar table has Mark as date table activated (that was the case on my project) you’ll have to cater for the fact that the DATESINBETWEEN() function will wipe out the filter context of the calendar table so you’ll have to reintroduce it like this:

      =
      SUMX (
      AbsenceTable,
      CALCULATE (
      SUM ( Calendar[Working Day] ),
      DATESBETWEEN (
      Calendar[Date],
      AbsenceTable[First Day Off],
      AbsenceTable[ Return Date]
      ),
      VALUES ( Calendar[date] )
      )
      )

      1. … sorry I didn’t get the last part of the code inserted …

        or this:

        =
        SUMX (
        AbsenceTable,
        CALCULATE (
        SUM ( Calendar[Working Day] ),
        KEEPFILTERS (
        DATESBETWEEN (
        Calendar[Date],
        AbsenceTable[First Day Off],
        AbsenceTable[ Return Date]
        )
        )
        )
        )

  3. Alternative solution, if you want to transform your AbsenceTable is to simply make a fact table of [EmployeeKey], [Date], [Present], [NormalWorkDay]. This would participate in a simple star schema with Employee and Date as dimensions off of the fact table.

    Yes, your fact would be *much* larger in size, but this narrow tall table is exactly the type of table where Vertipaq compression shines.

  4. If you add a Cumulative Working Day running total column to your Calendar table you can simply deduct the value for the Start Date from the value on the End Date to get the Working Days Off value. This is much more performant for Measures (as opposed to Calculated Columns).

  5. /*

    Hi Matt,
    Thanks for another fine lesson in (mostly) PQ. BTW, I see you reference “PQL”; has it caught on?! Anywho… The thing that has me curious is that there are always multiple coding solutions for these universal logic challenges.
    This use case (# “working” days between 2 dates) is as common as the blade of grass. And the calendar/date dim table is the foundation of (almost) all financial reporting solutions.
    Why then are we continuously reinventing the wheel?! Where is the consensus, and I use that word loosely, date table and accompanying functions?! (Russo/Ferrari ?! Collie/Singh ?!)
    For the (technically) elite out there, I believe we are better served by sharing best practices and performance insights and suggestions rather than repetitively redesigning. The lack of proper development environments makes these platforms challenging enouogh, especially PQ.
    Each blog/challenge is potential fodder for some cleverness but the majority are just alternative code products with little or no insight as to why we should use them.
    There are at 5 alternatives here, all worthy, I assume simply becasue they work. But which works best (and what is the criteria?!). Here’s another which directly utilizes your date table found in:

    https://powerpivotpro.com/2015/02/create-a-custom-calendar-in-power-query/

    Lightly tested but you’ll get the gist. Of course you can retro this solution into your Calendar table with an added logical type column

    Thank you again, Matt, I’ve learned much from your blogs. And I thank each of you all for your value-added comments.

    */

    let NetworkDays = ( FirstDateOff as date, ReturnDate as date ) as nullable number =>

    /* This block of code included for convenience of demonstration only; in PROD, WorkingDayList sourced from Excel (or elsewhere); ROC to Holiday table s/b once a year */
    let
    Source = Excel.CurrentWorkbook(){[Name=”Calendar”]}[Content],
    #”Changed Type” = Table.TransformColumnTypes(Source,{{“Index”, Int64.Type}, {“Date”, type date}, {“Day”, Int64.Type}, {“Day Name”, type text}, {“Month Number”, Int64.Type}, {“Month Name”, type text}, {“Quarter Number”, Int64.Type}, {“Quarter Year”, type text}, {“Short Year”, Int64.Type}, {“Year”, Int64.Type}}),
    HolidayList = { #date( 2016, 11, 11 ), #date( 2016, 09, 05 ), #date( 2017, 01, 02 ), #date( 2016, 12, 26 ) }, /* VETERANS DAY, ARMISTICE DAY, CHRISTMAS ’16; New Years ’17 for testing only */
    WeekendsRemoved = List.Select( Table.Column( #”Changed Type”, “Date” ), each Date.DayOfWeek( _ ) 6 /* Saturday */ and Date.DayOfWeek( _ ) 0 /* Sunday */ ), // Remove weekends
    WorkingDayList = List.Difference( WeekendsRemoved, HolidayList ), // Remove holidays but why do I have to Transform AGAIN…grr

    /* Function proper */
    NumberOfDaysOff = if ( FirstDateOff > ReturnDate ) or ( List.PositionOf ( WorkingDayList, FirstDateOff ) = -1 ) or ( List.PositionOf ( WorkingDayList, ReturnDate ) = -1 ) then null
    else
    List.PositionOf ( WorkingDayList, ReturnDate ) – List.PositionOf ( WorkingDayList, FirstDateOff ) as number
    in
    NumberOfDaysOff
    in
    NetworkDays

    1. PQL hasn’t caught on. I’m hoping still that MS will address this soon with anything other than the current approach.

      As for “one best way” to do stuff, I think this will come with a different cost. Whenever there are standards there are committees to review and agree. The overhead can be more than the benefit. There is more than one way up each mountain. Some options are better than others and there are many good options. JMO

  6. Thanks Matt, but using your analogy, some routes up the mountain are dangerous, slow, and dark. 🙂 I’m sure we’ll see more I-I over time. You might consider expanding your solution to include generating comp days based on an employee working holidays and weekends. And to make it a bit more challenging, something other than 1-1, perhaps a stepped or tiered model. Many thanks.

Leave a Comment or Question