Intro by Avi: We know that Power Pivot and Power Query are extremely useful tools in the Finance and Accounting world. Today we have a guest post by Darren Leitzke outlining a very useful scenario and sharing some good tips and tricks. Take it away Darren…

Working in accounting reminds me of all of those movies set 40 years in the future: when that year actually comes around, it turns out the movie grossly exaggerated the technological advances. Likewise, I thought that once I made it to a large corporation, they would have all of the cool tools: the newest version of every software package, the most up-to-date computers, the fastest databases, and of course more help desk support than I could ever use. When I worked behind the scenes on-campus, I thought to myself, “This is a public university. Of course it is going to have older software.”

You want me to use Excel 2003?

When I got to a large corporation, I thought that using Excel was just for the departments that were “behind”. It took a while to realize that it was the most valuable tool I would use in my career. Excel skills weren’t just a nice-to-have; they were a need-to-have, and the more I learned about Excel, the more valuable I felt in the organization. Give me a wireframe and some time and I could create an application that turned data into valuable information for the managers running the business, and it didn’t take me months. Programming was now the nice-to-have.

Excel was still lacking some capabilities, so I was very excited to hear about Power Pivot. That led me to this site and this site led me to Power Query. Power Query is a game-changer and has saved me from hours of repetitive copying and pasting as well as macro recording and writing. But you may not be using it because you’re thinking, “Power Query is only for clean databases or table data… We don’t even have CSVs yet!”

Door #1 or Door #2?

You have two options when dealing with legacy reporting:

  1. Get IT to help you with the source data and figure out how to re-create the information
  2. Get Power Query and turn your text file into a Pivot dream

This post is going to address Option 2 for anyone that has fixed-width legacy reports lying around the organization. (Either way, you probably need IT, so be nice to those guys!)

Enough chit chat… Let’s light this candle!

You’ll need the following to repeat this exercise:

1 – Power Query or Power BI Desktop

2 – Data in a fixed width format (available here)

3 – Patience (or at least focus)

Here is what our file looks like:

1…2…3…4…Don’t want manual work no more!

You need to start by opening the text file in Notepad (or some other text editor) and counting the character positions where each column starts. For example, in the following line, you would count 0, 10, 20…

Darren    Wrote     This post

There are 10 characters from the start of Darren to the start of Wrote, and another 10 from the start of Wrote to the start of This post, so the start positions are 0, 10, and 20.

Counting characters in a text file
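
In text form, the start positions for that example line up like this (position 0 is the first character of the line):

Darren    Wrote     This post
^0        ^10       ^20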

We can use Power Query to clean up the data afterwards, so you could also split a character or two early (say at position 8 instead of 10), as long as you put the splits on whitespace instead of in the middle of the data. For the sake of this example, I already did the counting for you (just think of it as a very, very early Christmas present).

The columns for this example should be split at:

0, 13, 32, 60, 74.

Opening the source

We will start by opening Power BI Desktop, choosing the sample file (you’ll need to extract it from the zip archive because the server won’t let me upload a PRN file), and editing the data in the Query Editor.

Open Power BI Desktop and choose Get Data

Choose More to get to the text options

Choose Text and then Connect

Open the file and choose Edit

All of that work will be worth it soon…

Next, we’ll split the columns using the wizard and the Advanced Editor. This is probably the most value-added feature that we’ll look at, especially if you have more than a few columns.

Choose Split Column and then By Number of Characters

The first one we will do with the wizard, and then I will show you a fancy trick to split as many columns as you want and rename the headers at the same time. Enter 13, choose Once, as far left as possible, and choose OK.

Power Query automatically tries to figure out the data type, which is really handy in most cases, but not this case.

Choose the X next to Changed Type to remove that step

Advanced Mode… Activate!

This is the fun part that will save you a lot of mouse clicks in the future.

Open the Advanced Editor

Find the line called #”Split Column by Position” and copy the entire line

It should look like this:

#"Split Column by Position" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByPositions({0, 13}, false), {"Column1.1", "Column1.2"})

A few things to note —

  • The 0 is already included (thanks Microsoft!)
  • The column names are pretty worthless
  • With the default wizard, you would have to repeat this step over and over again until you split all of the columns

Remember that earlier we had figured out the column splits (0, 13, 32, 60, 74)? All of that hard work will now come in handy.

We want to modify the code so that it looks like this (the parts we changed are the position list and the column names):

#"Split Column by Position" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByPositions({0, 13, 32, 60, 74}, false), {"Account", "Description", "Language", "Debit", "Credit"})

Now, close the Advanced Editor by choosing Done and you should have 5 split columns with the headers that we want!

Totally worth it!

You just learned how to split columns by number of characters, rename multiple columns all at once, and use the Advanced Editor, and you’ll never have to use that Text to Columns wizard again!

No more Text to Columns!

Let’s keep going!

One of the main annoyances of fixed-width files is that they repeat the header information. Really useful for static reports, not so useful for data. We’ll take care of that now by adding a Custom Column and revisiting the Advanced Editor.

Go to the Add Column tab and choose Add Custom Column

More M language magic

In the Add Custom Column window, we are going to use the following code:

if Text.StartsWith([Account], "Division:") = true then Text.Range([Account], Text.PositionOf([Account], " ")) else null

In plain English, this code says, “If the value in the Account column begins with ‘Division:’ then return the text from the first space onward. Otherwise, return null.”

Here is the result:

Results of the Custom Column
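
If you peek at the Advanced Editor again at this point, the new step should look something like the line below (the step and column names shown here are the wizard defaults and may differ slightly in your file):

#"Added Custom" = Table.AddColumn(#"Split Column by Position", "Custom", each if Text.StartsWith([Account], "Division:") = true then Text.Range([Account], Text.PositionOf([Account], " ")) else null)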

You can use the same logic with the Page or the Location; you just need to change where it says

Text.StartsWith([Account], "Division:")

To

Text.StartsWith([Account], "Location:")

Here are the results (if you go this route, you might want to change the column header):

Results of the Location change

Three questions might be popping into your head:

  • What if there isn’t a space? Use a colon or some other character (or group of characters) instead.

if Text.StartsWith([Account], "Division:") = true then Text.Range([Account], Text.PositionOf([Account], "put the symbol here and keep the double-quotes")) else null

  • Why nulls? Because of the next step which is really powerful.
  • This seems like a lot of work. Is it really worth it? If you ever have to do this report twice, it is worth it. You’re also learning a lot of skills that may be valuable to you in the future. A good data monkey is good to have anywhere.

Time to use the null

As far as I can tell, you need a null to use Fill Down. Fortunately, we took care of that with the last step. Next, we will Fill Down the Division.

Go to the Transform tab, choose Fill and then Down

This transformation will fill in the division to the bottom of the data. Here are the results:

Results of filling down
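
For the curious, the Fill Down button generates a single line of M, roughly like this (assuming the custom column from the previous step kept its default name, Custom):

#"Filled Down" = Table.FillDown(#"Added Custom", {"Custom"})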

Get rid of useless data

It’s time to finally remove all of the formatting and whitespace. Luckily, every transaction has an account number next to it that starts with 1. We will use that to filter the data.

Choose the Filter button next to Account, Text Filters, and then Begins With…

Enter 1 and choose OK

Here are the (beautiful) results:

Beautiful results of filtering
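
In M, that text filter becomes a Table.SelectRows step, something like this (the previous step name will depend on your query):

#"Filtered Rows" = Table.SelectRows(#"Filled Down", each Text.StartsWith([Account], "1"))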

Last steps before analysis. You’re almost there!

Now, we just need to put on the finishing touches. After changing the types and combining the debits/credits, we should be ready for analysis.

Changing types

Select both the Debit and Credit columns. Choose Data Type: Text and then choose Decimal Number.

Using the Add function for a new column

Go to the Add Column tab, choose Standard, and then Add. You’ll have a new column called Sum. Feel free to delete the Debit and Credit columns.

You can multiply the new Sum column by -1, so that Sales Revenue is positive (instead of a negative to represent the credit balance).

Making Credits positive and Debits negative

Go to the Transform tab, select the Sum column, choose Standard, and then Multiply.

Multiply new column by -1

Type in -1 and click OK
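
If you would rather see (or reuse) these finishing touches as code, the three steps generate M roughly like the lines below. The step names are the defaults Power Query picks and may differ slightly in your file:

#"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows", {{"Debit", type number}, {"Credit", type number}}),
#"Inserted Sum" = Table.AddColumn(#"Changed Type1", "Sum", each List.Sum({[Debit], [Credit]}), type number),
#"Multiplied Column" = Table.TransformColumns(#"Inserted Sum", {{"Sum", each _ * -1, type number}})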

Sweet, sweet victory!

Close and Apply!

You made it! Close & Apply from the Home tab.

Well… Almost!

If you want to move up in your analytical career, you should learn this mantra: reporting the numbers is not enough. After you get that fancy, beautiful report and walk up to the manager to say, “Look how pretty this is!”, the first thing he or she will probably ask is, “OK, what does it tell me?”

Be ready to answer that question BEFORE you ever show someone the report. You don’t have to find some amazing insight that no one has ever thought of before, but you should at least have one or two basic results that the report lets you know about.

The good news is that all of our book categories have revenue.

Good News - People are buying our books

The bad news is that the random number generator I used to generate the revenue and returns shows that more books were returned than sold! For even more credit with your manager, try to come up with a couple of suggestions (but that’s for another guest post).

No profit in this business

Final Tips

  • Power BI Desktop and Power Query have the same advanced editor (and engine). If you ever need to reuse this code, just go into the advanced editor, copy the entire code, and paste it into the Power Query advanced editor.
  • Copy and paste the code from the advanced editor into OneNote, a text editor, or maybe even GitHub! This will allow you to share the code and create a backup in case the file gets corrupted.
  • Instead of saving this report out of your legacy system every time you want to use the report, work with IT to have it automatically saved in a network folder as a Month-to-Date or Quarter-to-Date file. That way, you can look at trends in the data just by refreshing!
  • If you have too much white space before or after the data, you can use Format > Trim in Power Query from the Transform tab to clean it up.

Using Trim

Thank you for reading!


Darren Leitzke

Darren is a Financial Analyst using Power BI to drive financial results and trying to convince everyone who will listen that it is the future of reporting.

This Post Has 12 Comments

  1. Hi Darren, thank you for the article.

    I have had many problems importing .txt files into PQ: even though I specify that it is a text file, PQ then sets “Open File As: CSV”. The same thing happened when I tried to import your file:

    View post on imgur.com

    Sometimes there is no difference, but other times I lose lots of rows or there is a significant difference in how it automatically applies the delimiter. Have you had this problem?

    Kind regards,

    Fernando

  2. Hi Fernando,

    I have not seen that issue, but you should be able to change it to Open File As Text File and then 1252: Western European (Windows).

    Kind Regards,

    Darren

  3. The split column by position is very useful to be able to do in one instruction. However, this only worked when the splits increased (as they do in your example); when I used a split that decreased, it did not work – i.e. I used 0, 15, 20, 9, 16 – and the query fell over due to the split at 9.
    Do you know if there is a way to overcome this?

    1. Hi Mark, I think the way to think about this is that the difference between consecutive numbers equals the width of each column, so the positions have to increase.

      In my example, 0, 13, 32, 60, 74 are the numbers I used. Here is what PowerQuery is doing:

      1) Split Column A starting at 13, create column B
      2) Split Column B starting at 19 (32-13), create column C
      3) Split Column C starting at 28 (60-32), create column D
      4) continue process

      In your example, it wouldn’t make sense to do -11 (9-20). Ultimately, you are taking one column and splitting it based on that first column’s spacing.

      Do you have an example that you are working on?

      Kind Regards,

      Darren

  4. Great info on using multiple columns in the splitter function. One question for that:

    Does anyone know if it is possible to import just a few columns / fixed widths ranges with room in between them?

    We have some text files that are 6000 or more characters wide and a few million lines long. They store a few hundred columns.
    We currently use a tool that is outdated but can do things like “import positions 12:20 as customer, 21:28 as country, 34:40 as employee, 240:246 as amount, etc.” without caring about the data between those specified columns. While Power Query always gets the whole line first and then splits it up continuously, it would be better to just choose the columns you need.

    It would be possible to write the few hundred columns down, but when you only need a handful, this seems like wasted time.

    1. Hi Frank — The only two ways I can think of doing that would be:

      1) Do the splits as usual and remove the extra columns
      2) Split out the first column using a text extraction function (such as MID() in Excel) and just change the starting point/number of characters for each new column

      -Darren

  5. “Last steps before analysis” doesn’t appear to work anymore. The Standard Add step creates = Table.AddColumn(#"Changed Type", "Inserted Addition", each [Debit] + [Credit], type number) with no Sum column, and only returns a column of nulls.
