Blog Post

Include File Name in Content Using Power Query

,

When importing data from a flat file using Power Query, you’d want to combine multiple files and include file name in the resulting table. For example, in Analyzing Baby Names using Power BI I posted recently, the file name contains year that I needed to include in the table. Here’s how you do that:

Step 1: Create the query for one file

First, you need to create the query to pull data from one flat file. To do this, click on “Get Data”, select “CSV”, and browse to the file. If applicable, select “Use First Row as Headers” in home tab and/or rename columns to use meaningful names. You can rename columns by double-clicking on the column and giving a desired name.

At this point the table should appear as shown below:

get data from file using power query

And the query should be similar to the one shown as follows:

query editor get data from file

Step 2: Convert the query to function

Then, convert the above query to a function. Simply add a new line at the beginning which accepts an input argument called “myfile”, and replace the file name including the double quotes with the input argument. Rename the query, I chose ReadFileContents, and click “Done”.

By doing this, we created a function that accepts file name as input and returns the file contents.

convert to function power query

Step 3: Add new query pointing to the folder

Now, create another query and “Get Data” from a folder as shown below. This’ll automatically combine data from files in that folder.

select from folder power query

In the “Folder Path”, specify the folder that contains your files and click “Ok”.

from folder path

The table should look like the one shown below with file properties as columns.

from folder power query

Binary in the Content column contains the actual file contents. Resist the urge to click here just yet. Instead, remove all columns other than Name and Folder Path. To do this, Select both these columns at the same time, and select “Remove Columns” and click “Remove Other Columns” from the drop-down, as shown in the following image.

removing columns power query

Step 4: Add Custom column using the function

Now let’s add a new column by selecting “Add Column” and “Add Custom Column”. In the formula window, enter “= ReadFileContents([Folder Path]&[Name])”. We’re essentially calling the function – once for each file – we created earlier while passing concatenated folder path and name.

using function in custom column power query

At this point, the table should appear as shown below.

added custom column power query

Step 5: Expand

We’re almost done. Click on the “Expand” icon (arrows that apparently hate each other, why else would they be not looking into each other’s eyes) towards the right of the “Custom” column. This will expand the contents of the Custom column and display all columns as shown below.

final table with file name power query

Step 6: Final Cleanup

Finally, you can optionally add additional steps to clean the table such as removing “Folder Path” if it’s not required anymore and formatting “Name” column as desired. I used “Replace Text” to get rid of “yob” and “.txt” and include only the “Year”.

Summary

This is how you can combine files along with file name in Power Query. This will be useful in several occasions where the file name – either in full or in part – contains key dimensional information such as dates, locations, employees, etc.,that’s relevant in your analysis.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating