Query Folding in Power Query to Improve Performance

By:   |   Comments (11)   |   Related: > Microsoft Excel Integration


Problem

Power Query allows you to extract and manipulate data from various sources. When you define transformations on the data, it is possible that those transformations are sent back to the source to improve performance. This feature is called query folding and it’s very important for in Power Query. In this tip we will discuss query folding, how you can take advantage from it and how to make sure query folding takes place.

Solution

Power Query is a lightweight ETL-like tool inside the Power BI stack. You can find it in multiple software applications:

  • As the Query Editor in Power BI Desktop
  • As the Get Data editor in Excel (previously Power Query was a seperate add-in for Excel)
  • As the Modern Get Data experience in Analysis Services Tabular (since SQL Server 2017)

Most of the screenshots in this tip are made in Excel, however all functionality is very similar between those applications.

Depending on the source, Power Query can send some of the transformations to the source. This means that those transformations take place at the source, before any data is sent to Power Query. This is a big performance improvement, since Power Query has to import less data. This process is called query folding.

However, not all sources support query folding – it’s impossible to push transformations to a flat file for example – and not all transformations can be sent back to the source. For example, filtering rows can be sent back to SQL Server (in the form of a WHERE clause), but a transformation in Power Query that removes the 5 bottom rows cannot.

There isn’t yet an official list of sources that support query folding, but I’ve composed the following list from various resources:

  • Relational sources (SQL Server, Oracle, …). They support most Power Query functionality.
  • OData sources (such as a SharePoint list for example and the Azure Marketplace)
  • Active Directory
  • Exchange
  • HDFS, Folder.Files and Folder.Contents (for basic operations on paths)

There also isn’t documentation available for transformations that can fold back, but these transformations can when using the right source type:

  • Filtering (on rows or columns)
  • Joins
  • Aggregates and GROUP BY
  • Pivot and unpivot
  • Numeric calculations
  • Simple transformations, such as UPPER
  • ...

Using Query Folding

The user interface in Power Query doesn’t tell you if Power Query uses query folding or not. This means you have to use a monitoring tool to capture the statements sent to the source. Let’s it illustrate with a case where we read the data from SQL Server and monitor the queries with SQL Server Profiler.

We’re reading data from the FactInternetSales table from the AdventureWorksDW2012 sample database. The first step is getting the CategoryName into the table. We can do this by expanding the DimProduct column and retrieving the DimSubCategory column.

Expanding Product

Next we expand this newly retrieved column and get the DimProductCategory column. Then we repeat this and retrieve the EnglishProductCategoryName column. The applied steps in the query settings pane should look like this:

Applied steps

Now we do the same with the DimDate (OrderDateKey) column to retrieve the CalendarYear column from the DimDate dimension. Then we filter this column for the values 2007 and 2008. It’s possible the filter menu only shows the value 2005 because the data is sampled.

Filter on year

In that case, click on "Load more" to sample more rows.

Next we are going to choose the columns we want to keep. Click Choose Columns in the home ribbon and select the columns SalesAmount, OrderQuantity, CalendarYear and EnglishProductCategoryName.

Choose columns

The next step is to group by the columns CalendarYear and CategoryName and aggregate on SalesAmount and OrderQuantity. This can be done with the Group By transformation which can be found in the Transform ribbon.

Group By

Furthermore, we are keeping only rows where the sum of orderquantity is greater than 5000.

Filter Aggregates

As the last step, we are going to change the ProductCategory to uppercase.

Set to uppercase

The Applied Steps should now look like this:

Applied Steps

When we load the data into Excel, a query is sent to SQL Server to fetch the data and we can monitor this with SQL Server Profiler. A lot of queries will pop-up, most of them to retrieve metadata, but one query is responsible for fetching the data in our example.

Profiler

The query looks like this:

select [_].[CalendarYear] as [DimDate(OrderDateKey).CalendarYear],
    upper([_].[EnglishProductCategoryName]) as [DimProduct.DimProductSubcategory.DimProductCategory.EnglishProductCategoryName],
    [_].[SalesAmount2] as [SalesAmount],
    [_].[OrderQuantity2] as [OrderQuantity]
from 
(
    select [_].[CalendarYear],
        [_].[EnglishProductCategoryName],
        [_].[SalesAmount2],
        [_].[OrderQuantity2]
    from 
    (
        select [rows].[CalendarYear] as [CalendarYear],
            [rows].[EnglishProductCategoryName] as [EnglishProductCategoryName],
            sum([rows].[SalesAmount]) as [SalesAmount2],
            sum([rows].[OrderQuantity]) as [OrderQuantity2]
        from 
        (
            select [_].[OrderQuantity],
                [_].[SalesAmount],
                [_].[EnglishProductCategoryName],
                [_].[CalendarYear]
            from 
            (
                select [$Outer].[OrderQuantity],
                    [$Outer].[SalesAmount],
                    [$Outer].[EnglishProductCategoryName],
                    [$Inner].[CalendarYear]
                from 
                (
                    select [$Outer].[ProductKey2],
                        [$Outer].[OrderDateKey],
                        [$Outer].[DueDateKey],
                        [$Outer].[ShipDateKey],
                        [$Outer].[CustomerKey],
                        [$Outer].[PromotionKey],
                        [$Outer].[CurrencyKey],
                        [$Outer].[SalesTerritoryKey],
                        [$Outer].[SalesOrderNumber],
                        [$Outer].[SalesOrderLineNumber],
                        [$Outer].[RevisionNumber],
                        [$Outer].[OrderQuantity],
                        [$Outer].[UnitPrice],
                        [$Outer].[ExtendedAmount],
                        [$Outer].[UnitPriceDiscountPct],
                        [$Outer].[DiscountAmount],
                        [$Outer].[ProductStandardCost],
                        [$Outer].[TotalProductCost],
                        [$Outer].[SalesAmount],
                        [$Outer].[TaxAmt],
                        [$Outer].[Freight],
                        [$Outer].[CarrierTrackingNumber],
                        [$Outer].[CustomerPONumber],
                        [$Outer].[OrderDate],
                        [$Outer].[DueDate],
                        [$Outer].[ShipDate],
                        [$Inner].[EnglishProductCategoryName]
                    from 
                    (
                        select [$Outer].[ProductKey2] as [ProductKey2],
                            [$Outer].[OrderDateKey] as [OrderDateKey],
                            [$Outer].[DueDateKey] as [DueDateKey],
                            [$Outer].[ShipDateKey] as [ShipDateKey],
                            [$Outer].[CustomerKey] as [CustomerKey],
                            [$Outer].[PromotionKey] as [PromotionKey],
                            [$Outer].[CurrencyKey] as [CurrencyKey],
                            [$Outer].[SalesTerritoryKey] as [SalesTerritoryKey],
                            [$Outer].[SalesOrderNumber] as [SalesOrderNumber],
                            [$Outer].[SalesOrderLineNumber] as [SalesOrderLineNumber],
                            [$Outer].[RevisionNumber] as [RevisionNumber],
                            [$Outer].[OrderQuantity] as [OrderQuantity],
                            [$Outer].[UnitPrice] as [UnitPrice],
                            [$Outer].[ExtendedAmount] as [ExtendedAmount],
                            [$Outer].[UnitPriceDiscountPct] as [UnitPriceDiscountPct],
                            [$Outer].[DiscountAmount] as [DiscountAmount],
                            [$Outer].[ProductStandardCost] as [ProductStandardCost],
                            [$Outer].[TotalProductCost] as [TotalProductCost],
                            [$Outer].[SalesAmount] as [SalesAmount],
                            [$Outer].[TaxAmt] as [TaxAmt],
                            [$Outer].[Freight] as [Freight],
                            [$Outer].[CarrierTrackingNumber] as [CarrierTrackingNumber],
                            [$Outer].[CustomerPONumber] as [CustomerPONumber],
                            [$Outer].[OrderDate] as [OrderDate],
                            [$Outer].[DueDate] as [DueDate],
                            [$Outer].[ShipDate] as [ShipDate],
                            [$Inner].[ProductCategoryKey] as [ProductCategoryKey2]
                        from 
                        (
                            select [$Outer].[ProductKey2] as [ProductKey2],
                                [$Outer].[OrderDateKey] as [OrderDateKey],
                                [$Outer].[DueDateKey] as [DueDateKey],
                                [$Outer].[ShipDateKey] as [ShipDateKey],
                                [$Outer].[CustomerKey] as [CustomerKey],
                                [$Outer].[PromotionKey] as [PromotionKey],
                                [$Outer].[CurrencyKey] as [CurrencyKey],
                                [$Outer].[SalesTerritoryKey] as [SalesTerritoryKey],
                                [$Outer].[SalesOrderNumber] as [SalesOrderNumber],
                                [$Outer].[SalesOrderLineNumber] as [SalesOrderLineNumber],
                                [$Outer].[RevisionNumber] as [RevisionNumber],
                                [$Outer].[OrderQuantity] as [OrderQuantity],
                                [$Outer].[UnitPrice] as [UnitPrice],
                                [$Outer].[ExtendedAmount] as [ExtendedAmount],
                                [$Outer].[UnitPriceDiscountPct] as [UnitPriceDiscountPct],
                                [$Outer].[DiscountAmount] as [DiscountAmount],
                                [$Outer].[ProductStandardCost] as [ProductStandardCost],
                                [$Outer].[TotalProductCost] as [TotalProductCost],
                                [$Outer].[SalesAmount] as [SalesAmount],
                                [$Outer].[TaxAmt] as [TaxAmt],
                                [$Outer].[Freight] as [Freight],
                                [$Outer].[CarrierTrackingNumber] as [CarrierTrackingNumber],
                                [$Outer].[CustomerPONumber] as [CustomerPONumber],
                                [$Outer].[OrderDate] as [OrderDate],
                                [$Outer].[DueDate] as [DueDate],
                                [$Outer].[ShipDate] as [ShipDate],
                                [$Inner].[ProductSubcategoryKey] as [ProductSubcategoryKey2]
                            from 
                            (
                                select [_].[ProductKey] as [ProductKey2],
                                    [_].[OrderDateKey] as [OrderDateKey],
                                    [_].[DueDateKey] as [DueDateKey],
                                    [_].[ShipDateKey] as [ShipDateKey],
                                    [_].[CustomerKey] as [CustomerKey],
                                    [_].[PromotionKey] as [PromotionKey],
                                    [_].[CurrencyKey] as [CurrencyKey],
                                    [_].[SalesTerritoryKey] as [SalesTerritoryKey],
                                    [_].[SalesOrderNumber] as [SalesOrderNumber],
                                    [_].[SalesOrderLineNumber] as [SalesOrderLineNumber],
                                    [_].[RevisionNumber] as [RevisionNumber],
                                    [_].[OrderQuantity] as [OrderQuantity],
                                    [_].[UnitPrice] as [UnitPrice],
                                    [_].[ExtendedAmount] as [ExtendedAmount],
                                    [_].[UnitPriceDiscountPct] as [UnitPriceDiscountPct],
                                    [_].[DiscountAmount] as [DiscountAmount],
                                    [_].[ProductStandardCost] as [ProductStandardCost],
                                    [_].[TotalProductCost] as [TotalProductCost],
                                    [_].[SalesAmount] as [SalesAmount],
                                    [_].[TaxAmt] as [TaxAmt],
                                    [_].[Freight] as [Freight],
                                    [_].[CarrierTrackingNumber] as [CarrierTrackingNumber],
                                    [_].[CustomerPONumber] as [CustomerPONumber],
                                    [_].[OrderDate] as [OrderDate],
                                    [_].[DueDate] as [DueDate],
                                    [_].[ShipDate] as [ShipDate]
                                from [dbo].[FactInternetSales] as [_]
                            ) as [$Outer]
                            left outer join [dbo].[DimProduct] as [$Inner] on ([$Outer].[ProductKey2] = [$Inner].[ProductKey])
                        ) as [$Outer]
                        left outer join [dbo].[DimProductSubcategory] as [$Inner] on ([$Outer].[ProductSubcategoryKey2] = [$Inner].[ProductSubcategoryKey])
                    ) as [$Outer]
                    left outer join [dbo].[DimProductCategory] as [$Inner] on ([$Outer].[ProductCategoryKey2] = [$Inner].[ProductCategoryKey])
                ) as [$Outer]
                left outer join [dbo].[DimDate] as [$Inner] on ([$Outer].[OrderDateKey] = [$Inner].[DateKey])
            ) as [_]
            where [_].[CalendarYear] = 2007 and [_].[CalendarYear] is not null or [_].[CalendarYear] = 2008 and [_].[CalendarYear] is not null
        ) as [rows]
        group by [CalendarYear],
            [EnglishProductCategoryName]
    ) as [_]
    where [_].[OrderQuantity2] > 5000
) as [_]

It’s a bit unwieldy and in fact a lot of columns of the inner subqueries are unnecessary because we filtered them out in a later step. But you can see that all of the transformations are present in the query:

  • The final SELECT only selects the columns we need
  • Expanding to the ProductCategory and Date dimension was translated into LEFT OUTER JOINS
  • The function UPPER is used on the ProductCategory column
  • The filters on CalendarYear on OrderQuantity are implemented as WHERE clauses
  • The Group By obviously became a GROUP BY in SQL and the corresponding aggregates are calculated with SUM of course

Preventing Query Folding

Some actions can prevent query folding from taking place. Sometimes it’s even possible you want to prevent query folding for some reason. These are the most common reasons why query folding isn’t taking place:

  • Using a source that doesn’t support query folding.
  • Using the .Buffer() M function, either on a list or on a table. This function reads all data into Power Query.
  • Using a custom SQL statement. If you write your own SQL statement to fetch the data, any subsequent transformation on that data will not use query folding.
  • Some transformations. For example, filtering with the date filter “This Month” prevents query folding as in the current release there is no SQL equivalent filter implemented yet. This might change though in future releases.
  • Some privacy level settings can prevent query folding.
  • Using “Removing rows with errors” prevents query folding.
  • Defining and using your own functions in Power Query can also prevent query folding.

Let’s illustrate the date filter problem with an example. When reading the FactInternetSales table, expand the DimDate (OrderDateKey) to retrieve the FullDateAlternateKey. Filter this column for This Month.

Date filter

Next remove all columns except OrderQuantity and FullDateAlternateKey. The Applied Steps look like this:

Applied Steps

When we take a look at Profiler, we can see that no transformations were pushed down to SQL Server, even though removing columns can be folded back.

Full query in profiler

Power Query will read the entire fact table and then perform the transformations inside its own engine.

It’s important you try to do steps first where query folding can take place and that you put steps that prevent query folding as late in the chain as possible, in order to maximize performance. If one of your first steps prevent query folding, all the subsequent steps will not be folded back.

Update - Native Query

In the most recent versions of Power Query, the option has been added to view the "native query". This means you can view for example the generated SQL statement without running a trace as showed in the previous sections. Checking if query folding takes place is simply done by right-clicking on a step in the query editor and by verifying if the "view native query" option isn't greyed out.

view native query

If "View Native Query" is greyed out, it means query folding isn't taking place for that step. The tip Power BI Native Query and Query Folding goes deeper into this concept.

Conclusion

In this tip we introduced the concept of query folding in Power Query and how important it is for performance. We showed when it can take place and what possible limitations there are. Important to remember is putting steps that can be folded back to the source first in the list of applied steps. Steps that prevent query folding should be applied as late as possible.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Koen Verbeeck Koen Verbeeck is a seasoned business intelligence consultant at AE. He has over a decade of experience with the Microsoft Data Platform in numerous industries. He holds several certifications and is a prolific writer contributing content about SSIS, ADF, SSAS, SSRS, MDS, Power BI, Snowflake and Azure services. He has spoken at PASS, SQLBits, dataMinds Connect and delivers webinars on MSSQLTips.com. Koen has been awarded the Microsoft MVP data platform award for many years.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, December 14, 2020 - 9:23:20 AM - Koen Verbeeck Back To Top (87909)
Hi Annie,

I'm not sure. Your question is not related to the content of this tip, so perhaps posting your question in the Power BI forums might get you the answer you need?

Regards,
Koen

Monday, December 14, 2020 - 2:14:51 AM - Annie Keshvari Back To Top (87907)
Can I do incremental load for excel files in power bi

Wednesday, March 27, 2019 - 4:58:34 AM - Koen Verbeeck Back To Top (79401)

Hi Jon,

the joins, are those created in the Power Query editor itself? Meaning you are merging two queries and choose inner join as join type?

Regards,
Koen


Friday, March 15, 2019 - 5:32:07 AM - Jon Back To Top (79303)

 Hi,

For query folding, does it work for all join types. I've found that only left outer joins query folde, if it is an inner join it disables it?


Thursday, January 25, 2018 - 11:32:14 AM - edh Back To Top (75037)

Great article - appreciated. I know this was written in 2015, but the Power Query interface does now tell you when a query will be sent back to the server using query folding and when it doesn't. In the steps, just right-click on a step and select "View Native Query." If it is grayed out, then that step isn't supported, and you have to move up steps unti it is no longer grayed out and will show you the actual query.

This has been usefulf or me as I'll rearrange steps to put as much back on the SQL server as possible, and makes it clear what is and isn't supported.


Wednesday, January 25, 2017 - 2:17:28 AM - Koen Verbeeck Back To Top (45562)

Hi Franz,

thanks for your comment! I'm very glad to know the article had such a positive impact.

Regards,
Koen


Tuesday, January 24, 2017 - 12:51:55 PM - FranzV Back To Top (45549)

Thank you very much.  Just reduced the load time of an SQL connection to a SAP database from 30 minutes to 30 seconds by directly accessing the database instead of using a custom function that saved me three lines of "M" code, but apparently prevented query folding.

 


Tuesday, September 13, 2016 - 3:25:15 AM - Koen Verbeeck Back To Top (43311)

Hi Vivek,

it's possible that merging the two queries resulted in a step that cannot be folded back.

Did you check the resources I linked to? They contain debugging steps where you can check the status of query folding.


Thursday, September 8, 2016 - 10:15:43 AM - Vivek Back To Top (43284)

 When I explicitly apply a filter in the power query step, I do see the filter applied in the MDX query via SQL Profiler. 

However, when I use merge queries i.e. a 2nd table has the values to filter by, I'm not quite seeing any indication that query folding is happenig. 

Any idea how I can confirm that query folding is happening, in the merge query scenario? 

 


Wednesday, August 31, 2016 - 2:57:31 PM - Koen Verbeeck Back To Top (43240)

Hi Vivek,

I quickly tested with Power BI Desktop and SSAS MD, and it seems query folding is supported.
I filtered on a column, and the filter was reflected in the MDX (captured with Profiler).

 

I have no idea though what exactly is supported. You can check the blog of Chris Webb for more info on query folding.
He has some nice posts on how to check if query folding is actually taking place:

https://blog.crossjoin.co.uk/2016/07/26/value-resourceexpression-and-query-folding-in-power-bi/

https://blog.crossjoin.co.uk/2016/08/02/another-way-to-check-query-folding-in-power-bipower-query-m-code/

Regards,
Koen


Monday, August 29, 2016 - 9:27:11 AM - Vivek Back To Top (43204)

 Will query folding occur when using SSAS as the source?

 















get free sql tips
agree to terms