Why should you concern yourself with how Power Query “thinks” about tables? After all, you write an expression that outputs the table you want, the mashup engine executes it and everyone is happy without you having to think about how the engine does its thing…right? Yes—at least until you encounter performance problems, values change during processing or a firewall error bites—then what do you do?
Understanding how M processes tables is an important asset in developing efficient mashups, avoiding unexpected data variability and keeping the data privacy layer happy. Streaming, query folding, buffering, table keys, native query caching and the firewall—all of these relate to how the interpreter thinks about/processes/handles tables.
There’s so much to cover, we’ll split the list in two. Let’s tackle the first half (streaming, query folding and buffering) in this post and save the remainder (table keys, native query caching and the firewall) for next time.
Let’s get going!
Series Index
- Introduction, Simple Expressions & let (part 1)
- Functions: Defining (part 2)
- Functions: Function Values, Passing, Returning, Defining Inline, Recursion (part 3)
- Variables & Identifiers (part 4)
- Paradigm (part 5)
- Types—Intro & Text (Strings) (part 6)
- Types—Numbers (part 7)
- Types—The Temporal Family (part 8)
- Types—Logical, Null, Binary (part 9)
- Types—List, Record (part 10)
- Tables—Syntax (part 11)
- Tables—Table Think I (part 12) [this post]
- Tables—Table Think II (part 13)
- Control Structure (part 14)
- Error Handling (part 15)
- Type System I – Basics (part 16)
- Type System II – Facets (part 17)
- Type System III – Custom Types (part 18)
- Type System IV – Ascription, Conformance and Equality’s “Strange” Behaviors (part 19)
- Metadata (part 20)
- Identifier Scope & Sections (part 21)
- Identifier Scope II – Controlling the Global Environment, Closures (part 22)
- Query Folding I (part 23)
- Query Folding II (part 24)
- Extending the Global Environment (part 25)
- More to come!
How Data Flows
To start, let’s make sure we have a firm grasp of streaming and query folding.
Imagine you are the mashup engine. How would you execute the below?
let
    Source = SomeDataSourceReturningATable,
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    Result = Table.FirstN(Filtered, 3)
in
    Result
A simple way would be to retrieve all the rows returned by SomeDataSourceReturningATable and save them in variable Source. Then, take the contents of that variable, figure out which rows pass the [Office] = "Chicago" test, and save those in variable Filtered. Lastly, grab the first three rows from Filtered, save them in Result and then return Result.
Logical? Yes. Efficient? No. Why not? For one, there’s resource usage: While at most three rows will be output, the system hosting the mashup engine must have enough capacity to store everything returned from the source (which could be billions of rows). Attempting to do this could lead to that system running out of resources (e.g. memory or disk space).
Thankfully, Power Query doesn’t handle table expressions in this simplistic way. Instead, M uses streaming, query folding or a combination of these two techniques. We learned about both back in Paradigm (part 5); in this post, we’ll try to hone our understanding by delving deeper into their details and walking through several examples. If the general ideas of what streaming and query folding are aren’t sharp in your mind, it’s probably worth jumping back to part 5 for a refresher before continuing on.
Streaming
Let’s say M executes the above expression using streaming….
When Result’s contents are requested, Table.FirstN in the Result step starts by asking the preceding step (Filtered) for one row of data. When Filtered’s Table.SelectRows receives this request, it turns around and asks step Source for a row of data, which Source provides. When Filtered’s SelectRows receives this row, it checks whether it passes Filtered’s [Office] = "Chicago" test. If so, SelectRows returns the row to Result; if not, it discards the row then requests another, repeating this process until it finds one that passes the test, which is returned to step Result. Once Result’s FirstN has received a row, it outputs that row then turns around and asks Filtered for a second row (because it’s looking for a total of three rows). Filtered’s SelectRows then picks back up where it left off with Source, asking for one row at a time until it finds another that passes the [Office] = "Chicago" test, which it then passes to Result’s FirstN, which then outputs it. Lastly, this process is repeated one more time to retrieve the third row FirstN needs to satisfy how it was programmed.
Each step produces rows one at a time, only requesting as many rows as it needs from the preceding step to produce the requested row. By working with just enough data (vs. the hypothetical simplistic approach we started with which stored the entire output of each step in memory), Power Query is able to handle data sets that are too large to be stored locally and doesn’t waste resources storing rows that ultimately are unnecessary to produce the requested output.
Internal In-Memory Row Storage
Excepting the data provider (which, for performance reasons, might fetch rows in chunks from the external data source), none of the operations in the preceding example held rows in memory. When a function processed a row, it either passed it on or discarded it.
However, this isn’t true for every operation. Let’s say we add a sort step to our example:
let
    Source = SomeDataSourceReturningATable,
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    Sorted = Table.Sort(Filtered, {{"TotalSales", Order.Descending}}),
    Result = Table.FirstN(Sorted, 3)
in
    Result
Sorting (generally) requires retrieving all rows from the previous step so that they can be put in proper order. When the above expression is executed, Result’s Table.FirstN asks Sorted’s Table.Sort for the first row. To figure out which row to return, Sort gets all rows from Filtered, sorts them, saves the sorted rows in memory, then returns the first row from the sorted set. Each time Sort is asked for a subsequent row, it returns the appropriate row from what it has in memory. After Sort returns a row, it will never need it again, so it can remove that row from its memory (whether or not it purges memory like this is an internal implementation detail—but at least at the theoretical level it’s allowed to do this).
This internal storing of rows in memory is not a persistent cache; rather, it is limited in scope to a single method invocation during a single execution of the query. There is no sharing of these held in-memory rowsets when a function is invoked multiple times, like Table.Sort is below (both List.Sum(Top3[TotalSales]) and List.Average(Top3[TotalSales]) end up calling it).
let
    Source = SomeDataSourceReturningATable,
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    Sorted = Table.Sort(Filtered, {{"TotalSales", Order.Descending}}),
    Top3 = Table.FirstN(Sorted, 3)
in
    { List.Sum(Top3[TotalSales]), List.Average(Top3[TotalSales]) }
“Which operations hold rows internally?” you might ask. Table-at-a-time operations, like joins (though not always nested joins), sorts, grouping and pivot/unpivot, are all suspects—and, obviously, buffering. (Unfortunately, other than for buffering, I’m not aware of documentation officially detailing this, so anecdotal evidence gathered from testing and answers to forum posts is what we have to go by.)
Above, the holding of rows was described as “in memory.” Keep in mind that memory can be paged to disk. Working with memory paged to disk is much, much, much slower than working with memory stored in RAM. The memory usage point that triggers paging to disk is environment specific. In some environments, paging starts when a query’s total memory use exceeds 256 MB.
Performance
The order of operations can have a significant impact on how much data must be kept in memory. To see this significance, let’s contrast two variations of an expression. Both produce the same output but can differ significantly in the local resources they use.
let
    Source = SomeDataSourceReturningATable,
    Sorted = Table.Sort(Source, {{"TotalSales", Order.Descending}}),
    Filtered = Table.SelectRows(Sorted, each [Office] = "Chicago"),
    Result = Table.FirstN(Filtered, 3)
in
    Result
let
    Source = SomeDataSourceReturningATable,
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    Sorted = Table.Sort(Filtered, {{"TotalSales", Order.Descending}}),
    Result = Table.FirstN(Sorted, 3)
in
    Result
With the first expression, sorting occurs directly after Source, so all rows from Source are held in memory by the sort function. The second expression sorts after the Table.SelectRows filter, so only rows that pass that filter are held by the sort. Say Source contains two billion rows, out of which only 500 pass the [Office] = "Chicago" test. With the first version of the expression, all two billion rows are held by the sort, while with the second, only 500 rows are held. What a difference!
Performance
Tip: When streaming is in play, if your query contains steps that hold rows in memory, try placing any applicable filter steps before the row-holding steps. This way, the filtering steps will reduce the quantity of data that needs to be held in memory.
Query Folding
Streaming can involve pulling lots of rows which are later discarded. In the example we’ve been using, it’s potentially necessary to stream billions of rows from the source to produce the three requested output rows. If, instead, you directly interacted with the source, you could probably tell it exactly what you wanted and it would produce just that.
If the source were a SQL database, you’d write something like:
SELECT TOP 3 *
FROM Customers
WHERE Office = 'Chicago'
ORDER BY TotalSales DESC;
Or if the source is an OData REST API, your request might look like:
GET /Customers?$filter=Office eq 'Chicago'&$orderby=TotalSales desc&$top=3
In either case, you get back at most 3 results. Potentially billions of rows aren’t sent to you for you to sort through to find the three you want; instead, that processing occurs on the external system (utilizing any indexing or caching it may have) and just the final results are sent back (much less data crossing the wire). It should be intuitively obvious which approach is more efficient. 🙂
Thankfully, M’s query folding offers the ability to leverage the performance of native queries without needing to write them yourself.
Quick recap: Query folding takes one or more Power Query steps and translates them into a native request which is then executed on the source system (again, for a refresher, jump back to part 5, if needed).
With query folding, if our source system is a SQL database, it’s almost as though the example:
let
    Source = SomeDataSourceReturningATable,
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    Sorted = Table.Sort(Filtered, {{"TotalSales", Order.Descending}}),
    Result = Table.FirstN(Sorted, 3)
in
    Result
is internally replaced by M’s interpreter with something like:
let
    Result = Value.NativeQuery(
        SomeDataSourceReturningATable,
        "SELECT TOP 3 * FROM Customers WHERE Office = 'Chicago' ORDER BY TotalSales DESC;"
    )
in
    Result
(Technically, the internal mechanism used may work a bit differently, but as far as producing rows go, the net effect is approximately the same.)
In the preceding example, the native query to use can be statically deduced simply by looking at the expression steps. Power Query’s query folding can also dynamically factor data in when it produces native requests.
For example, take the below expression, which filters data (MainData) pulled from source A using a list of values (FilterData) retrieved from source B.
let
    MainData = GetFromSourceA(),
    FilterData = GetFromSourceB(),
    Result = Table.SelectRows(MainData, each List.Contains(FilterData[ID], [ID]))
in
    Result
At first glance, the expression in step Result may not look like a candidate for query folding because it combines data from two sources. Instead, it may seem necessary for the mashup engine to retrieve all data from both sources, then apply the Table.SelectRows filter locally.
However, Power Query can pull data from one source and write that data into the native request it sends to another source. Say FilterData (from source B) contains only a few rows. Power Query might first pull those few rows locally, then decide to push data from those rows into the native request it sends to source A. For example, pretend FilterData contains three rows and those rows’ ID column values are 1, 2 and 3. Power Query’s query folding might execute step Result by first pulling those values from source B then sending a query like the following to source A:
SELECT *
FROM SomeTableInSourceA
WHERE ID IN (1, 2, 3); -- these values were pulled from FilterData (source B) then written into this query
The above query tells source A just which rows are needed, based on filtering data retrieved from source B. By doing this, it avoids fetching rows that ultimately would have been discarded if the Table.SelectRows filter were applied locally.
This pulling data from one source then pushing it to another can provide performance benefits and pose security concerns. Power Query’s environment has a mechanism for managing the latter—something which we’ll explore shortly.
Folding + Streaming
As we discussed in part 5, not all operations can be query folded. Once a non-foldable operation is encountered in an expression chain, any potentially foldable operations that come after it won’t be folded back into the data source that started the chain because the non-foldable operation blocks that folding.
Query folding does not eliminate streaming; rather it folds certain steps into a native request whose results are then streamed to any subsequent steps that weren’t query folded. To put it another way: With M, tables are always either streamed or query folded then streamed, never just query folded without then being streamed.
Performance
This leads to what may be an obvious performance tip: Try to put all foldable operations before any non-foldable operations so that the maximum amount of processing can be offloaded to the data sources.
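To illustrate, here’s a hedged sketch built on the running example. Which steps fold depends on the connector in play, but Table.AddIndexColumn is commonly reported as a step that stops folding, so the foldable filter is placed before it:

```
let
    Source = SomeDataSourceReturningATable,
    // Foldable step first: the source can apply this filter natively
    Filtered = Table.SelectRows(Source, each [Office] = "Chicago"),
    // Typically non-foldable: folding stops here, so only the
    // already-filtered rows need to stream to the mashup engine
    Indexed = Table.AddIndexColumn(Filtered, "RowNumber", 1, 1)
in
    Indexed
```

If the two steps were reversed, the index step would block folding of the filter, forcing every row from Source to stream locally just so most of them could be discarded.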
Which steps are query folded, as well as how they are folded, can change as the mashup engine improves, as the quantity of data involved changes, as library/data source functions are revised and as security settings are changed (more on the latter shortly). So, there may be times when you find it advantageous to re-try performance tuning even though you haven’t made any code changes.
Tables Are Not Immutable
A variable that appears to hold a table (or list) actually just holds a handle to the expression that produces the table (or list). When accessed, that handle executes logic which produces the requested data. While the handle is immutable throughout the lifetime of the query’s execution, the data returned when it is invoked is not, because that data is produced on demand each time the handle is invoked. The fact that the data returned is not immutable can result in values seeming to change during the execution of an M query.
The below expression returns a pair of tables. One holds all customers associated with the Chicago office; the other contains the three customers with the largest total sales amounts. Both tables are ultimately pulled from step Source.
let
    Source = SomeDataSourceReturningATable,
    ChicagoOffice = Table.SelectRows(Source, each [Office] = "Chicago"),
    Top3Sales = Table.FirstN(Table.Sort(Source, {{"TotalSales", Order.Descending}}), 3),
    Result = { ChicagoOffice, Top3Sales }
in
    Result
Let’s pretend you ran the above and are looking at the rows returned in the first table (ChicagoOffice). In them, you find customer ABC:
CustomerID = 123, Customer = ‘ABC’, Office = ‘Chicago’, TotalSales = 50255
Looking at the rows in the second table (Top3Sales), you also find customer ABC (apparently, it’s assigned to the Chicago office and is one of your top customers):
CustomerID = 123, Customer = ‘ABC’, Office = ‘Chicago’, TotalSales = 62199
Wait a minute! What in the world?! The same customer has a row in each table but the data in those rows is different between the two tables. (The value of TotalSales is different between the two.) How could this happen?
In light of the fact that table and list variables really just hold handles to the expression that produces the table or list, this behavior makes sense. Both ChicagoOffice and Top3Sales were invoked to produce results. When invoked, each expression chain called back to the ultimate data source to get data. Total sales for ABC must have changed between those two calls to the data source (perhaps another sale was processed during the intervening moment of time).
To recap: a variable that seems to “hold” (e.g. produce or output) a table or list really just holds an immutable reference to an expression that, when invoked, produces the desired output—the expression is immutable, but the data returned when it is invoked is not. So, when a particular query pulls from the same source multiple times during execution, there is the possibility that the data pulled could change between accesses.
If this possibility isn’t acceptable, there are two options: rework the expression to eliminate the multiple invocations (which may or may not be possible) or manually cache (buffer) the output.
Please don’t go away thinking that Power Query is flawed because it allows this variability. This potential for variability is a necessary side effect of M not always saving all rows in memory (and it’s a good thing it doesn’t do that!). Instead, M puts you in control: if there’s a point where a data set needs to be cached to provide stability, you have the power to do that. By putting you in charge, you control when the associated resource and performance costs are paid.
Buffering
When you need to stabilize a table (or list), how do you do it? By buffering.
BufferedTable = Table.Buffer(SomeExpressionProducingATable)
BufferedList = List.Buffer(SomeExpressionProducingAList)
The first time Table.Buffer is invoked, a buffer is loaded by reading all values from the source and saving them in memory. The data in memory is then used to service any accesses to that buffer that occur during the query’s execution (including when multiple method chains in the query reference the same buffer). However, buffers are not shared across separate executions of the same query.
Adapting our Chicago office + top sales example:
let
    Source = SomeDataSourceReturningATable,
    BufferedSource = Table.Buffer(Source),
    ChicagoOffice = Table.SelectRows(BufferedSource, each [Office] = "Chicago"),
    Top3Sales = Table.FirstN(Table.Sort(BufferedSource, {{"TotalSales", Order.Descending}}), 3),
    Result = { ChicagoOffice, Top3Sales }
in
    Result
Above, the first time step BufferedSource is accessed, Table.Buffer will pull all rows from Source and store them to memory. This stable snapshot of data will then be used to service both steps ChicagoOffice and Top3Sales. Both steps will be offered the exact same rows, with the exact same row values, in the exact same order. Then, when the query finishes executing, the buffer is discarded. If the query is later executed again, a new buffer will be populated from Source.
Of course, buffering all of Source could pose a resource problem, depending on how much data it contains. Again, the key is that you are in control: you decide when to use buffering and when the benefits it brings are worth the associated cost.
When you buffer, be conscious of how much data you’re buffering. Minimize this quantity where possible. Above, if Source is expected to return more than a small number of rows, see if there is any way to apply filtering before buffering. For example, if you know that all large customers have at least $50,000 in total sales, you could limit buffering to customers with sales above this amount or a Chicago office assignment by changing BufferedSource to:
BufferedSource = Table.Buffer(Table.SelectRows(Source, each [Office] = "Chicago" or [TotalSales] >= 50000))
A small change like this could significantly reduce the quantity buffered, making a vast difference on resource usage.
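The same idea applies to lists. Revisiting the earlier two-source example: if the engine ends up re-evaluating FilterData’s IDs while filtering MainData, buffering that (presumably small) list with List.Buffer pulls it once and reuses the in-memory copy. A sketch, using the hypothetical GetFromSourceA/GetFromSourceB functions from that example:

```
let
    MainData = GetFromSourceA(),
    // Pull the (small) filter list once and hold it in memory for reuse
    FilterIDs = List.Buffer(GetFromSourceB()[ID]),
    Result = Table.SelectRows(MainData, each List.Contains(FilterIDs, [ID]))
in
    Result
```

Note the trade-off: buffering can prevent the filter from being folded to source A, so it’s worth verifying which behavior is cheaper in your context.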
Performance Thoughts
Now that you hopefully have a solid understanding of streaming, query folding and buffering, let’s test that knowledge by pondering some performance tips. Do the tips below make sense (like why/how each impacts performance)? Does the order they’re in make sense (like why would it be unlikely that you would want to do step 3 before step 1)?
The following are not absolute rules but rather a suggested starting place for addressing performance issues. Please don’t blindly follow these but rather evaluate them in conjunction with an understanding of your context and how Power Query processes tables.
- Order steps that are query folded first. This offloads as much processing as possible to the external data sources. To determine which steps are being folded, in many cases, you’ll need to use a trace tool or check logs to see the native requests generated, as the UI doesn’t always reveal whether/how steps are being folded.
- Next, for those steps that cannot be folded, try for the following order: first filters, then operations that do not internally hold rows in memory, followed by any remaining operations. This way, you discard unneeded rows before performing additional processing or in-memory storage.
- When buffering is necessary, buffer late in the chain of expressions. Why pay the cost for buffering before that data really needs to be buffered?
- If the above steps don’t produce the desired performance or cost too much in terms of caching and/or buffering resources (e.g. you run out of memory!), hand-crafting a native query which you then have Power Query execute is an alternative. Power Query’s query folding supports a limited set of possibilities; you may be able to hand-code a native request that incorporates logic that folding doesn’t know how to produce and so achieve performance or eliminate buffering in a way that isn’t possible using automatic query folding.
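As a sketch of that last option: Value.NativeQuery lets you hand a native query directly to a source. The example below assumes a SQL Server source ("ServerName" and "DatabaseName" are placeholders); the optional third argument passes parameter values as a record so you don’t have to splice them into the query text yourself:

```
let
    Source = Sql.Database("ServerName", "DatabaseName"),
    // Hand-crafted native query; @Office is bound from the record below
    Result = Value.NativeQuery(
        Source,
        "SELECT TOP 3 * FROM Customers WHERE Office = @Office ORDER BY TotalSales DESC;",
        [Office = "Chicago"]
    )
in
    Result
```

Using parameters instead of concatenating values into the query text also sidesteps quoting/injection issues.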
Since data, library/data source function, security and environment changes can also affect performance, there may be times when you find it advantageous to re-try performance tuning even though you haven’t made any code changes.
Next Time
This post, we made good progress exploring how M “thinks” about tables at the language and library levels. Next time, we’ll continue by discussing table keys. Along with that, the plan is to explore two environmental factors, native query caching and the data protection layer/the firewall, that impact how M processes tables. We’ll also talk about why native queries may be executed more times than you’d expect.
Until then, have fun mashing up data!
You say this in the last section: “To determine which steps are being folded, in many cases, you’ll need to use a trace tool or check logs to see the native requests generated”.
Can you explain what trace tools to use and what logs to check?
If you’re using Power BI, you could try the new Query Diagnostics feature. Otherwise, try looking in the source system’s logs/trace tools. For example, if the source is Microsoft SQL Server, run a SQL Profiler trace or Extended Events session, then run your Power Query, then check what SQL queries were captured by the Profiler trace/Extended Events session. If the source is a web service, you could look in the web service’s logs or use a tool like Fiddler, Wireshark or Postman to intercept the calls between Power Query and the web service.
Can I make an entire query/let-in block to be buffered? Executed once, then the result reused, without querying the source again and executing all the steps? I mean, I want the table returned by a query to be stabilized, not just a step in a query. Is this possible? I am trying something like this:
Does not seem to work as I expect. Please advise.
Buffers are specific to a single execution. Say the above query is named A and is referenced from queries B and C. When those queries are executed, B (paired with A) will be one query execution and C (paired with A) will be another query execution. Since B+A and C+A are separate executions, buffers won’t be shared between them.
What if my source system is weak (poor query performance/bad indexes etc.)? What if I want to fully utilize the oodles of memory that Power BI has at its disposal? I may not want query folding to happen in such scenarios. I would want to switch query folding off or on, and to (easily) see the performance impact from either option. Maybe the advice to encourage query folding could be a bit more nuanced.
Really awesome stuff Ben!
“Which steps are query folded, as well as how they are folded, can change as the mashup engine improves, as the quantity of data involved changes, as library/data source functions are revised and as security settings are changed (more on the latter shortly). So, there may be times where you may find it advantageous to re-try performance tuning even though you haven’t made any code changes.”
Regarding changes in the engine: how do you plan to update the content? Will you add new content at the end or revisit existing sections?
Thanks for the kind words!
When I update a Primer section, I typically add a “Revision” note (with date) at the bottom of the page briefly describing the update made.