Last time, we began a deep dive into the inner workings of query folding. We examined how you can implement foldability using
Table.View, ending with a firm grasp on why answering the question “what functions fold?” isn’t simple, but rather depends on the Power Query version, the data connector and possibly even some combination of the particular operation’s parameters and the data set being accessed.
But this isn’t the only “moral of the lesson” to be gleaned from our query folding deep dive….
As part of processing an expression, do you think Power Query communicates just once with each external source? For that matter, does Power Query process your expression verbatim and exactly one time? On query folding: Is it guaranteed to be transparent, producing identical results regardless of whether an expression is processed locally by Power Query, partly folded to source or fully folded to source?
These questions, and their answers, will lead us to more morals to be learned from this continuation of our lesson on query folding!
Query folding, by now, is a concept you’re likely already familiar with. In short, Power Query’s query folding takes an M expression and translates part or all of it into the data source’s native query/request language (for example, translating M into T-SQL or into an OData request), then asks the source to execute that native request. You wrote M, but query folding offloaded some portion of your expression’s processing to the external data source by rewriting M logic into the source’s native language.
Query folding is a key concept in the Power Query ecosystem. It brings the potential for extremely significant performance benefits. Thanks to mechanisms like indexes and in-memory paging of data, the data source often can find the data of interest much more efficiently than when raw rows are streamed from the source to Power Query for local filtering. The source may also be able to perform other processing, such as aggregation and joins, again with much better performance than Power Query can locally. In addition to these benefits, offloading execution to the source usually reduces the quantity of data that needs to be sent across the wire to Power Query. For reasons such as these, query folding as much as possible is usually more efficient (and so quite desirable) in contrast to having Power Query internally handle all the processing itself.
Perhaps surprisingly, considering its importance, query folding is not part of the M language. You could write your own M mashup engine which is 100% compliant with the language specification without even knowing that query folding exists. How could such a key Power Query concept not be part—a prominent part!—of the language specification?
It doesn’t need to be.
Let’s take another look at query folding. This time, instead of focusing on what it is or why it is advantageous (topics we’ve touched on in the past—see parts 5 & 12), let’s explore how it works. We’ll do this by looking at the general concepts involved with how folding is implemented using
The knowledge we gain should help make folding much less mysterious—which should help us write better queries and debug query folding problems. As a bonus, if we ever decide to try custom connector development or feel the need to override (or augment) an existing connector’s folding, what we learn here should serve as a useful starting place.
Let’s get going!
As we learned last time, normally, M code is evaluated in a global identifier resolution scope consisting of all shared members + the standard library. Also, normally, we can’t inject additional identifiers into this global environment. Normally isn’t always. Today, we learn about the exception: where both of these normalities do not apply.
That’s not all: Did you know that M has a mechanism for remembering how to access variables that later go out of scope? Closures open up powerful options, particularly when generating functions…and even enable building an object-like programmatic construct that maintains internal private state and is interacted with through a public interface (kind-of, sort-of somewhat like an object from object-oriented programming!).