RowExpression.From/ItemExpression.From


Power Query’s RowExpression.From/ItemExpression.From function (both names reference the same underlying function) provides a way to learn about what a single-parameter function does by outputting an abstract syntax tree (AST) describing it.

Why might you want to use a programmatic structure like an AST to analyze the logic of a function’s body instead of simply invoking the function?

Well, one reason may be that you are implementing query folding in a custom connector. You might want to translate the filter predicate function passed to Table.SelectRows, or the generator function passed to Table.AddColumn, into the upstream data source’s native query/language. In either case, you don’t want to invoke the passed-in function; instead, you want to understand its behavior so that you can factor it in as you build an equivalent native request/query. RowExpression.From/ItemExpression.From is tailored for this purpose.

Unfortunately, this function is little documented—but it is time for that to change!

(Note: For simplicity, the below will refer to this function by the name RowExpression.From. However, ItemExpression.From is an equally valid way to reference the function.)

Basics

Input

Intended input: A single-parameter function. The parameter’s name does not matter.

Note: Arguably, RowExpression.From should error when used with anything other than a one-parameter function. However, it doesn’t. Instead, it renders out a significantly less useful AST that does not fully comply with what is described below.

Output

An abstract syntax tree (AST) describing the passed-in function’s logic. This AST will be simplified and normalized (more on this in a bit).

Exception: If RowExpression.From is asked to produce an AST for a function that exceeds its capabilities, it will raise an error. This will occur if the function contains logic that cannot be represented using the set of node types supported by RowExpression.From. For example, attempting to use RowExpression.From to get an AST for “each …” will result in RowExpression.From dying with an error because it does not support a node type that represents the raising of an error.

Each AST node is represented as a record which will always have a field named Kind which identifies the node’s type. The specific kind determines which other fields are included in the record.

AST Node Types Supported by RowExpression.From/ItemExpression.From

Constant

RowExpression.From(each 123)

Output:

[
  Kind = "Constant",
  Value = 123
]

In this case, the node type is Constant. Nodes of this type have just one other field, Value, which contains the constant value represented by the node.

The AST above tells you that the function’s entire body (i.e. what it would return if it were invoked) simply evaluates to the value 123.

Field Access

RowExpression.From(each [ID])

Output:

[
  Kind = "FieldAccess",
  MemberName = "ID",
  Expression = RowExpression.Row / ItemExpression.Item
]

Here, the function is defined as accessing field ID on the record that is passed in as its single parameter.

In the resulting AST node, the name of the field being accessed is found in field MemberName.

Expression is interesting. It contains an AST node representing the expression that should produce the record on which the specified field is to be accessed (in this case, that record is to be produced by invoking a function).

In the normalized ASTs produced by RowExpression.From, if the field access is being performed on the function’s single parameter, its Expression will equal RowExpression.Row (a.k.a. ItemExpression.Item—both point to exactly the same value). So, to check whether the given node represents a FieldAccess that reads a field on the function’s input record, check that both the node’s Kind = "FieldAccess" and its Expression = RowExpression.Row.

Field access isn’t always performed directly on the input record, so Expression won’t always equal RowExpression.Row. Instead, Expression can be a node of any type that can represent an expression that returns a record (e.g. a node for a function invocation, for another field access, or for an if statement).

For example:

RowExpression.From(each [Person][ID])

Outputs:

[
  Kind = "FieldAccess",
  MemberName = "ID",
  Expression = [
    Kind = "FieldAccess",
    MemberName = "Person",
    Expression = RowExpression.Row (a.k.a. ItemExpression.Item)
  ]
]

Above, notice that the root node represents accessing field ID on the output of an Expression that represents accessing field Person on the function’s input record.

Shortcut: If you want to check whether a node represents accessing a field whose name you already know on the record that is passed in as the function’s input, you can skip coding up a comparison against the node’s Kind, Expression and MemberName and instead simply compare the node to RowExpression.Column(expectedName).

RowExpression.From(each [ID]) = RowExpression.Column("ID") // true
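To make the explicit check described in this section concrete, here is a minimal sketch. IsInputFieldAccess is a hypothetical helper name (not part of the standard library), and the sketch assumes the ASTs have the shapes shown above.

```powerquery
let
    // Hypothetical helper: true when the node reads a field directly
    // on the record passed in as the function's single parameter.
    IsInputFieldAccess = (node as record) as logical =>
        node[Kind] = "FieldAccess" and node[Expression] = RowExpression.Row,

    // each [ID] reads a field directly on the input record.
    Direct = IsInputFieldAccess(RowExpression.From(each [ID])),

    // each [Person][ID] reads ID on the result of another field access.
    Nested = IsInputFieldAccess(RowExpression.From(each [Person][ID]))
in
    [Direct = Direct, Nested = Nested]
```

If the ASTs are as shown earlier, Direct should come back true and Nested false, because the root node of [Person][ID] has another FieldAccess node (not RowExpression.Row) as its Expression.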

Binary Operations

A common scenario is to compare one value with another.

RowExpression.From(each [ID] = 123)

Outputs:

[
  Kind = "Binary",
  Operator = "Equals",
  Left = (same FieldAccess node as the first field access example),
  Right = (same Constant node as we saw above)
]

Abstract syntax trees are, as their name states, tree structures. Up until now, the ASTs we’ve examined haven’t looked like trees because they haven’t had any children. With this last example, we now see a tree with more than one level!

The node above represents the indicated Operator being applied between the node on the Left and the one on the Right.

In the ASTs returned by RowExpression.From, from what I can tell Operator can be any of M’s binary operators, except for as, is or meta—specifically, they can be:

M Operator    Operator Node’s Operator Field Value
=             Equals
<>            NotEquals
>             GreaterThan
<             LessThan
>=            GreaterThanOrEquals
<=            LessThanOrEquals
+             Add
-             Subtract
*             Multiply
/             Divide
and           And
or            Or
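
As an illustration of how these Operator values might be used when folding, the sketch below turns a small AST into SQL-like text. Everything in it is an assumption made for illustration: the ToSql name, the SqlOperators mapping, the bracket-quoted column syntax and the single-quoted text literals are not taken from any particular data source. It handles only Constant nodes, FieldAccess nodes on the input record, and Binary nodes, and raises an error for anything else.

```powerquery
let
    // Assumed mapping from the Operator field values listed above to SQL-style symbols.
    SqlOperators = [
        Equals = "=",
        NotEquals = "<>",
        GreaterThan = ">",
        LessThan = "<",
        GreaterThanOrEquals = ">=",
        LessThanOrEquals = "<=",
        Add = "+",
        Subtract = "-",
        Multiply = "*",
        Divide = "/",
        And = "AND",
        Or = "OR"
    ],

    // Hypothetical translator covering three node kinds only.
    ToSql = (node as record) as text =>
        if node[Kind] = "Constant" then
            // Text constants are single-quoted; other values rely on Text.From.
            // (null and non-primitive constants are not handled in this sketch.)
            (if node[Value] is text
                then "'" & Text.Replace(node[Value], "'", "''") & "'"
                else Text.From(node[Value]))
        else if node[Kind] = "FieldAccess" and node[Expression] = RowExpression.Row then
            "[" & node[MemberName] & "]"
        else if node[Kind] = "Binary" then
            "(" & @ToSql(node[Left])
                & " " & Record.Field(SqlOperators, node[Operator]) & " "
                & @ToSql(node[Right]) & ")"
        else
            error Error.Record("Expression.Error", "Unsupported node kind: " & node[Kind])
in
    ToSql(RowExpression.From(each [ID] = 123 and [Name] <> "x"))
```

Assuming the AST has the shape described above (an And node whose children are the two comparisons), the result would be the text (([ID] = 123) AND ([Name] <> 'x')). Wrapping every Binary node in parentheses is a deliberate simplification: it avoids having to reason about operator precedence in the target language.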

If

In Power Query’s ASTs, an if statement holds three child nodes: one for the condition, one representing what to do if that condition evaluates to true, and the last representing what to do if it is false.

RowExpression.From(each if [ID] = 123 then "good" else "bad")

Outputs:

[
  Kind = "If",
  Condition = (same node as the binary operations example),
  TrueCase = (constant node for "good"),
  FalseCase = (constant node for "bad")
]

Invocation

This node type represents a function invocation.

RowExpression.From(each Number.ToText([ID]))

Outputs:

[
  Kind = "Invocation",
  Function = [Kind = "Constant", Value = Number.ToText],
  ArgumentList = { (same FieldAccess node as the first field access example) }
]

Function contains a node representing the particular function being invoked. Typically, after ensuring that it is of Kind = "Constant", you would look up its Value in #shared to get its name, then decide what to do based on that name (e.g. if you are translating an expression into the native query/request language of an external data source, you’d use the function’s name to determine the appropriate native query syntax to use).

ArgumentList is a list of zero to many AST nodes representing the arguments specified for the function invocation.

Important: The function is not actually invoked as part of outputting the AST, so the arguments included in its AST node have not been validated—the number of arguments provided and the types of their values have not been checked to determine whether they align with the function’s signature.
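
A minimal sketch of acting on an Invocation node follows. Note that it swaps in a different technique from the #shared lookup mentioned above: instead of resolving the function's name, it compares the function value directly against a short list of functions the connector is prepared to translate. This relies on a library function value comparing equal to itself. KnownFunctions, NativeFunctionName and the native names UPPER/LOWER are hypothetical, chosen only for illustration.

```powerquery
let
    // Hypothetical list of functions this sketch knows how to translate.
    KnownFunctions = {
        [Function = Text.Upper, NativeName = "UPPER"],
        [Function = Text.Lower, NativeName = "LOWER"]
    },

    // Returns the assumed native name for an Invocation node,
    // or null when the node is not a recognized invocation.
    NativeFunctionName = (node as record) as nullable text =>
        let
            IsConstantFunction =
                node[Kind] = "Invocation" and node[Function][Kind] = "Constant",
            Matches =
                if IsConstantFunction then
                    List.Select(KnownFunctions, (f) => f[Function] = node[Function][Value])
                else
                    {}
        in
            if List.IsEmpty(Matches) then null else Matches{0}[NativeName]
in
    NativeFunctionName(RowExpression.From(each Text.Upper([Name])))
```

Returning null for anything unrecognized leaves it to the caller to decide what to do, for example raising an error from the folding handler so that the operation is not folded.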

Unary

Cousin to binary operators are unary (one-operand) operators. M has three, two of which are supported by RowExpression.From. Below is an example:

RowExpression.From(each -[ID])

Outputs:

[
  Kind = "Unary",
  Operator = "Negative",
  Expression = (same FieldAccess node as the first field access example)
]

M Operator    Operator Node’s Operator Field Value
+             Bug: Currently not supported by RowExpression.From.
-             Negative (Note: Used to negate a single value; not for subtracting two values, which would instead be binary operator Subtract)
not           Not

Note: Microsoft’s documentation for RowExpression.From/ItemExpression.From states that an additional node type of NotImplemented can be returned. This statement is incorrect and should be removed soon.
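
With the node types above in hand, a whole AST can be walked recursively. The sketch below collects the names of the fields that a function reads directly from its input record. ReferencedFields is a hypothetical helper, and the child fields it visits for each Kind are simply the ones shown in the examples above.

```powerquery
let
    // Hypothetical helper: returns the distinct names of fields read
    // directly on the input record anywhere in the AST.
    ReferencedFields = (node) as list =>
        if node = RowExpression.Row then
            // A bare reference to the input record names no field.
            {}
        else if node[Kind] = "FieldAccess" and node[Expression] = RowExpression.Row then
            {node[MemberName]}
        else
            let
                // Child nodes to visit, per node kind.
                Children =
                    if node[Kind] = "FieldAccess" then {node[Expression]}
                    else if node[Kind] = "Binary" then {node[Left], node[Right]}
                    else if node[Kind] = "If" then {node[Condition], node[TrueCase], node[FalseCase]}
                    else if node[Kind] = "Invocation" then {node[Function]} & node[ArgumentList]
                    else if node[Kind] = "Unary" then {node[Expression]}
                    else {}  // Constant nodes have no children
            in
                List.Distinct(List.Combine(List.Transform(Children, @ReferencedFields)))
in
    ReferencedFields(RowExpression.From(each if [ID] = 123 then Text.Upper([Name]) else [Name]))
```

Assuming the AST shapes shown above, this would return {"ID", "Name"}. A list like this could be used, for example, to request only the needed columns from the data source.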


Simplified, Normalized AST

By nature, an abstract syntax tree is an abstract representation of the logic expressed by some code. It does not represent verbatim every detail contained in the code’s raw textual syntax. For example, white space from the original text source code is normally left out of an AST. Also, the effect of parentheses can be reflected by how nodes are arranged, so parentheses do not need to be directly included as nodes in an AST.

Simplified

In addition to being abstract, the ASTs output by RowExpression.From (a.k.a. ItemExpression.From) are simplified. This AST-outputting method is targeted at the use case of easily understanding the net effect of an expression that is applied to a record (which often represents a row), not the individual steps required to come up with that net effect. To this end, when a subset of steps in the input function’s body does not depend on the function’s input parameter, their value may be precomputed. Then, instead of that subset of steps being represented as individual nodes in the AST, they will be replaced with—simplified to—just a single constant node containing the result of their computation.

In general, if a complex expression (or portion thereof) can be reduced to a constant, this simplification will be reflected in the AST output by RowExpression.From.

For example:

  • each 2 * 5: Will be simplified to a single constant node of 10, not represented as a binary multiplication node with left = constant 2 and right = constant 5.
  • each Text.Upper("abc"): Will be represented as a single constant node reflecting the result of invoking the function (i.e. "ABC"), not as a function invocation node. (In contrast, if the expression were each Text.Upper([ID]) then this simplification could not be applied. Instead, a node for the invocation with an argument node representing an ID field access would be used.)
  • let A = 123 in RowExpression.From(each A): Identifier (variable) access will be performed before the AST is output, resulting in just a single constant node of 123.
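
One way to check these three cases yourself is to compare each output with a hand-written Constant node, as in the sketch below. The field names in the result record are arbitrary, and the comparison assumes that a Constant node carries only the Kind and Value fields, as described earlier.

```powerquery
let
    A = 123
in
    [
        // 2 * 5 should collapse to a single constant node of 10.
        Arithmetic = RowExpression.From(each 2 * 5) = [Kind = "Constant", Value = 10],

        // The invocation has no dependency on the input, so it should be precomputed.
        Invocation = RowExpression.From(each Text.Upper("abc")) = [Kind = "Constant", Value = "ABC"],

        // The variable reference should be resolved to its value.
        Variable = RowExpression.From(each A) = [Kind = "Constant", Value = 123]
    ]
```

If the simplification behaves as described, every field in the resulting record should be true.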

The abstract syntax trees output by RowExpression.From (a.k.a. ItemExpression.From) are focused on helping you understand the end result of what the original expression author is attempting to achieve. In the context of a scenario like building a custom connector, you’re (likely) not interested in where a particular value came from (was it hard-coded, via a variable reference, or returned as the output of a function like Number.FromText) but rather what the ultimate value actually is. The fact that these ASTs are simplified in this regard shortens the work that you otherwise would be required to do. These ASTs allow you to “cut to the chase” instead of requiring that you calculate the simplified values yourself.

Normalized “Row” References

As covered in the Field Access section, RowExpression.From normalizes any reference to the function’s single parameter (regardless of its name) to an AST node that is equal to RowExpression.Row (a.k.a. ItemExpression.Item). This normalization saves you from needing to perform extra work to track that parameter’s name.

Each of the following expressions will be represented by exactly the same AST.

each [ID]
(_) => [ID]
(x) => x[ID]
(myVeryImportantRow) => myVeryImportantRow[ID]

(Remember that each is just a shortcut for defining a single-parameter function and that RowExpression.From is agnostic as to the input function’s parameter name.)
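
The claim that these four expressions share one AST can be checked with a short sketch like the following. The variable names Reference, Variants and AllSame are arbitrary.

```powerquery
let
    // AST for the each form, used as the reference.
    Reference = RowExpression.From(each [ID]),

    // The other three spellings of the same function.
    Variants = {
        (_) => [ID],
        (x) => x[ID],
        (myVeryImportantRow) => myVeryImportantRow[ID]
    },

    // True only if every variant yields an AST equal to the reference.
    AllSame = List.AllTrue(
        List.Transform(Variants, (f) => RowExpression.From(f) = Reference)
    )
in
    AllSame
```

If the normalization works as described, AllSame should be true.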

Conclusion

RowExpression.From/ItemExpression.From enables you to learn about what single-parameter functions do. A typical use case for this function is when building a custom connector: This method is applied in the appropriate query folding handler to get the details of the “row expression” defined by the user when they invoke a table function like Table.AddColumn or Table.SelectRows so that the connector can translate the user’s intention into the data source’s native query/request language.

Keep in mind that not every Power Query function that outputs ASTs may simplify or normalize them as described above. In other contexts, it may be desirable to know more about the individual steps used to come up with values, so AST-outputting functions aimed at those scenarios may not perform the above simplifications or normalization.


Thanks to Jorge H. for his help in understanding some of the nuances of this function.
