Power Query M Primer (Part 4): Variables & Identifiers

, , , ,

Up until now, we’ve used variables without talking much about their specifics. Since we’re operating under the assumption that you have at least a little experience with another programming or scripting language, how we’ve used variables has probably made sense.

However, there might be differences—perhaps even significant differences—between how variables work in other languages you’ve used with and how they behave in the Power Query M language. Let’s spend some time exploring variables and related concepts to clear up any lurking confusion and position you to take full advantage of what M offers.

We’ll start with a brief recap of the main unit where we define variables: the let expression. Then, we’ll talk about how variables (and other things) are identified. Related to identifiers is scope, so we’ll cover that, too. Next time, we’ll expand  our understanding of how variables work by learning about M’s paradigm.

Series Index

Variables

let’s Recap

First, a quick refresher: In part one, we talked about let expressions. A let expression allows a value to be computed using one or more intermediate expressions.

Instead of writing one big long complex expression, let allows you to break what you’re trying to do into intermediate steps which produce values that you assemble into the value returned by the let expression.

A variable is used for each intermediate step as well as for functions you might want to define.

let
  SalesTotal = 100 + 15 + 275 + 25,
  CommissionRate = 0.2,
  CalculateCommission = (sales, rate) => sales * rate,
  Commission = CalculateCommission(SalesTotal, CommissionRate),
  Result = Commission
in
  Result

Each variable has a name (of course!). Let’s think about those names….

Identifiers

Identifier Names

If you’ve ever glanced at the M scripts generated by tools like Microsoft Excel or Microsoft Power BI, you’ve probably noticed they name variables using a syntax different from what we’ve been using. For the most part, we written variable names like VariableName while GUI-generated scripts tend to surround them in some extra syntax, like #”VariableName”. What’s the difference?

We’ve been coding up variable names as regular identifiers. The hash-quotes style names used by GUI tools are called quoted identifiers.

Identifier Types

Regular Identifiers

A regular identifier must start with either a letter or an underscore, then can be followed by additional characters. Regular identifiers must not contain spaces, M keywords and certain other special characters.

Identifier Valid/Invalid?
SalesTotal Valid
_Total Valid
Sales Total Invalid (contains a space)
Sales/* all sales */Total Invalid (comments aren’t allowed in the middle of regular identifiers)
20PercentRate Invalid (must start with letter or underscore)

Quoted Identifiers

A quoted identifier can start with and contain any characters, including spaces. Inside the quotes surrounding a quoted identifier, syntax that normally has special meaning in M, such as keywords and comments, is interpreted as literal text that becomes a part of the identifier name. Double quotes can even be used inside quoted identifiers—they just have to be escaped by doubling them.

Identifier Valid/Invalid?
#”SalesTotal” Valid
#”Sales Total” Valid (space allowed)
#”Sales/* all sales */Total” Valid (important: /* all sales */ becomes literal text inside the variable name; it’s not treated as a comment because it is inside a quoted identifier)
#”20PercentRate” Valid
#”Possible “”Pretend”” Values” Valid (references the variable named Possible “Pretend” Values)

I’ll let you in on a secret, but please don’t try this at home: Using quoted identifiers, it’s even possible to use an empty string as an identifier!

let
  #"" = 1
in
  #""

In most programming languages (at least, the ones I’ve worked with), spaces are atypical in variable/identifier names. In M, the ability to use them comes in handy. You see, the names in the step list that GUI tools display are the names of the variables from the root let expression. Spaces in those names make the list easier to read.

No Spaces With Spaces
Step List - no spaces between words in step names Step List - with spaces between words in step names

Generalized Identifiers

A third identifier type, the generalized identifier is a cross between regular and quoted identifiers. This identifier type is only allowed between square brackets, such as when referencing a record field or column name. Since the square braces make it possible to determine where the identifier starts and end, certain characters and keywords that otherwise can only be used in quoted identifiers are allowed. For example, generalized identifiers allow spaces between words (though spaces before the first word or after the last word are ignored).

Identifier Valid/Invalid?
[First Name] Valid (#”First Name” not required because of the square brackets)
[First Name ] Valid (trailing space on field name ignored so refers to same field as the above)

Identifying Things, not Just Variables

“Whoa…wait a minute!” you might be saying. “We’ve been talking about identifying variables. Record field and column names aren’t variables!” You are correct! Identifiers identify thingsincluding variables but not just variables. What kind of thing they identify is determined by the context in which they are used. So far, for the most part, we’ve used them to identify variables; however, when used inside the context of square braces, they identify record fields or table columns. (We’ll talk more about records and tables in a later post.)

Not the Actual Thing

Identifiers identify. They’re not the thing they identify, just a references to it. When the same identifier name is represented using different syntax styles, all references refer to the same thing. For example, the regular identifier Weight and the quoted identifier #”Weight” contain the same exact name so refer to the exact same variable.

let
  Weight = 50
in
  #"Weight"

Scope

A sub-expression can access all identifiers defined in the expression containing its definition and in any parents of that expression, with one exception: out of the box, it can’t reference itself.

let
  a = 10, // can reference b, c
  b =
    let
      Result = a + 25 // can reference a, c
    in
      Result,
  c = 15 // can reference a, b
in
  a + b + c // can reference a, b, c
[
  A = 1, // can reference B (including B[BB]), C
  B = [ BB = A ], // can reference A, C
  C = 2 // can reference A, (including B[BB])
]

As we talked about in part 3, in order for an expression to reference the identifier it’s assigned to, simply prepend an @ in front of the reference. Self-referencing is handy when defining recursive functions.

let
  SumConsecutive = (x) => if x <= 0 then 0 else x + @SumConsecutive(x - 1),
  Result = SumConsecutive(4)
in
  Result

Next Time

Have you noticed how M scripts define variables but never seem to change a variable’s value after it’s initially set? We should talk more about this and topics like evaluation order and lazy evolution soon…like maybe next time.

Until then, happy M coding!

Leave a Reply

Your email address will not be published. Required fields are marked *