Power Query M Primer (Part 4): Variables & Identifiers

, , , ,

Up until now, we’ve used variables without talking much about their specifics. Since we’re operating under the assumption that you have at least a little experience with another programming or scripting language, how we’ve used variables has probably made sense.

However, there might be differences—perhaps even significant differences—between how variables work in other languages you’ve used with and how they behave in the Power Query M language. Let’s spend some time exploring variables and related concepts to clear up any lurking confusion and position you to take full advantage of what M offers.

We’ll start with a brief recap of the main unit where we define variables: the let expression. Then, we’ll talk about how variables (and other things) are identified. Related to identifiers is scope, so we’ll cover that, too. Next time, we’ll expand  our understanding of how variables work by learning about M’s paradigm.

Series Index

Variables

let’s Recap

First, a quick refresher: In part one, we talked about let expressions. A let expression allows a value to be computed using one or more intermediate expressions.

Instead of writing one big long complex expression, let allows you to break what you’re trying to do into intermediate steps which produce values that you assemble into the value returned by the let expression.

A variable is used for each intermediate step as well as for functions you might want to define.

1
2
3
4
5
6
7
8
let
  SalesTotal = 100 + 15 + 275 + 25,
  CommissionRate = 0.2,
  CalculateCommission = (sales, rate) => sales * rate,
  Commission = CalculateCommission(SalesTotal, CommissionRate),
  Result = Commission
in
  Result

Each variable has a name (of course!). Let’s think about those names….

Identifiers

Identifier Names

If you’ve ever glanced at the M scripts generated by tools like Microsoft Excel or Microsoft Power BI, you’ve probably noticed they name variables using a syntax different from what we’ve been using. For the most part, we written variable names like VariableName while GUI-generated scripts tend to surround them in some extra syntax, like #”VariableName”. What’s the difference?

We’ve been coding up variable names as regular identifiers. The hash-quotes style names used by GUI tools are called quoted identifiers.

Identifier Types

Regular Identifiers

A regular identifier must start with either a letter or an underscore, then can be followed by additional characters. Regular identifiers must not contain spaces, M keywords and certain other special characters.

Identifier Valid/Invalid?
SalesTotal Valid
_Total Valid
Sales Total Invalid (contains a space)
Sales/* all sales */Total Invalid (comments aren’t allowed in the middle of regular identifiers)
20PercentRate Invalid (must start with letter or underscore)

Quoted Identifiers

A quoted identifier can start with and contain any characters, including spaces. Inside the quotes surrounding a quoted identifier, syntax that normally has special meaning in M, such as keywords and comments, is interpreted as literal text that becomes a part of the identifier name. Double quotes can even be used inside quoted identifiers—they just have to be escaped by doubling them.

Identifier Valid/Invalid?
#”SalesTotal” Valid
#”Sales Total” Valid (space allowed)
#”Sales/* all sales */Total” Valid (important: /* all sales */ becomes literal text inside the variable name; it’s not treated as a comment because it is inside a quoted identifier)
#”20PercentRate” Valid
#”Possible “”Pretend”” Values” Valid (references the variable named Possible “Pretend” Values)

I’ll let you in on a secret, but please don’t try this at home: Using quoted identifiers, it’s even possible to use an empty string as an identifier!

1
2
3
4
let
  #"" = 1
in
  #""

In most programming languages (at least, the ones I’ve worked with), spaces are atypical in variable/identifier names. In M, the ability to use them comes in handy. You see, the names in the step list that GUI tools display are the names of the variables from the root let expression. Spaces in those names make the list easier to read.

No Spaces With Spaces
Step List - no spaces between words in step names Step List - with spaces between words in step names

Generalized Identifiers

A third identifier type, the generalized identifier is a cross between regular and quoted identifiers. This identifier type is only allowed between square brackets, such as when referencing a record field or column name. Since the square braces make it possible to determine where the identifier starts and end, certain characters and keywords that otherwise can only be used in quoted identifiers are allowed. For example, generalized identifiers allow spaces between words (though spaces before the first word or after the last word are ignored).

Identifier Valid/Invalid?
[First Name] Valid (#”First Name” not required because of the square brackets)
[First Name ] Valid (trailing space on field name ignored so refers to same field as the above)

Identifying Things, not Just Variables

“Whoa…wait a minute!” you might be saying. “We’ve been talking about identifying variables. Record field and column names aren’t variables!” You are correct! Identifiers identify thingsincluding variables but not just variables. What kind of thing they identify is determined by the context in which they are used. So far, for the most part, we’ve used them to identify variables; however, when used inside the context of square braces, they identify record fields or table columns. (We’ll talk more about records and tables in a later post.)

Not the Actual Thing

Identifiers identify. They’re not the thing they identify, just a references to it. When the same identifier name is represented using different syntax styles, all references refer to the same thing. For example, the regular identifier Weight and the quoted identifier #”Weight” contain the same exact name so refer to the exact same variable.

1
2
3
4
let
  Weight = 50
in
  #"Weight"

Scope

A sub-expression can access all identifiers defined in the expression containing its definition and in any parents of that expression, with one exception: out of the box, it can’t reference itself.

1
2
3
4
5
6
7
8
9
10
let
  a = 10, // this expression can reference b, c
  b =
    let
      Result = a + 25 // can reference a, c
    in
      Result,
  c = 15 // can reference a, b
in
  a + b + c // can reference a, b, c
1
2
3
4
5
[
  A = 1, // can reference B (including B[BB]), C
  B = [ BB = A ], // can reference A, C
  C = 2 // can reference A, (including B[BB])
]

As we talked about in part 3, in order for an expression to reference the identifier it’s assigned to, simply prepend an @ in front of the reference. Self-referencing is handy when defining recursive functions.

1
2
3
4
5
let
  SumConsecutive = (x) => if x <= 0 then 0 else x + @SumConsecutive(x - 1),
  Result = SumConsecutive(4)
in
  Result

Next Time

Have you noticed how M scripts define variables but never seem to change a variable’s value after it’s initially set? We should talk more about this and topics like evaluation order and lazy evaluation soon…like maybe next time.

Until then, happy M coding!

15 thoughts on “Power Query M Primer (Part 4): Variables & Identifiers

  1. Richard

    Great post, thank you!

    Can the following be rewritten to specify the table using strTable thus avoiding the need to use TableQuery?

    let
    strTable = "Root Cause",
    TableQuery = #"Root Cause",
    
    RScript = R.Execute("write.csv(dataset,""C:\\Users\\RichardLove\\Desktop\\" & strTable & ".csv"")",[dataset=TableQuery]),
    out = RScript
    in
    out
    

    Thanks again,
    Richard

    Reply
  2. Jeff Weir

    Hi Ben. Another great intro. Unfortunately I’m not understanding what you mean by any of the comments in this bit though. Can you restate it another way or further elaborate?

    In your code:

    let
      a = 10, // can reference b, c
      b =
        let
          Result = a + 25 // can reference a, c
        in
          Result,
      c = 15 // can reference a, b
    in
      a + b + c // can reference a, b, c
    
    [
      A = 1, // can reference B (including B[BB]), C
      B = [ BB = A ], // can reference A, C
      C = 2 // can reference A, (including B[BB])
    ]
    
    Reply
  3. Jeff Weir

    Ahhh…after referring to the PQ Formula Reference Spec, I see that your comments “// can reference b, c” are just referring to the parts of the environment that A can reference, and can be referenced from.

    Reply
  4. Alved

    Yesterday, I had a situation where a data source table in Excel had a space at the end of a column header.
    When I referred to that column between square brackets without the space at the end, I got an error. Would this be a situation outside of what is explained at:

    [First Name ]

    Valid (trailing space on field name ignored so refers to same field as the above)
    Thanks, Ben!

    Reply
    1. Ben Gribaudo Post author

      Leading and trailing spaces will be ignored when the text between the square brackets is converted to an identifier, so [First Name ] will be parsed as referring to field “First Name“. Since the field you’re trying to access has a space at the end of its name, this identifier (which doesn’t have the trailing space) won’t find it. Instead, you could try something like Source[#"First Name "].

      Example:

      let
          Source = #table({"First Name "}, {{"Joe"}})
      in
          // Source[ First Name ] // Expression.Error: The column 'First Name' of the table wasn't found.
          Source[#"First Name "] // works
      

      Does that help?

      Reply
  5. JJ

    Hi Ben,

    Liked your post so much! so simple to follow along.

    I understand that power query M uses EACH for looping like items in a list. But I am wondering whether power query M can simply do for loop using counter like other coding languages?

    Reply
    1. Ben Gribaudo Post author

      Hi JJ! Thanks for your kind words. No, there is no direct for loop in M. Keep in mind that the code inside a for/foreach loop typically modifies variable(s), which isn’t something M allows. Even if M supported for, it wouldn’t be useful for a loop containing a counter because the loop iterations wouldn’t be able to update the counter variable.

      Reply
  6. JP

    Awesome Ben!
    This is what I am looking for.

    BIG HELP for the newbie just like me.

    Hope you can have YT tutorial series.

    Thank you so much!

    Reply
    1. Ben Gribaudo Post author

      By “rewrite”, do you mean change, as it to modify the value of an existing variable?

      Like this (pseudo-code):

      declare x = 10
      set x = 20 (changing x so that it is now set to 20)

      If so, no. In M, variables cannot be modified once set. However, with streamed data types (table, list and binary), the variable does not hold actual row/list item/binary byte data; instead, that data is produced/streamed on demand, so can be different each time the variable is accessed.

      Reply
  7. Jorge

    Hello Ben, I am trying to run the following expression in Power Query,

    I have a text parameter called StockSymbol but I cannot make the formula to recognize it each time I change the parameter to another value. If I has a fixed column containing data about a stock [AZN] it works, but when using the parameter it collapses, hope you can give it a look please.

    ([AZN] -AddedIndex1{[Index]}[AZN])/[AZN]) // it works

    (StockSymbol -AddedIndex1{[Index]} StockSymbol)/StockSymbol // it does not work

    Thank you!

    Reply
    1. Ben Gribaudo Post author

      Hi Jorge,

      When accessing a field’s value using square brackets, the field’s name must be hard-coded; a variable reference cannot be used for the field name. So [AZN] works but something like let StockSymbol = "AZN" in [StockSymbol] doesn’t work as expected.

      If you’re using a variable to identify the record field of interest, instead of using square brackets, try using Record.Field to access the field’s value. Example: Record.Field(_, StockSymbol)

      Reply

Leave a Reply to Ben Gribaudo Cancel reply

Your email address will not be published. Required fields are marked *