Power Query M Primer (Part 10): Types—List, Record

, , , ,

You might be inclined to skip learning about list and record and jump directly to table. After all, as far as types that hold other values go, you mostly work with table. A table’s data is made up of simple values, so it makes since that you learned about them (parts 6, 7, 8 and 9). However, list and record might feel like types you won’t normally use, so why not skip them?

Turns out, a table behaves in ways like both a list and a record. If you want to leverage all that table offers, learning about these types is foundational. Besides, as you work with M, you just might find that list and record are more useful in their own right than you first thought.

Oh, did I mention that there’s a surprising relationship between let expressions and type record? All the more reason not to skip over it!

Series Index

list

Type list stores exactly what its name implies: a list of values.

{ 1, 2, 5, 10 }
{ "hello", "hi", "good bye" }

As you might expect, a list can be empty:

{ }

Out of the box, a list’s values don’t need to be of the same type.

{ "hello", 1, 6, true, #date(2018, 5, 6) }

(It’s possible to create lists that are constrained to only hold values of a certain type, but that’s an advanced scenario which requires knowledge of M’s type system, so we won’t cover it here.)

Since a list can contain values and a list is itself a value, a list can contain lists which in turn can contain lists and so forth.

{ { "first name", "last name", "children" }, { "Sue", "Smith", { "Sarah", "Sharon" } } }

If you’re interested in an increasing sequential list of whole numbers, .. is a handy shortcut, saving you from needing to literally type out the entire list.

{ 1..5 } // equivalent to { 1, 2, 3, 4, 5 }
{ -2..0 } // equivalent to { -2, -1, 0 }
{1, 6..8, 15 } // equivalent to {1 , 6, 7, 8, 15 }

This shortcut only works to produce lists of increasing values. Attempting to use it to produce a sequence that counts down will leave you with an empty list.

{ 5..1 } // { } -- outputs an empty list

Multiple lists can be concatenated using the combination operator (&).

{ 1, 2 } & {3, 4 } // { 1, 2, 3, 4 }
{ 1, 2 } & { 2, 3} // { 1, 2, 2, 3 }

Two lists are equal if they contain the same values in the same order. Not equals works just as you’d expect (exactly the opposite of equals).

{ 1, 2 } = { 1, 2} // true
{ 1, 2 } = { 1, 2, 3} // false
{ 4, 5 } = { 5, 4 } // false -- same values but different order
{ 2, 4 } = { 2, 4 } // true
{ 2, 4 } <> { 2, 4 } // false

Greater than (>) and less than (<) comparisons are not supported on lists.

Item Access

List items can be accessed using the positional index operator. Simply take a reference to the list of interest and append the index of the desired list item surrounded by curly braces:  SomeList{SomeIndex}.

In M, list indexes (or indices, if you prefer) are 0-based, meaning that the first list item is at index 0, the second item is at index 1, and so forth. So, to access the first item, use an index of zero.

Assuming Values = { 10, 20, 30 }, the following expressions produce the indicated output:

Values{0} // 10 -- value of 1st item in list
Values{1} // 20 -- value of 2nd item in list
Values{2} // 30 -- value of 3rd item in list

If an attempt is made to access an index larger than what’s in the list, an error is returned.

Values{3} // Expression.Error - There weren't enough elements in the enumeration to complete the operation.

If, instead, you’d rather receive a null when a too large index is used, append a ? to the curly braces.

Values{3}? // null

Lazy Evaluation

Lists are evaluated lazily. The below list doesn’t produce an error, even though the value of an item in it is defined as an expression that raises an error. We didn’t asked for the value of that item, so no attempt was made to generates its value. Since the error-raising expression was never invoked, no error was raised.

let
  Data = { 1, 2, error "help", 10, 20 }
in
  List.Count(Data) // 5

All we asked was “how many items are in the list?,” and that’s all M figured out. Whether or not they are all valid is a different question, one we didn’t ask and one M’s mashup engine didn’t try to answer.

Also, when values are needed, M’s laziness means that it only evaluates as many list items as are necessary to produce the requested output. Using Data from the above example, the following expressions do not raise errors. Neither needs the value of index 2, so the error that would be raised if that item’s expression were evaluated isn’t raised.

List.Sum(List.FirstN(Data, 2)) // 3 -- sum up just the first two items
List.Sum(List.LastN(Data, 2)) // 30 -- sum up just the last two items

Library Highlights

As you might expect, the standard library includes a number of methods for working with lists. They cover tasks from counting items to searching for text, from doing math on a list (sum, product, etc.) to transforming a list (e.g. remove items, replace items, reverse, etc.), from generating statistics (e.g. average, max, standard deviation) to testing membership (like “does it contain this value?” or “do all values in the list cause the provided function to return true?”), as well as supporting set operations (union, intersect, difference, zip) and sorting. There’s even a family of functions for generating lists of values of other types (handy, say, if you want a sequential list of datetimes or durations or maybe a list of random numbers, etc.).

record

A record allows a set of named fields to be grouped into a unit.

[ FirstName = "Joe", LastName = "Smith", Birthdate = #date(2010, 1, 2) ]

Technically, a record preserves the order of its fields. However, as we’ll see in a moment, field order isn’t considered when comparing records, so mostly this preservation of field order is a convenience for humans (e.g. fields will be output on screen in the same order you defined them, making it easier for you to visually locate data items of interest).

An empty record contains no fields (no surprise!).

[ ]

Equality is determined by field name and value. Field position is not considered.

[ a = 1, b = 2] = [a = 1, b = 2] // true
[ a = 1, b = 2] = [b = 2, a = 1 ] // true -- same field names and values, even though ordering is different
 
[ a = 1 ] = [ A = 1 ] // false -- different field names
[ a = 1 ] = [ a = 2 ] // false -- same filed names but different values
[ a = 1 ] <> [ A = 1 ] // true

Records can be merged with the combination operator.

[ a = 1 ] & [ b = 2 ] // [ a = 1, b = 2]

If the same field name is present in both merge inputs, the value associated with the field from the on the right is used.

[ a = 1 ] & [ a = 10 ] // [ a = 10 ]

Field Access

Remember how lists use {index} to access list items? With records, something similar is used—the lookup operator, which consists of the field name inside square brackets: SomeList[SomeField]

If Value = [ Part = 1355, Description = "Widget", Price = 10.29, Internal Cost = 8.50 ] then the following expressions will return the noted values:

Value[Part] // 1355
Value[Description] // "Widget"
Value[Price] // 10.29
Value[Internal Cost] // 8.50

Similar to list, appending a ? to the lookup operator changes its not-found behavior from an error to returning null (technically, this is called “performing an optional field selection”).

Value[NonExistentField]  // error - Expression.Error: The filed 'NonExistentField' of the record wasn’t found.
Value[NonExistentField]? // null

Within a record, the expression for a field value can reference other fields.

[
  FirstName = "Sarah",
  LastName = "Smith",
  FullName = FirstName & " " & LastName
]

A field’s expression can even reference itself if its name is proceeded by the scoping operator (@).

“Why would a field want to reference itself?” you might ask. This behavior may not seem intuitive in the context of a field containing a data value. However, the ability to self-reference comes in handy when the value is a function because it allows the function to be recursive.

[
  AddOne = (x) => if x > 0 then 1 + @AddOne(x - 1) else 0,
 AddOneThreeTimes = AddOne(3)
][AddOneThreeTimes] // 3

Projection

In addition to square brackets being used to select record fields, they can also be used to perform record projection—that is, reshaping a record to contain fewer fields. Below are a couple examples (assume that Source = [ FieldA = 10, FieldB = 20, FieldC = 30 ]):

Source[[FieldA], [FieldB]] //  [ FieldA = 10, FieldB = 20 ] -- FieldC was removed
Source[[FieldC]] // [ FieldC = 30 ] -- all fields except C were removed

Similar to when [] are used for field selection, with projection, referencing a non-existent field causes an error. However, if a ? is appended, any non-existent fields referenced by the projection expression will be added to the output with their values set to null.

Source[[FieldA], [FieldD]] // error - Expression.Error - The field 'FieldD' of the record wasn't found.
Source[[FieldA], [FieldD]]? // [ FieldA = 10, FieldD = null]

Relaxed Quoting Rules

Within the square bracket syntax of record initialization, selection and projection, field names have relaxed quoting rules. Normally, to start an identifier with a number, to use a M language keyword as an identifier or to include whitespace in the middle of an identifier, the identifier must be quoted. However, in these situations quoting is optional for field name identifiers inside square brackets.

Outside of square brackets, the identifier Street Address needs quoting because it contains a space and try needs quoting because it’s a keyword. Below, inside the square brackets, quoting these identifiers is optional:

[#"try" = true, #"Street Address" = "123 Main St."]
[try = true, Street Address = "123 Main St."] // identical in effect to the preceding
 
SomeRecord[#"Street Address"]
SomeRecord[Street Address] // identical in effect to the preceding
 
SomeRecord[#"try"]
SomeRecord[try] // identical in effect to the preceding

Note, however, that M assumes whitespace occurring at the start or end of an unquoted field name can be ignored and so excludes it from the field name. If, for some reason, you want leading or trailing whitespace to be a part of a field name, you’ll need to quote it.

SomeRecord[ Street Address ] // accesses field "Street Address"
SomeRecord[#" Street Address "] // accesses field " Street Address "

Lazy Evaluation & Value Fixing

Like list, record is lazy. If a value isn’t needed, it isn’t evaluated.

[ Price = 10, Quantity = error "help"][Price] // 10

Above, since Quantity wasn’t needed, its value wasn’t evaluated. Since its value wasn’t evaluated, no error was raised.

When a field is evaluated for the first time, the resulting value or error is fixed as the value for that field. The field’s expression is only executed one. Its output is then cached. The cached value or error is returned each subsequent time the field is accessed.

[ Price = GetValueFromRemoteServer() ]

Imagine that the first time Price is accessed, the remote server returns 10. Later on while your mashup is still running, the record’s Price field is accessed again. Perhaps by this point in time, invoking GetValueFromRemoteServer() would return 11. However, that method is not re-executed. Instead, the value cached when the field was first accessed (10) is returned.

If, instead, when Price was first accessed, GetValueFromRemoteServer() raised an error due to a temporary communications glitch, that same error will be re-raised each subsequent time Price is accessed, even if by the time the subsequent access occurs, the glitch is resolved and GetValueFromRemoteServer() would return a non-error value if it were invoked.

This value fixing (or caching) provides consistency. Thanks to it, you know that a field’s value will always be the same throughout your mashup’s execution.

Value caching is not shared across record instances, even if the records have identical fields and field value expressions. If your code causes the record [ Price = GetValueFromRemoteServer() ] to be generated twice and Price is accessed on both instances, each will separately invoke GetValueFromRemoteServer() once and cache the value returned. If the value returned is different between the two invocations, the two records will have different values for Price.

If the record you are working with is assigned to a variable, each time you access that variable, you’ll access the same record instance. However, if instead you access a record multiple times by invoking an expression that retrieves it from an external source (e.g. a database or web service), each retrieval may technically return a different record instance. If it’s important to be sure that you are always working with the same record instance, retrieve it once then save it to a variable or, in the case of a list of records, buffer the list.

Library Highlights

In the standard library, you’ll find several functions for working with records, including methods to add, rename, reorder and remove fields as well as to transform field values. There is also a method returning a list of the record’s field names (with field order preserved) and a similar method returning field values.

Dynamic Operations

Above, we used the lookup operator to access field values by hard-coded names. What if, instead, we wanted to use programmatic logic to choose the field to access? The following doesn’t work because field names inside square brackets must be strings; variable references aren’t allowed.

let
  Item = [Name = "Widget", Wholesale Price = 5, Retail Price = 10],
  PriceToUse = "Wholesale Price"
in
  Item[PriceToUse] // doesn’t work--doesn’t return the value of field "Wholesale Price"

To solve this dilemma, the standard library comes to the rescue. Record.Field is the dynamic equivalent of the lookup operator. Record.FieldOrDefault works like a dynamic lookup operator followed by a question mark, with the added bonus of optionally allowing you to specify the value to be returned if the field name doesn’t exist

Record.Field(Item, PriceToUse) // returns 5

If, instead, PriceToUse is set to “Sale Price” (a value which doesn’t correspond with a fieldname), then:

Record.Field(Item, PriceToUse) // error - Expression.Error: The field 'Sale Price' of the record wasn’t found.
Record.FieldOrDefault(Item, PriceToUse) // returns null
Record.FieldOrDefault(Item, PriceToUse, 0) // returns 0

Similarly, if we want to dynamically perform projection, Record.SelectFields is our go-to. There are also standard library functions to remove fields (instead of projecting by listing the fields desired, specify the undesired fields and a new record containing all of the other fields will be returned) and to reorder fields (handy in those few cases where field order matters).

let Sugar

Ready for a surprise? A let expression is, in essence, syntactic sugar for an implicit record expression.

let
  A = 1,
  B = 2,
  Result = A + B
in
  Result

Is equivalent to:

[
  A = 1,
  B = 2,
  Result = A + B
][Result]

It might be good to pause and ponder this for a moment. This fact means that what we know about how records work also applies to let expressions and vice versa.

For example, we know that a record field’s value is computed on first access then cached. Since let is in essence a record expression, this same immutability rule applies to it: a let variable’s expression will be evaluated on first access then its value will be cached. However, for let expressions, we know there’s one exception to immutability, which occurs when streaming comes into play. This same exception must also apply to records…it must because let and record share the same behavior.

Next Time

Did you notice that record looks like it would be great for holding a row of data? If you wanted to store multiple records, each representing a row of data, in a single variable, you could put those records in a list. Hm…sounds like we’re getting close to what a table is!

Good news: tables are up next time! We’ll learn about how they share behavior with both list and record while at the same time offering more than we’d get if we tried to simulate a table using a list of records.

Looking forward to it! Until then, happy M coding!

26 thoughts on “Power Query M Primer (Part 10): Types—List, Record

  1. Daniel Herce

    Hi Ben
    You’re doing a great job in your articles explaining the “M” language fundamentals. Waiting for the next one!
    Daniel

    Reply
  2. Steve

    Tremendous series – wish I’d discovered this weeks ago!
    Really appreciate the work you’ve put into this and looking forward to further instalments.

    Reply
  3. Glenn Cogar

    Hi Ben,

    thanks for the great info. I have a question about something I’m confused about in Power BI w.r.t. ‘lists’.

    I have a Power BI table and 2 of the columns have ‘List’ as their row content. Clicking on any cell in a row displays the list of values in that cell for that row.

    I have been trying to compare the content of 2 lists using the M statement (and creating this as a new custom column)

    New column = List.Intersect([column1] as list, [column 2] as list)

    but it gives the following error

    Expression.Error: The specified comparer doesn't have the correct format.
    Details:
        List

    I thought I was working with 2 lists but maybe they are not lists according to M.

    Do you know how I create a list from the values in a column? all the examples I have seen just use hard coded values e.g.

    List.Intersect({ {0..3}, {1..4} })
    Reply
    1. Ben Gribaudo Post author

      Hi Glen,

      You are close! 🙂 List.Intersect takes a list containing the lists to intersect as its first argument. To use it, wrap the references to your two columns (which each evaluate to a list) inside a list—something like List.Intersect({ [column1], [column2] })).

      Example:

      let
          Source = #table({ "column1", "column2" }, { { {1,2}, {2,3} }}),
          #"Added Intersection" = Table.AddColumn(Source, "Intersection", each List.Intersect({ [column1], [column2] }))
      in
          #"Added Intersection"

      Hope this helps! Happy coding!

      Reply
  4. Walter

    Hi, Ben. Please, I have a problem.
    I am just now using Power BI, and I have a problem with the M Language.
    There is a query in which I added a column. In this column I want to add the data found in another table (TableA), in a given field (FieldB) with the Index = 0 (this is another field in this last table).
    I was trying to bring that data with:
    = Record.Field (TableA [Index] = 0, “FieldB”) but I get an error.
    What I can do?
    Excellent explanation!!!
    (Thank you very much and sorry for my bad English).

    Reply
    1. Ben Gribaudo Post author

      Hello Walter!

      Thanks for the question! Would defining the new column something like Table.SingleRow(Table.SelectRows(TableA, each [Index] = 0))[FieldB] do what your looking for? (Or, if the row matching [Index] = 0 is also physically at index 0 in the other table: Table.First(TableA)[FieldB].)

      Ben

      Reply
  5. sam

    Hi Ben,

    Thanks for this information.

    I have used the Number.From.. to get list of dates from 2014 till date in Power BI and I am now having problem as Power BI says this is too large or no space for all list. Is there any other way around this? I dont really need this record for everyday as my analysis is monthly based.

    Please help,
    Sam

    Reply
    1. Ben Gribaudo Post author

      Hi Sam,

      Does something like this help?

      let
          Start = #date(2014, 1, 1),
          Result = List.Dates(Start, Duration.Days(Date.From(DateTime.FixedLocalNow()) - Start), #duration(1, 0, 0, 0))
      in
          Result
      Reply
  6. theo

    Thanks Ben on a very nice blog.
    Im trying to use list to apply to my data problem and come across your blog among other contents in the internet but still was not able to resolve it. Basically i have list of attempt code in at least 6 columns (for now), and i need to have this compared to all other code per line to check how many combination have been used 6 times, 5 times, etc.

    sample data:

    Att1 Att2 Att3 Att4 Att5 Att6
    A    B    C    D    E    F
    A    B    C    D    E    H
    A    K    L    O    Q    S
    

    result: line 1: equals 2 on 5 duplicates (line 1 and line 2); 0 on other duplicates
    line 2: same as line
    line 3: 3 on 1 duplicate (counting line 1, 2 and 3); 0 on other duplicates

    is this a possibility?

    another option Im thinking, is create the possible 6, 5, etc combinations and count against each other. the problem is similar as i need to merge the columns of the combination, then count per row.
    also, the one i was able to do using unpivoting, grouping by and sum resulted in a very slow calculation and memory error.

    Thanks for checking.

    Reply
    1. Ben Gribaudo Post author

      Let’s see: Are you trying to compute, for a given cell, the number rows where the current column contains the same value as the current cell (so for the first row, column Attr3, the result would be 2, because that cell’s value [“C”] was used twice throughout Attr3) and then, for each row, count the number of times the just-computed value is greater than 1 (in the case of the first row, 5 cells contain values that are used multiple times in their respective columns, the count would be 5)?

      Reply
  7. Tom

    Hi Ben,

    Great blog, looking forward to reading your next post.
    Do you think you could do a post on Parameters, or point me in the right direction of a website that does a good explanation on how parameters work?

    Kind regards,
    Tom

    Reply
  8. David

    Hi Ben,

    I have thoroughly enjoyed the series. Thank you for all the effort to put it together.

    Here’s my challenge. I am trying to create a column (ListofNonNulls) that contains a list of the fields (for each row) which have non-null values .

    Mapping Table

    Account   Prop1   Prop2   Prop3   ListofNonNulls
    1         x       y       z       Prop1, Prop2, Prop3
    2         x       null    c       Prop1, Prop3
    3         a       null    null    Prop1
    ...
    

    The unique values in ListofNonNulls will be used in a related problem, for which I do not believe there is an elegant solution. I must join two tables where MappingTable (above) is the right table in a LeftOuter join that is used to map Transactions (in the left table) to various Accounts (in the right table). For some Accounts, certain property values are irrelevant and do not need to be part of the matching process for each Transaction; the property values exist in the Transactions and there could be one of dozens or hundreds of possible values, but their values are not relevant to the matching except in a few cases.

    I have read about grain matching which would require blowing out the Mapping Table to include every possible value for every field, but this is not practical as there are 20 fields and some have hundreds of possible values.

    I don’t believe there is a way to perform the join in one step. I think I have to perform multiple joins, starting with the least number of properties and then merging the results with that of subsequent joints which include more and more properties. I would have a separate join for each unique list of properties in the ListofNonNulls above.

    Reply
    1. Ben Gribaudo Post author

      Here’s my challenge. I am trying to create a column (ListofNonNulls) that contains a list of the fields (for each row) which have non-null values .

      Is something like this what you’re looking for?

      let
          Source = #table(
              {"Account", "Prop1", "Prop2", "Prop3"}, 
              {
                  {1, "x", "y", "z" },
                  {2, "x", null, "c" },
                  {3, "a", null, null}
              }
          ),
          PropertyFields = {"Prop1","Prop2", "Prop3"},
          ConcatenatedPropertyNameString = (Row, FieldsToCheck) => 
              let 
                  FieldsWithValues = Table.SelectRows(Record.ToTable(Record.SelectFields(Row, FieldsToCheck)), each [Value] <> null)[Name],
                  Result = Text.Combine(FieldsWithValues, ", ")
              in
                  Result,
          #"Added ListofNonNulls" = Table.AddColumn(Source, "ListofNonNulls", each ConcatenatedPropertyNameString(_, PropertyFields))
      in
          #"Added ListofNonNulls"
      

      In regards to the related problem, approximately how many accounts are there (dozens, hundreds, thousands, etc.)?

      Reply
  9. Billy

    Why, oh why, did they use the [] brackets for representing the bounds of a record and distinguising the name of a record field. The [……][Name] syntax is confusing.

    Reply
  10. Jean

    Hi, thanks for the post, very helpful. I found it looking for some explanation to the notation :

    Excel.CurrentWorkbook(){[Name="name"]}[Content]

    it’s a type of projection but I’d like to know its limits and possibilities. Do you have a related post or know the right reference or even, care to explain ?

    Best regards,

    Jean

    Reply
  11. Joe

    Hi dear ben, very nice posts on M language which seem to be found no where else in the internet space.
    Can’t wait (literally) to see your next posts on Tables and what’s more.

    Please come back from star trekking and continue to fee us with civilization again.

    A fan from China.

    Reply
  12. Chuck

    Hi Ben,

    First time visiting your website and was extremely lucky to fall into your website as I was wondering if somewhere on the Net one could find a tutorial regarding the M Query Language.
    I really enjoyed going through it step by step and applying in parallel into Power Bi Desktop the different samples you have integrated into.

    Very informative, great work! Thank you again for sharing your knowledge with other.

    Cheers!

    Reply
  13. Walter Pelowski

    Do you know of a way to specify datatypes in the bracketed record syntax? When I add a list of records as a new table column and then expand the list via Table.ExpandListColumn and records via Table.ExpandRecordColumn no types for the new columns exist. However, I can use a Table.CombineColumnsToRecord function to combine several typed columns together in a single record then subsequently expand the record without losing any data types. Is the typing being maintained at the table level at that point or is there a way to specify types within a record as well? I’m trying to have as few duplications of definitions as possible and being able to type the columns in the record itself seems like a better solution but I can’t seem to find if it’s possible.

    Reply
    1. Ben Gribaudo Post author

      Hi Walter,

      Eliminating duplication is a great idea! To do it in situations like this, you’ll need to modify the parent table’s type to include type details for its nested columns.

      When a column is expanded using Table.ExpandListColumn or Table.ExpandRecordColumn, the new columns that are output are typed based on details given in the parent table’s type. The types of the values in the column being expanded aren’t factored in (and this is a good thing, as it ensures that the new columns’ types are set consistently, even when no rows are present in the parent table).

      For example:

      let
          Source = #table(
              type table [A=[InnerA=text], B=table [InnerB=number]], {
                  {
                      [InnerA="abc"],
                      #table({"InnerB"}, {{1}})
                  }
              }),
          #"Expanded A" = Table.ExpandRecordColumn(Source, "A", {"InnerA"}, {"A.InnerA"}), // new column will be typed "text"
          #"Expanded B" = Table.ExpandTableColumn(#"Expanded A", "B", {"InnerB"}, {"B.InnerB"}) // new column will be typed "number"
      in
          #"Expanded B"
      
      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *