Power Query M Primer (Part 11): Tables—Syntax

, , ,

At long last, tables are up!

If you’re like most users, tables are the main reason you use Power Query. You ask Power Query to pull data from one or more sources, mash it up and then return the results in a table which is then handed off to the host application (Microsoft Excel, Power BI, SSAS, SSIS, etc.).

As you might expect, with tables being of such fundamental importance, there’s plenty we could explore. A whole series could be written about the standard library’s table functions. While that would be fun to do, the focus of this series is the M language, not the library, so things related to the language are what we’ll delve into here.

The previous post in this series concluded with the thought that tables are similar in behavior to lists and records but go beyond what we’d get if we tried to build our own simulated tables using those two types. Let’s start by looking at a way in which table and list are similar….

Series Index

Positional Selection

With type list, remember how elements can be accessed by positional index using the selection operator (think: curly branches)? For example myList{2} returns the third element from myList. (Reminder: M’s indexes are zero-based, so the first element is considered to be at index 0, the second at index 1, etc.)

Type table also supports the selection operator—except with this type, the index identifies which row to return. The identified row is returned as a record, with each column represented as a field in the record. (Records are usually [always?] the way M returns single rows from tables.)

let
  Parts = #table(
    { "Code", "Category", "PriceGroup" },
    {
      { 123, "Widget", "A" },
      { 456, "Thingamajig", "B" },
      { 789, "Widget", "B" }
    }
  )
in
  Parts{1} // returns the second row as a record: [Code = 456, Category = "Thingamajig", PriceGroup = "B"]

(Above, library function #table is used to create a table. In this example, the first argument is a list of column names. The second is a list of lists which defines row values: each item in the outer list corresponds to a row; each item in the inner list corresponds with a column value for that row.)

If the requested index does not exist, the behavior is the same as with list: an error is returned but this can be changed to null by making the selection optional (a.k.a. by appending a ? after the curly braces).

Parts{3} // Expression.Error: There weren't enough elements in the enumeration to complete the operation.
Parts{3}? // null 

The standard library contains numerous functions that work with table rows based on position. Examples include: Table.SingleRow, Table.First, Table.Last, Table.Skip, Table.FirstN, Table.LastN, Table.Range, Table.PositionOf, Table.ReverseRows.

Value-Based Selection

With tables, the selection operator can also be used to select a single row based on field value(s). To do this, inside the curly braces, pass a record defining the search criteria to use. Each field name should correspond to a table column to search and each field value should correspond with the value to search for in that particular column. It’s unnecessary for the record to contain fields for every column in the table; only those you want searched need to be included.

Parts{[Code = 123]} // returns [Code = 123, Category = "Widget", PriceGroup = "A"]
Parts{[Category = "Thingamajig"]} // [Code = 456, Category = "Thingamajig", PriceGroup = "B"]
Parts{[Category = "Widget", PriceGroup = "B"]} // [Code = 789, Category = "Widget", PriceGroup = "B"]

The selection operator returns, at most, one row, represented as a record. If the search criteria match more than a single row, an error is returned. Using an optional selection (by appending the question-mark, like {[…]}?) does not suppress the error. Optional selection only affects behavior when no match is found, not when multiple matches are found.

Parts{[Category = "Other"]} // returns Expression.Error: The key didn't match any rows in the table.
Parts{[Category = "Other" ]}? // returns null
	
Parts{[Category = "Widget"]} // Expression.Error: The key matched more than one row in the table.
Parts{[Category = "Widget"]}? // same error as above (using ? doesn't suppress it)

Searching is performed using the criteria record’s values. If you’d rather dynamically decide which row to return, you’ll need to use an expression—which is possible using the standard library (hint: try something along the lines of Table.SingleRow(Table.SelectRows(Parts, each ...some expression...))).

Column Access (a.k.a. Field Selection)

Like record, type table also supports field selection. With a record, the field name of interest goes between the square brackets (like: myRecord[First Name]) and that field’s value is returned. When field selection is performed on a table, the column name of interest goes between the square brackets (like: myTable[Some Column])…but since a column can contain a whole list of values, what’s returned? Yup, you guessed it: a list!

The list that’s returned contains the value from each row for the specified column, in the order those values appear in that column (so the first item in the list will correspond with the value of the column from the first row, the second item with the value from the second row, etc.).

Parts[Category] // returns { "Widget", "Thingamajig", "Widget" }

If you would, pause for a moment and consider the ramifications of this behavior. By returning a list, M allows you to work with the set of values in a column using the same functions you’d use on any other list. M’s standard library doesn’t need to provide a separate set of table functions for single-column operations (like: Table.DistinctValuesInColumn(Parts, "PriceGroup")) because it can work with a single column as a list (like List.Distinct(Parts[PriceGroup]))). A column is a list—at least as far as list functions are concerned! What beautiful simplicity!

Projection

Perhaps you have a table where you only want to work with (or output) some of the columns. Like record, table supports projection. In the case of tables, projection produces a new table that contains a reduced set of columns. The projection operator shares square bracket syntax with field/column selection.

Parts[[Code], [Category]] // returns a two-column table consisting of Code & Category

If a specified column does not exist, an error is returned; however, if the projection is made optional (by appending a question-mark to the square brackets), the non-existent column(s) will be included in the new table with values set to null.

Parts[[Code], [Price]] // Expression.Error: The column 'Price' of the table wasn't found.
Parts[[Code], [Price]]? // returns a two-column table, where Price's values are all null (since that column does not exist in the source)

Column names must be hard-coded. If you’d rather provide the column names as a list, library function Table.SelectColumns is your friend.

Comparison

If two tables contain the same number of columns, with the same names, and the same number of rows, where the same-name columns from each table have the same values in each position (e.g. Col1, index 0 in table 1 equals Col1, index 0 in table 2), the equality operator (=) considers the tables equal. Otherwise (obviously!), they’re not equal (<>).

#table({"Col1"}, {{1}, {2}}) = #table({"Col1"}, {{1}, {2}}) // true
#table({"Col1"}, {{1}, {2}}) <> #table({"Col1"}, {{1}, {2}}) // false

#table({"Col1", "Col2"}, {{"A", "B"}, {1,2}, {3,4}}) = #table({"Col2", "Col1"}, {{"B", "A"}, {2,1}, {4,3}}) // true--even though columns are in different positions, they contain the same values at the same positions, so the tables are equivalent
	
#table({"Col1"}, {{1}, {2}}) = #table({"Col1"}, {{2}, {1}}) // false--even though the tables contain the same rows, they are ordered differently, so the tables are not equivalent

Column schema and metadata do not have to match in order for tables to be considered equal.

let
  Table1 = Table.TransformColumnTypes(#table({"Col1"}, {{1}}),{{"Col1", type any}}),
  Table2 = Table.TransformColumnTypes(#table({"Col1"}, {{1}}),{{"Col1", type number}})
in
   Table1 = Table2 // returns true even though the column types differ (any vs number)

You could think of it this way: M considers two tables to be equal if they contain the same data. Things like column order, schema details and metadata are accessory items, not data proper, so are not included in the equality comparison. (For example, you can get the exact same data out of a table regardless of how its columns are positioned, so column order isn’t factored in to the “are these tables equal?” decision.)

Combining Tables

Two tables can be combined using the combination (&) operator.

#table({"City", "State"}, {{"Chicago", "IL"}}) &amp; #table({"City", "State"}, {{"Washington", "DC"}})

Columns are paired by name, not position. If a column exists in only one of the tables, it’s values will be set to null for rows from the other table.

#table({"City"}, {{"Chicago"}}) &amp; #table({"State"}, {{"DC"}})

In the table that’s returned, all columns from the first table are outputted first, positioned based on the order in which they appear in that table (e.g. first column from that table is outputted first, second comes second, etc.). Then, any columns that are only in the second table are outputted, positioned based on where they appear in that table (e.g. first column exclusive to that table is outputted, followed by the second exclusive column, etc.).

#table({"City", "State"}, {{"Chicago", "IL"}}) &amp; #table({"State", "City"}, {{"Washington", "DC"}})

M does not require that column data types be compatible. The resulting table from the below contains a string value and a numeric value.

#table({"Age"}, {{18}}) &amp; #table({"Age"}, {{"21"}})

These last three behaviors are a bit different from SQL’s UNION ALL operator. With SQL, combining two tables requires that the tables contain the same number of columns, columns are combined based on position (not name) and columns being combined must have compatible data types. With Power Query, tables with a dissimilar number of columns may be concatenated together, columns are paired by name (not position) and column data types are irrelevant for combination purposes.

Next Time

At the language syntax level, that’s it—there you have tables. As was already mentioned, the standard library contains many functions around tables—but since that’s outside the context of this series on the Power Query language, we’re done as far as tables go, right?

Not so fast! There’s syntax, which we just covered, and then there’s the environment in which that syntax is executed. In essence, how Power Query processes tables. Streaming, query folding, buffering, firewalling…all of these can have a major impact on the performance and/or behavior of how M processes your tables.

We’ve already talked about streaming and query folding (back in Paradigm [part 5]). Maybe next time, we should discuss the other two areas: buffers and firewalls.

Until then, happy coding!

4 thoughts on “Power Query M Primer (Part 11): Tables—Syntax

  1. Joe

    Thanks, Ben. Great posts! Just have read the whole series from part 1 thru 11 again in these two days. Look forward to the next one!

    Reply
  2. fred

    Hello,
    very good series of articles. Although I know Power Query well, I still learnt some useful tricks. However, I didn’t see anything about “try ... otherwise ... ” nor about error handling. Do you plan to make an article about this topic ?

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *