At long last, tables are up!
If you’re like most users, tables are the main reason you use Power Query. You ask Power Query to pull data from one or more sources, mash it up and then return the results in a table which is then handed off to the host application (Microsoft Excel, Power BI, SSAS, SSIS, etc.).
As you might expect, with tables being of such fundamental importance, there’s plenty we could explore. A whole series could be written about the standard library’s table functions. While that would be fun to do, the focus of this series is the M language, not the library, so things related to the language are what we’ll delve into here.
The previous post in this series concluded with the thought that tables are similar in behavior to lists and records but go beyond what we’d get if we tried to build our own simulated tables using those two types. Let’s start by looking at a way in which table and list are similar….
Series Index
- Introduction, Simple Expressions &
let
(part 1) - Functions: Defining (part 2)
- Functions: Function Values, Passing, Returning, Defining Inline, Recursion (part 3)
- Variables & Identifiers (part 4)
- Paradigm (part 5)
- Types—Intro & Text (Strings) (part 6)
- Types—Numbers (part 7)
- Types—The Temporal Family (part 8)
- Types—Logical, Null, Binary (part 9)
- Types—List, Record (part 10)
- Tables—Syntax (part 11) [this post]
- Tables—Table Think I (part 12)
- Tables—Table Think II (part 13)
- Control Structure (part 14)
- Error Handling (part 15)
- Type System I – Basics (part 16)
- Type System II – Facets (part 17)
- Type System III – Custom Types (part 18)
- Type System IV – Ascription, Conformance and Equality’s “Strange” Behaviors (part 19)
- Metadata (part 20)
- Identifier Scope & Sections (part 21)
- Identifier Scope II – Controlling the Global Environment, Closures (part 22)
- Query Folding I (part 23)
- Query Folding II (part 24)
- Extending the Global Environment (part 25)
- More to come!
Positional Selection
With type list
, remember how elements can be accessed by positional index using the selection operator (think: curly branches)? For example myList{2}
returns the third element from myList. (Reminder: M’s indexes are zero-based, so the first element is considered to be at index 0, the second at index 1, etc.)
Type table
also supports the selection operator—except with this type, the index identifies which row to return. The identified row is returned as a record, with each column represented as a field in the record. (Records are usually [always?] the way M returns single rows from tables.)
let
Parts = #table(
{ "Code", "Category", "PriceGroup" },
{
{ 123, "Widget", "A" },
{ 456, "Thingamajig", "B" },
{ 789, "Widget", "B" }
}
)
in
Parts{1} // returns the second row as a record: [Code = 456, Category = "Thingamajig", PriceGroup = "B"]
(Above, library function #table is used to create a table. In this example, the first argument is a list of column names. The second is a list of lists which defines row values: each item in the outer list corresponds to a row; each item in the inner list corresponds with a column value for that row.)
If the requested index does not exist, the behavior is the same as with list: an error is returned but this can be changed to null by making the selection optional (a.k.a. by appending a ?
after the curly braces).
Parts{3} // Expression.Error: There weren't enough elements in the enumeration to complete the operation.
Parts{3}? // null
The standard library contains numerous functions that work with table rows based on position. Examples include: Table.SingleRow, Table.First, Table.Last, Table.Skip, Table.FirstN, Table.LastN, Table.Range, Table.PositionOf, Table.ReverseRows.
Value-Based Selection
With tables, the selection operator can also be used to select a single row based on field value(s). To do this, inside the curly braces, pass a record defining the search criteria to use. Each field name should correspond to a table column to search and each field value should correspond with the value to search for in that particular column. It’s unnecessary for the record to contain fields for every column in the table; only those you want searched need to be included.
Parts{[Code = 123]} // returns [Code = 123, Category = "Widget", PriceGroup = "A"]
Parts{[Category = "Thingamajig"]} // [Code = 456, Category = "Thingamajig", PriceGroup = "B"]
Parts{[Category = "Widget", PriceGroup = "B"]} // [Code = 789, Category = "Widget", PriceGroup = "B"]
The selection operator returns, at most, one row, represented as a record. If the search criteria match more than a single row, an error is returned. Using an optional selection (by appending the question-mark, like {[…]}?
) does not suppress the error. Optional selection only affects behavior when no match is found, not when multiple matches are found.
Parts{[Category = "Other"]} // returns Expression.Error: The key didn't match any rows in the table.
Parts{[Category = "Other" ]}? // returns null
Parts{[Category = "Widget"]} // Expression.Error: The key matched more than one row in the table.
Parts{[Category = "Widget"]}? // same error as above (using ? doesn't suppress it)
Searching is performed using the criteria record’s values. If you’d rather dynamically decide which row to return, you’ll need to use an expression—which is possible using the standard library (hint: try something along the lines of Table.SingleRow(Table.SelectRows(Parts, each ...some expression...))
).
Column Access (a.k.a. Field Selection)
Like record
, type table
also supports field selection. With a record, the field name of interest goes between the square brackets (like: myRecord[First Name]
) and that field’s value is returned. When field selection is performed on a table, the column name of interest goes between the square brackets (like: myTable[Some Column]
)…but since a column can contain a whole list of values, what’s returned? Yup, you guessed it: a list!
The list that’s returned contains the value from each row for the specified column, in the order those values appear in that column (so the first item in the list will correspond with the value of the column from the first row, the second item with the value from the second row, etc.).
Parts[Category] // returns { "Widget", "Thingamajig", "Widget" }
If you would, pause for a moment and consider the ramifications of this behavior. By returning a list, M allows you to work with the set of values in a column using the same functions you’d use on any other list. M’s standard library doesn’t need to provide a separate set of table functions for single-column operations (like: Table.DistinctValuesInColumn(Parts, "PriceGroup")
) because it can work with a single column as a list (like List.Distinct(Parts[PriceGroup]))
). A column is a list—at least as far as list functions are concerned! What beautiful simplicity!
Projection
Perhaps you have a table where you only want to work with (or output) some of the columns. Like record
, table
supports projection. In the case of tables, projection produces a new table that contains a reduced set of columns. The projection operator shares square bracket syntax with field/column selection.
Parts[[Code], [Category]] // returns a two-column table consisting of Code & Category
If a specified column does not exist, an error is returned; however, if the projection is made optional (by appending a question-mark to the square brackets), the non-existent column(s) will be included in the new table with values set to null.
Parts[[Code], [Price]] // Expression.Error: The column 'Price' of the table wasn't found.
Parts[[Code], [Price]]? // returns a two-column table, where Price's values are all null (since that column does not exist in the source)
Column names must be hard-coded. If you’d rather provide the column names as a list, library function Table.SelectColumns is your friend.
Comparison
If two tables contain the same number of columns, with the same names, and the same number of rows, where the same-name columns from each table have the same values in each position (e.g. Col1, index 0 in table 1 equals Col1, index 0 in table 2), the equality operator (=
) considers the tables equal. Otherwise (obviously!), they’re not equal (<>
).
#table({"Col1"}, {{1}, {2}}) = #table({"Col1"}, {{1}, {2}}) // true
#table({"Col1"}, {{1}, {2}}) <> #table({"Col1"}, {{1}, {2}}) // false
#table({"Col1", "Col2"}, {{"A", "B"}, {1,2}, {3,4}}) = #table({"Col2", "Col1"}, {{"B", "A"}, {2,1}, {4,3}}) // true--even though columns are in different positions, they contain the same values at the same positions, so the tables are equivalent
#table({"Col1"}, {{1}, {2}}) = #table({"Col1"}, {{2}, {1}}) // false--even though the tables contain the same rows, they are ordered differently, so the tables are not equivalent
Column schema and metadata do not have to match in order for tables to be considered equal.
let
Table1 = Table.TransformColumnTypes(#table({"Col1"}, {{1}}),{{"Col1", type any}}),
Table2 = Table.TransformColumnTypes(#table({"Col1"}, {{1}}),{{"Col1", type number}})
in
Table1 = Table2 // returns true even though the column types differ (any vs number)
You could think of it this way: M considers two tables to be equal if they contain the same data. Things like column order, schema details and metadata are accessory items, not data proper, so are not included in the equality comparison. (For example, you can get the exact same data out of a table regardless of how its columns are positioned, so column order isn’t factored in to the “are these tables equal?” decision.)
Combining Tables
Two tables can be combined using the combination (&
) operator.
#table({"City", "State"}, {{"Chicago", "IL"}}) & #table({"City", "State"}, {{"Washington", "DC"}})
Columns are paired by name, not position. If a column exists in only one of the tables, it’s values will be set to null for rows from the other table.
#table({"City"}, {{"Chicago"}}) & #table({"State"}, {{"DC"}})
In the table that’s returned, all columns from the first table are outputted first, positioned based on the order in which they appear in that table (e.g. first column from that table is outputted first, second comes second, etc.). Then, any columns that are only in the second table are outputted, positioned based on where they appear in that table (e.g. first column exclusive to that table is outputted, followed by the second exclusive column, etc.).
#table({"City", "State"}, {{"Chicago", "IL"}}) & #table({"State", "City"}, {{"Washington", "DC"}})
M does not require that column data types be compatible. The resulting table from the below contains a string value and a numeric value.
#table({"Age"}, {{18}}) & #table({"Age"}, {{"21"}})
These last three behaviors are a bit different from SQL’s UNION ALL
operator. With SQL, combining two tables requires that the tables contain the same number of columns, columns are combined based on position (not name) and columns being combined must have compatible data types. With Power Query, tables with a dissimilar number of columns may be concatenated together, columns are paired by name (not position) and column data types are irrelevant for combination purposes.
Next Time
At the language syntax level, that’s it—there you have tables. As was already mentioned, the standard library contains many functions around tables—but since that’s outside the context of this series on the Power Query language, we’re done as far as tables go, right?
Not so fast! There’s syntax, which we just covered, and then there’s the environment in which that syntax is executed. In essence, how Power Query processes tables. Streaming, query folding, buffering, keys, native query caching, the data protection layer/firewall…all of these can have a major impact on the performance and/or behavior of how M processes your tables.
Next time, let’s delve deeper into streaming and query folding (which should be familiar from earlier in this series) then explore buffering. The plan is for the post after next to cover keys, native query caching and the data protection layer/firewall.
Until then, happy coding!
Thanks, Ben. Great posts! Just have read the whole series from part 1 thru 11 again in these two days. Look forward to the next one!
Thanks for the great series Ben. Just read all 11 articles.
Hello,
very good series of articles. Although I know Power Query well, I still learnt some useful tricks. However, I didn’t see anything about “
try ... otherwise ...
” nor about error handling. Do you plan to make an article about this topic ?Thank you for your kind words, fred! Good question! Yes, writing about exception/error handling is in the plans.
I am so glad to read this chapter, please continue your serie, it is the best ever about Power Query
Incredible work!, I really hope you continue with these posts!
I also agree with the previous comments about including some posts about error handling and parameters.
I have learned from the ground up with you. Keep up the good work!
Thank you for your kind words! It’s in the plans to include error handling in this series. Technically, parameters are an environment feature, not a language feature, so I’m not sure if they’ll make it in the series yet. If not, I’ll keep in mind the idea of doing a separate post on them.