Power Query M Primer (Part 18): Type System III – Custom Types

, , , ,

After gaining a solid grasp on type system basics and exploring type facets, it’s now time to learn about complex types (also known as custom types or derived types). Thankfully, constructing these types isn’t very complex. Don’t worry, though, we’ll make up for that simplicity when we talk about how M handles them!

Let’s dive right in. To keep things simple, the focus for this post is syntax and conformance rules. We’ll save most of the discussion about how M works with these types for the next post in this series.

“Conformance—that sounds complicated!” you might be thinking. Actually, no. Just the name makes it sound that way. Saying that a value conforms to a type means that the value can be described by the given type, or to put it in other words, the value is compatible with the type. So, the numeric value 1 conforms to types number, nullable number, anynonnull and any because each of those types can be used to describe that value. Conformance rules, simply put, are the rules used to determine whether a value conforms to—is described by—a type.

Now, on to the custom types!

Series Index

List

type list can also be written type { any }. The curly braces indicate that a list type is being defined. In between those braces is the item type of the list (which, in this case is type any).

The item type describes the values that make up the list. All the values in a list should conform to the list’s item type. For example, a list of type { number} should only contain values of type number. No nulls. No text. No nothing else. (Interestingly, M doesn’t enforce this rule—but more on this next time.)

By default, when a list is created, it’s created having type list, which is the equivalent of type { any }. A list’s item type can be change to something more specific by ascribing a custom list type to the list.

First define a custom list type by specifying an item type that’s something other than any:

type { number } // describes a list of numbers
type { text} // describes a list of strings
type { { function } } // describes a list containing a list of functions

Then ascribe the new type to the list:

let
  MyList = { 1, 2, 3 } // automatically of type list, which is the same as type { any }
in
  Value.ReplaceType(MyList, type { number }) // returns a list with an item type of number

As you may have noticed, in Query Editor when a type is displayed on screen, only the name of its base type is shown. For example, the result of type { number } will be displayed as simply “type list” even though it really is a descendant of type list. For better or worse, this is how the UI renders custom types.

If you’d like to read out the “custom” details from a custom type, standard library functions are available. In the case of a list type, you can programmatically read its item type by using standard library function Type.ListItem.

let
  SomeType = type { date }
in
  Type.ListItem(SomeType) // returns type date

Record

A custom record type is defined using the following syntax, where the record’s field names and (optionally) types are specified between square brackets. If a field is not assigned a type, its type defaults to any.

type [ItemCode = text, Amount]

The above defines a record type that describes a record having two fields, ItemCode and Amount, holding values that it claims conform to type text (because it is explicitly stated) and type any (by default), respectively.

All records have a custom type—either one assigned by the system or one defined by you. When a new record is created, the system automatically gives it a custom type, which lists the fields in the record with each field’s type set to any. For example, the record [Name = "Joe", Age = 50] is automatically assigned a type of type [Name = any, Age = any].

Just like type list, you can change this by defining a new custom type then ascribing it to the record:

let
  Person = [Name = "Joe", Age = 50], // intrinsically of type [Name = any, Age = any]
  NewType = type [FullName = text, Age = number]
in
  Value.ReplaceType(Person, NewType) // returns [Name = "Joe", Age = 50], which is a record of type [Name = text, Age = number]

The new type’s field information is applied to the record’s fields based on their positions. That is, the definition of the first field in the new type replaces the current type definition for the record’s first field, the second field in the type replaces the type for the record’s second field, and so forth. (In contrast, when checking whether a record value conforms to a record type, fields should be matched by name, with position ignored.)

The mashup engine does not check field names during ascription. You could ascribe a record whose first field is named FieldA with a type whose first field is Field1 and the engine won’t complain. However, please don’t do this! It is important for field names to stay the same when ascribing a new type; otherwise, unexpected behavior may result—as other pieces of internal engine code may expect the field names on the type and those on the value to be in sync.

Similar to type list, when ascribing a custom record type, the mashup engine does not check field types for compatibility. You could ascribe the record produced by the previous example with type [A = logical, B = logical] and M would happily apply the new type, even though field values “Joe” and 50 are not of type logical. This behavior may not seem logical—and we’ll discuss that point next time. For now, just be aware of it.

The syntax for custom record types offers a couple other options, both of which create abstract types:

type [FirstName = text, optional LastName = text, Age = number]
type [Amount = number, ...]

The first type in the preceding has an optional field. Just like the name suggests, an optional field may be optionally present.

The second type has “...” at the end of its field list. In this context, ... is known as an open record marker. When present, it makes the type an open record type. An open record type allows any number of additional fields (including zero additional fields) beyond those explicitly specified.

Why abstract types?

Remember that no value can directly be an abstract type. Since record types with optional fields and those that are open are both abstract, you’ll never see a record value whose type has either of these. If a value can’t have these in its type, what’s the point of defining them?

Describing expectations or, to put it in other words, classifying values.

You might have a function which expects as its argument a record containing a numeric Amount field. If you defined the function argument as type [Amount = number], you’d be saying that the function should only be passed records containing a single field, named Amount, of type number. In contrast, if you change the type to type [Amount = number, ...], you’d be saying that the records passed in should contain a numeric Amount field but may also optionally contain any number of other fields. The type one states exactly what is expected (no more and no less) while the other defines the minimum requirement (at least this much but more is allowed). Which approach is appropriate depends on your situation.

types record & []

type record itself is the equivalent of type [...], which is an abstract type describing records having any number of fields, including no fields. This means that all records are compatible with type record even though, since it is abstract, no record value can ever be directly of type record.

Oh, before we leave record types, yes, it is possible to define a record type that describes an empty record—a record containing zero fields:

type []

You probably won’t often (maybe never!) find yourself defining this type or manually creating empty records but if you programmatically remove fields from a record you could end up with an empty record, which would be of this type—so there is reason for it to exist.

Table

Based on what we’ve already covered, you can probably make sense out of the following syntax for defining a custom table type:

type table [SomeColumn = text, AnotherColumn = any, YetOneMore]

The square-bracketed part defines the table’s row type. In essence, the row type is a record type with two special rules imposed: Row types cannot have optional fields or be open, with the exception of one built-in type (more on this in a moment)

You could argue that the ability to do something like type table [Amount = number, ...] would be useful because it would let you say “I expect a table containing column Amount, of type number, but other columns are okay,” but that’s not supported. In addition to allowing a table to contain a varying number of columns, allowing open row types would allow ragged tables: tables where each row in the table can vary in column count (so row 1 might have 3 extra columns, row 2 no extra columns, row 3 maybe 1 extra column and so forth). M doesn’t support ragged tables.

Similar to record, the native type of a table contains all its column names with their types set to any.

#table({"EmployeeID", "Type", "Wage"}, {...})

Produces a table having the following type:

type table [EmployeeID = any, Type = any, Wage = any]

The base type of all tables is type table, which is abstract. Its row type is the empty open record (the only time a table type can have an open row type). So, type table is compatible with tables whose rows have any number of columns (including zero columns), which makes it compatible with all tables.

Table types include something no other type carries: the definition of the table’s keys. For example, the following table type includes a primary key on column OrderID:

let
  BasicTableType = type table [OrderID = number, Total = number, Shipped = logical]
in
  Type.AddTableKey(BasicTableType, {"OrderID"}, true)

The fact that a table’s type contains keys means that two table types with the same columns and column types are still different types if the keys defined on them differ.

The standard library also contains functions that allow you to add and replace keys on table values, in contrast to the above example which worked with a table type.

let
  SomeTable = #table({"Col1", "Col2"}, {{"a", "b"}})
in
  Table.AddKey(SomeTable, {"Col1"}, true)

You could think of the functions that work on a table value’s keys as shortcuts which take the table value’s type, add or replace keys on it, ascribe the revised type onto the value then return the revised table. The above example is more succinct than coding up all those steps manually—which would look something like:

let
  SomeTable = #table({"Col1", "Col2"}, {{"a", "b"}}),
  StartingType = Value.Type(SomeTable),
  UpdatedType = Type.AddTableKey(StartingType, {"Col1"}, true)
in
  Value.ReplaceType(SomeTable, UpdatedType)

Similar to type record, the table row type’s column details are applied positionally to the table, so the type details for the first column defined in the new type will be applied to the first column in the table, the type’s second column definition to the second column in the table, and so forth.

Like record, it is important for column names to stay the same across the ascription even though the mashup engine does not enforce this; otherwise unexpected behavior may result. Please don’t try to use ascription to rename table columns!

Below is an example of the danger of not following this guideline. M sees the “renamed” column sometimes as having its old name and sometimes with its new name—and when and where these inconsistencies appear could change across mashup engine versions.

let
  MyTable = #table({"Col1"}, {{"Joe"}}),
  TypeChanged = Value.ReplaceType(MyTable, type table [Name = any]) // returns a table which appears to have a column named Name
in
  Table.SelectRows(TypeChanged, each [Name] = "Joe") // uses column's new name -- errors: "Expression.Error: The field 'Name' of the record wasn't found."
  // Table.SelectRows(TypeChanged, each [Col1] = "Joe") // uses column's old name - works

  // TypeChanged[Name] // uses column's new name -- works
  // TypeChanged[Col1] // uses column's old name -- errors: "Expression.Error: The column 'Col1' of the table wasn't found."

Like list and record, the mashup engine does not check whether the new table type’s column types are compatible with the values actually in the table. Strange, I know. More on this next time.

Function

For defining custom function types, syntax is in the form of:

type function(name as text, age as number) as record
type function(amount as number, optional tax as number) as number

Important: When defining functions, parameter and return value type assertions are optional; however, when defining function types, both must be specified:

// valid function definition -- no argument or return assertions are specified
(total) => total * 0.1
 
// invalid function type definition -- argument and return assertions must be defined
type function(total)
 
// valid function type definition -- all assertions specified
type function(total as number) as number

When a function is defined, the custom type created automatically for it defaults any omitted type assertions to any. For example, the native type of the following is type function (total as any) as any:

(total) => total * 0.1

When ascribing a new type onto a function, the number of parameters defined must match (obviously). Also, the new type cannot change required parameters to optional or vice versa.

You might think that updating an optional parameter to make it required would be allowed because this change seems safe, but it isn’t. Ascribing a new type to a function changes information about it but does not change its behavior—and changing an argument’s optional/required status is a behavior change.

Similar to record and table, the new type’s parameters are matched to the function’s parameters based on position. Even though the mashup engine does not validate that their names stay the same, do not use ascription to rename arguments—otherwise, unexpected behavior could result.

As type ascription can only change information about a function, not the function’s behavior at the language level, ascribing a type onto a function that specifies different argument or return assertions has no effect on the behavior of the function. The mashup engine always uses the type assertions specified when the function was originally defined, even if another type is later ascribed onto it.

Below defines SomeFunction as expecting a single argument that is compatible with type text. Then, a new type is ascribed onto the function, specifying that the argument’s type is number. This new type affects the information reported about the function (e.g. documentation will report the argument’s type as number) but does not change its behavior. The function was defined as expecting text and still expects text despite what the new type claims. If non-text value is passed, the function will reject it as invalid.

let
  SomeFunction = (input as text) => "hi",
  NewType = type function (input as number) as any,
  Ascribed = Value.ReplaceType(SomeFunction, NewType)
in
  Ascribed(1) // error

Syntax

Above, we assembled various type values by defining their component types (item types, row types, field types and argument/return types) inline using literal type names. When building type values, expressions may also be used when specifying component types. This makes it possible to compose complex types using variables and even the output of functions.

Let’s start with a simple example. Below, the column in the first table type’s row type is defined using the literal type number. In contrast, the second table type defines the column type using the expression Currency.Type (which, as we learned in part 17, when evaluated, returns a type value of type number faceted with Currency.Type). This expression could just as easily be a function call instead of an identifier reference.

type table [Amount = number]
type table [Amount = Currency.Type]

As already mentioned, variables can also be used when composing types:

let
  ColumnType = type number,
  RowType = type [SomeColumn = ColumnType],
  TableType = type table RowType,
  ListType = type { TableType }
in
  type function (list as ListType) as any

The above evaluates to a function type with a single argument of type list which has an item type of type table [SomeColumn = number]. In other words, this type describes a function that expects to be passed a list of tables (zero or more tables), each having a single column named SomeColumn filled with values of type number.

Type Context

When creating type values, the keyword type is what causes literal type names to have their special meaning. type switches M syntax interpretation into what’s known as the type context, which simply means that type keywords take on their special meaning. For example, by itself the expression any references a variable named “any”; while in the case of type any, the “type” shifts the expression into a context where “any” is interpreted as referring to the type named “any”.

Notice below how type shifts the entire expression into type context. The “type” at the beginning causes both “table” and “any” to be interpreted as types.

type table [Col1 = any]

While in type context, identifiers whose names are not the same as type names are still interpreted as normal identifier references.

let 
  ColumnType = type any
in
  type table [Col1 = ColumnType]

In some cases, you might find your code in a type context where you want to reference a variable with the same name as a type. Take type {record} as an example. Since “type” put the expression in a type context, “record” is interpreted as the type record. What if, instead, you’re trying to use “record” to reference a variable named record? You could rename your variable to something else, but that might be too easy. Alternatively, use parenthesis to switch the part of the expression they surround back to regular context, where “record” will be interpreted as a variable reference.

let
  record = type [A = text, B = logical]
in
  type { (record) } // defines a list whose item type is type [A = text, B = logical]
  // type { record } // defines a list whose item type is type record

A more complex example, switching into type context, then back out, then back in:

type { (type number) }

Of course, it would have been simpler to just write type { number } but at least now you know that nested switching in and out is possible.

Only for Type Values

The above syntax discussion about type context and using expressions while composing types only applies when coding up expressions that return type values you can save to a variable. When defining functions (not function types, but functions themselves), as well as when using the as and is operators, only literal nullable primitive types may be used. No “type” or parenthesis to shift in and out of type context. No expressions. No custom types.

1 as nullable number // valid syntax
1 as type nullable number // invalid syntax -- number should not be prefixed with type
 
SomeRecord is record // valid syntax
SomeRecord is [Name = text] // invalid syntax -- can't use a custom type here
 
(input as table) => ... // valid syntax
(input as [Col1 = any]) => ... // invalid syntax -- can't use a custom type here
(input as SomeType) => ... // invalid syntax -- can't use an expression for the type here

To make sure we’re clear on functions: The last two function declarations are syntactically invalid because they attempt to define a function with an argument assertion of a complex type and an expression (respectively). Neither of these are allowed in function definitions. They are both allowed in function types, which in turn can be ascribed onto an existing function. (Keep in mind, though, as mentioned previously, type ascription on functions only affects information, not behavior).

let
  Function = (input as table) => …,
  SomeType = type table [Col1 = any],
  NewType = type function (input as SomeType) as any
in
  Value.ReplaceType(Function, NewType)

Next Time

It is understandable if you are quite puzzled trying to make sense of out the purpose for custom types and why M doesn’t seem to handle them the way you might intuitively expect. I don’t like leaving things in an unresolved state, but we’ve covered so much ground here that we must let that discussion wait. Next time, hopefully, we can bring at least some clarity on the how and why. For now, focus on learning the syntax for custom types; then next time we can focus on how they behave without being distracted by trying to learn how to define them at the same time.

Until them, happy M coding!

Reference

TechNet forum discussion with Power Query team member about ascription and renaming.

Revision History

2020-08-27: Updated to indicate that ascription should not be used to rename record fields, table columns or function parameters.

4 thoughts on “Power Query M Primer (Part 18): Type System III – Custom Types

  1. Joe Edwards

    Hey Ben, awesome article as always. Is it possible to describe a type dynamically using a text value? To emulate type number, using the context switching you describe above, none of the following work:
    type (number)
    type ("number")
    let TextNumber = "number", NewType = type (TextNumber) in NewType
    Is this permitted by the M language and how would you achieve it?
    Thanks, Joe

    Reply
    1. Ben Gribaudo Post author

      Hi Joe,

      Thanks for your kind words! Switching out of type context allows variables to be referenced whose names otherwise would be interpreted as type literals; however, switching does not cause a value in a string to be treated as a type value.

      To pull that off, one option would be to use a record to translate between expected string values and type values:

      let 
        TextNumber = "number", 
        Transform = (typeName as text) as type =>
          let
            Mappings = [
              number = type number,
              text = type text,
              logical = type logical
            ]
          in
            Record.Field(Mappings, typeName),
        NewType = Transform(TextNumber)
      in
        NewType
      

      Does that help?

      Reply
      1. Joe Edwards

        Hi Ben,

        Great suggestion, since there are only a finite number of types it will be very easy to create a record that covers all options. It was mainly a general interest question, but a specific purpose I had in mind was that for my obscure source database (Progress OpenEdge) it allows for array data types (e.g. “1;a;2;b”) and I wanted to store the array element type conversion in a separate configuration database. Clearly I can’t return an M type value from a database query (other than creating a column of that type and extracting its type using Value.Type), so I wanted to be able to set the type of each array element from a string value from the configuration database.

        Your suggestion should definitely work, thanks!

        Joe

        Reply

Leave a Reply to Joe Edwards Cancel reply

Your email address will not be published. Required fields are marked *