Power Query M Primer (Part 18): Type System III – Custom Types

, , , ,

After gaining a solid grasp on type system basics and exploring type facets, it’s now time to learn about complex types (also known as custom types or derived types). Thankfully, constructing these types isn’t very complex. Don’t worry, though, we’ll make up for that simplicity when we talk about how M handles them!

Let’s dive right in. To keep things simple, the focus for this post is syntax and conformance rules. We’ll save most of the discussion about how M works with these types for the next post in this series.

“Conformance—that sounds complicated!” you might be thinking. Actually, no. Just the name makes it sound that way. Saying that a value conforms to a type means that the value can be described by the given type, or to put it in other words, the value is compatible with the type. So, the numeric value 1 conforms to types number, nullable number, anynonnull and any because each of those types can be used to describe that value. Conformance rules, simply put, are the rules used to determine whether a value conforms to—is described by—a type.

Now, on to the custom types!

Series Index

List

type list can also be written type { any }. The curly braces indicate that a list type is being defined. In between those braces is the item type of the list (which, in this case is type any).

The item type describes the values that make up the list. All the values in a list should conform to the list’s item type. For example, a list of type { number} should only contain values of type number. No nulls. No text. No nothing else. (Interestingly, M doesn’t enforce this rule—but more on this next time.)

By default, when a list is created, it’s created having type list, which is the equivalent of type { any }. A list’s item type can be change to something more specific by ascribing a custom list type to the list.

First define a custom list type by specifying an item type that’s something other than any:

type { number } // describes a list of numbers
type { text} // describes a list of strings
type { { function } } // describes a list containing a list of functions

Then ascribe the new type to the list:

let
  MyList = { 1, 2, 3 } // automatically of type list, which is the same as type { any }
in
  Value.ReplaceType(MyList, type { number }) // returns a list with an item type of number

As you may have noticed, in Query Editor when a type is displayed on screen, only the name of its base type is shown. For example, the result of type { number } will be displayed as simply “type list” even though it really is a descendant of type list. For better or worse, this is how the UI renders custom types.

If you’d like to read out the “custom” details from a custom type, standard library functions are available. In the case of a list type, you can programmatically read its item type by using standard library function Type.ListItem.

let
  SomeType = type { date }
in
  Type.ListItem(SomeType) // returns type date

Record

A custom record type is defined using the following syntax, where the record’s field names and (optionally) types are specified between square brackets. If a field is not assigned a type, its type defaults to any.

type [ItemCode = text, Amount]

The above defines a record type that describes a record having two fields, ItemCode and Amount, holding values that it claims conform to type text (because it is explicitly stated) and type any (by default), respectively.

All records have a custom type—either one assigned by the system or one defined by you. When a new record is created, the system automatically gives it a custom type, which lists the fields in the record with each field’s type set to any. For example, the record [Name = "Joe", Age = 50] is automatically assigned a type of type [Name = any, Age = any].

Just like type list, you can change this by defining a new custom type then ascribing it to the record:

let
  Person = [Name = "Joe", Age = 50], // intrinsically of type [Name = any, Age = any]
  NewType = type [FullName = text, Age = number]
in
  Value.ReplaceType(Person, NewType) // returns [FullName = "Joe", Age = 50], which is a record of type [FullName = text, Age = number]

The new type’s field information is applied to the existing record’s fields based on field position, not field name. That is, the definition of the first field from the new type will replace the definition (field name and type information) of the record’s first field, the new type’s second field definition will become the definition of the record’s second field, and so forth—regardless of whether field names match between the existing record and the new type. (In contrast to ascription, when checking whether a record value conforms to a given record type, fields should be matched by name, with position ignored.)

Technically, record type ascription can be used to rename a record’s fields. In fact, the above example demoed this: its ascription changed the record’s first field’s name from Name to FullName.

However, often it is preferable to instead perform record field renames using Record.RenameFields. This method applies the specified renames by name, regardless of the fields’ positions, so will continue to work as expected even if the input record’s field order is later changed.

Below, the specified rename is applied to Name, and will stay applied to Name even if Name‘s position is later reordered in MyRecord:

Record.RenameFields(MyRecord, {{"Name", "FirstName"}}) // returns a record exactly the same as the input, except with field Name -> FirstName

Similar to type list, when ascribing a custom record type, the mashup engine does not check field types for compatibility. You could ascribe the record produced by the previous example with type [A = logical, B = logical] and M would happily apply the new type, even though field values “Joe” and 50 are not of type logical. This behavior may not seem logical—and we’ll discuss that point next time. For now, just be aware of it.

The syntax for custom record types offers a couple other options, both of which create abstract types:

type [FirstName = text, optional LastName = text, Age = number]
type [Amount = number, ...]

The first type in the preceding has an optional field. Just like the name suggests, an optional field may be optionally present.

The second type has “...” at the end of its field list. In this context, ... is known as an open record marker. When present, it makes the type an open record type. An open record type allows any number of additional fields (including zero additional fields) beyond those explicitly specified.

Why abstract types?

Remember that no value can directly be an abstract type. Since record types with optional fields and those that are open are both abstract, you’ll never see a record value whose type has either of these. If a value can’t have these in its type, what’s the point of defining them?

Describing expectations or, to put it in other words, classifying values.

You might have a function which expects as its argument a record containing a numeric Amount field. If you defined the function argument as type [Amount = number], you’d be saying that the function should only be passed records containing a single field, named Amount, of type number. In contrast, if you change the type to type [Amount = number, ...], you’d be saying that the records passed in should contain a numeric Amount field but may also optionally contain any number of other fields. The type one states exactly what is expected (no more and no less) while the other defines the minimum requirement (at least this much but more is allowed). Which approach is appropriate depends on your situation.

types record & []

type record itself is the equivalent of type [...], which is an abstract type describing records having any number of fields, including no fields. This means that all records are compatible with type record even though, since it is abstract, no record value can ever be directly of type record.

Oh, before we leave record types, yes, it is possible to define a record type that describes an empty record—a record containing zero fields:

type []

You probably won’t often (maybe never!) find yourself defining this type or manually creating empty records but if you programmatically remove fields from a record you could end up with an empty record, which would be of this type—so there is reason for it to exist.

Table

Based on what we’ve already covered, you can probably make sense out of the following syntax for defining a custom table type:

type table [SomeColumn = text, AnotherColumn = any, YetOneMore]

The square-bracketed part defines the table’s row type. In essence, the row type is a record type with two special rules imposed: Row types cannot have optional fields or be open, with the exception of one built-in type (more on this in a moment)

You could argue that the ability to do something like type table [Amount = number, ...] would be useful because it would let you say “I expect a table containing column Amount, of type number, but other columns are okay,” but that’s not supported. In addition to allowing a table to contain a varying number of columns, allowing open row types would allow ragged tables: tables where each row in the table can vary in column count (so row 1 might have 3 extra columns, row 2 no extra columns, row 3 maybe 1 extra column and so forth). M doesn’t support ragged tables.

Similar to record, the native type of a table contains all its column names with their types set to any.

#table({"EmployeeID", "Type", "Wage"}, {...})

Produces a table having the following type:

type table [EmployeeID = any, Type = any, Wage = any]

The base type of all tables is type table, which is abstract. Its row type is the empty open record (the only time a table type can have an open row type). So, type table is compatible with tables whose rows have any number of columns (including zero columns), which makes it compatible with all tables.

Table types include something no other type carries: the definition of the table’s keys. For example, the following table type includes a primary key on column OrderID:

let
  BasicTableType = type table [OrderID = number, Total = number, Shipped = logical]
in
  Type.AddTableKey(BasicTableType, {"OrderID"}, true)

The fact that a table’s type contains keys means that two table types with the same columns and column types are still different types if the keys defined on them differ.

The standard library also contains functions that allow you to add and replace keys on table values, in contrast to the above example which worked with a table type.

let
  SomeTable = #table({"Col1", "Col2"}, {{"a", "b"}})
in
  Table.AddKey(SomeTable, {"Col1"}, true)

You could think of the functions that work on a table value’s keys as shortcuts which take the table value’s type, add or replace keys on it, ascribe the revised type onto the value then return the revised table. The above example is more succinct than coding up all those steps manually—which would look something like:

let
  SomeTable = #table({"Col1", "Col2"}, {{"a", "b"}}),
  StartingType = Value.Type(SomeTable),
  UpdatedType = Type.AddTableKey(StartingType, {"Col1"}, true)
in
  Value.ReplaceType(SomeTable, UpdatedType)

Similar to type record, the table row type’s column details are applied positionally to the table, so the type details for the first column defined in the new type will be applied to the first column in the table, the type’s second column definition to the second column in the table, and so forth.

This means that, like record, type ascription can be used to rename table columns—at least, in theory. However, there’s a mashup engine bug which keeps this from working properly. So, do not try to rename columns using type ascription!

Below is an example of the danger of not following this guideline. M sees the “renamed” column sometimes as having its old name and sometimes with its new name—and when and where these inconsistencies appear could change across mashup engine versions.

let
  MyTable = #table({"Col1"}, {{"Joe"}}),
  TypeChanged = Value.ReplaceType(MyTable, type table [Name = any]) // returns a table which appears to have a column named Name
in
  Table.SelectRows(TypeChanged, each [Name] = "Joe") // uses column's new name -- errors: "Expression.Error: The field 'Name' of the record wasn't found."
  // Table.SelectRows(TypeChanged, each [Col1] = "Joe") // uses column's old name - works

  // TypeChanged[Name] // uses column's new name -- works
  // TypeChanged[Col1] // uses column's old name -- errors: "Expression.Error: The column 'Col1' of the table wasn't found."

Even if this bug is fixed in the future, you’re still probably better of using Table.RenameColumns as your regular go-to for column renaming. It matches by name, not position, and preserves existing column types, both of which combined make it generally more robust.

Like list and record, the mashup engine does not check whether the new table type’s column types are compatible with the values actually in the table. Strange, I know. More on this next time.

Function

For defining custom function types, syntax is in the form of:

type function(name as text, age as number) as record
type function(amount as number, optional tax as number) as number

Important: When defining functions, parameter and return value type assertions are optional; however, when defining function types, both must be specified:

// valid function definition -- no argument or return assertions are specified
(total) => total * 0.1
 
// invalid function type definition -- argument and return assertions must be defined
type function(total)
 
// valid function type definition -- all assertions specified
type function(total as number) as number

When a function is defined, the custom type created automatically for it defaults any omitted type assertions to any. For example, the native type of the following is type function (total as any) as any:

(total) => total * 0.1

When ascribing a new type onto a function, the number of parameters defined must match (obviously). Also, the new type cannot change required parameters to optional or vice versa.

You might think that updating an optional parameter to make it required would be allowed because this change seems safe, but it isn’t. Ascribing a new type to a function changes information about it but does not change its behavior—and changing an argument’s optional/required status is a behavior change.

Similar to record and table, the new type’s parameters are matched to the function’s parameters based on position. Even though the mashup engine does not validate that their names stay the same, do not use ascription to rename arguments—otherwise, unexpected behavior could result.

As type ascription can only change information about a function, not the function’s behavior at the language level, ascribing a type onto a function that specifies different argument or return assertions has no effect on the behavior of the function. The mashup engine always uses the type assertions specified when the function was originally defined, even if another type is later ascribed onto it.

Below defines SomeFunction as expecting a single argument that is compatible with type text. Then, a new type is ascribed onto the function, specifying that the argument’s type is number. This new type affects the information reported about the function (e.g. documentation will report the argument’s type as number) but does not change its behavior. The function was defined as expecting text and still expects text despite what the new type claims. If non-text value is passed, the function will reject it as invalid.

let
  SomeFunction = (input as text) => "hi",
  NewType = type function (input as number) as any,
  Ascribed = Value.ReplaceType(SomeFunction, NewType)
in
  Ascribed(1) // error

Syntax

Above, we assembled various type values by defining their component types (item types, row types, field types and argument/return types) inline using literal type names. When building type values, expressions may also be used when specifying component types. This makes it possible to compose complex types using variables and even the output of functions.

Let’s start with a simple example. Below, the column in the first table type’s row type is defined using the literal type number. In contrast, the second table type defines the column type using the expression Currency.Type (which, as we learned in part 17, when evaluated, returns a type value of type number faceted with Currency.Type). This expression could just as easily be a function call instead of an identifier reference.

type table [Amount = number]
type table [Amount = Currency.Type]

As already mentioned, variables can also be used when composing types:

let
  ColumnType = type number,
  RowType = type [SomeColumn = ColumnType],
  TableType = type table RowType,
  ListType = type { TableType }
in
  type function (list as ListType) as any

The above evaluates to a function type with a single argument of type list which has an item type of type table [SomeColumn = number]. In other words, this type describes a function that expects to be passed a list of tables (zero or more tables), each having a single column named SomeColumn filled with values of type number.

Type Context

When creating type values, the keyword type is what causes literal type names to have their special meaning. type switches M syntax interpretation into what’s known as the type context, which simply means that type keywords take on their special meaning. For example, by itself the expression any references a variable named “any”; while in the case of type any, the “type” shifts the expression into a context where “any” is interpreted as referring to the type named “any”.

Notice below how type shifts the entire expression into type context. The “type” at the beginning causes both “table” and “any” to be interpreted as types.

type table [Col1 = any]

While in type context, identifiers whose names are not the same as type names are still interpreted as normal identifier references.

let 
  ColumnType = type any
in
  type table [Col1 = ColumnType]

In some cases, you might find your code in a type context where you want to reference a variable with the same name as a type. Take type {record} as an example. Since “type” put the expression in a type context, “record” is interpreted as the type record. What if, instead, you’re trying to use “record” to reference a variable named record? You could rename your variable to something else, but that might be too easy. Alternatively, use parenthesis to switch the part of the expression they surround back to regular context, where “record” will be interpreted as a variable reference.

let
  record = type [A = text, B = logical]
in
  type { (record) } // defines a list whose item type is type [A = text, B = logical]
  // type { record } // defines a list whose item type is type record

A more complex example, switching into type context, then back out, then back in:

type { (type number) }

Of course, it would have been simpler to just write type { number } but at least now you know that nested switching in and out is possible.

Only for Type Values

The above syntax discussion about type context and using expressions while composing types only applies when coding up expressions that return type values you can save to a variable. When defining functions (not function types, but functions themselves), as well as when using the as and is operators, only literal nullable primitive types may be used. No “type” or parenthesis to shift in and out of type context. No expressions. No custom types.

1 as nullable number // valid syntax
1 as type nullable number // invalid syntax -- number should not be prefixed with type
 
SomeRecord is record // valid syntax
SomeRecord is [Name = text] // invalid syntax -- can't use a custom type here
 
(input as table) => ... // valid syntax
(input as [Col1 = any]) => ... // invalid syntax -- can't use a custom type here
(input as SomeType) => ... // invalid syntax -- can't use an expression for the type here

To make sure we’re clear on functions: The last two function declarations are syntactically invalid because they attempt to define a function with an argument assertion of a complex type and an expression (respectively). Neither of these are allowed in function definitions. They are both allowed in function types, which in turn can be ascribed onto an existing function. (Keep in mind, though, as mentioned previously, type ascription on functions only affects information, not behavior).

let
  Function = (input as table) => …,
  SomeType = type table [Col1 = any],
  NewType = type function (input as SomeType) as any
in
  Value.ReplaceType(Function, NewType)

Next Time

It is understandable if you are quite puzzled trying to make sense of out the purpose for custom types and why M doesn’t seem to handle them the way you might intuitively expect. I don’t like leaving things in an unresolved state, but we’ve covered so much ground here that we must let that discussion wait. Next time, hopefully, we can bring at least some clarity on the how and why. For now, focus on learning the syntax for custom types; then next time we can focus on how they behave without being distracted by trying to learn how to define them at the same time.

Until them, happy M coding!

Revision History

  • 2020-08-27: Updated to indicate that ascription should not be used to rename record fields, table columns or function parameters. (Information partially superseded by a later revision.)
  • 2023-02-21: Updated to reflect that actually using ascription to rename record fields and table columns is technically valid, though not necessarily preferable and—in the case of tables—doesn’t work as expected due to a bug.

6 thoughts on “Power Query M Primer (Part 18): Type System III – Custom Types

  1. Joe Edwards

    Hey Ben, awesome article as always. Is it possible to describe a type dynamically using a text value? To emulate type number, using the context switching you describe above, none of the following work:
    type (number)
    type ("number")
    let TextNumber = "number", NewType = type (TextNumber) in NewType
    Is this permitted by the M language and how would you achieve it?
    Thanks, Joe

    Reply
    1. Ben Gribaudo Post author

      Hi Joe,

      Thanks for your kind words! Switching out of type context allows variables to be referenced whose names otherwise would be interpreted as type literals; however, switching does not cause a value in a string to be treated as a type value.

      To pull that off, one option would be to use a record to translate between expected string values and type values:

      let 
        TextNumber = "number", 
        Transform = (typeName as text) as type =>
          let
            Mappings = [
              number = type number,
              text = type text,
              logical = type logical
            ]
          in
            Record.Field(Mappings, typeName),
        NewType = Transform(TextNumber)
      in
        NewType
      

      Does that help?

      Reply
      1. Joe Edwards

        Hi Ben,

        Great suggestion, since there are only a finite number of types it will be very easy to create a record that covers all options. It was mainly a general interest question, but a specific purpose I had in mind was that for my obscure source database (Progress OpenEdge) it allows for array data types (e.g. “1;a;2;b”) and I wanted to store the array element type conversion in a separate configuration database. Clearly I can’t return an M type value from a database query (other than creating a column of that type and extracting its type using Value.Type), so I wanted to be able to set the type of each array element from a string value from the configuration database.

        Your suggestion should definitely work, thanks!

        Joe

        Reply
  2. Sawood

    Hi Ben,

    Just wanted to say that your power query deep dive articles are amazing! Thank you for your patience and hard work. I refer back to them often when I am have trouble understanding why M does something.

    Regarding function types. I don’t see their utility when they are used on functions that take arguments with primitive types ( text, number, logical etc.), because the behavior remains the same. It would be better to just specify the correct argument and return types right in the function definition.

    But when functions definitions take arguments with abstract types (table, list, record), ascribing function types is a good way to narrow down the types of arguments that can be sent as function inputs, i.e. if a function takes a table as an input, by default the table accepts ‘any’ type values, but with function typing, we could specify the table values to only accept text type values.

    Does that make sense? What’s your take?

    Thanks,
    Sawood.

    Reply
    1. Ben Gribaudo Post author

      Hi Sawood,

      Thank you for your kind words. I am glad you find the series helpful!

      I believe the main reason to ascribe a scalar type onto a function argument is to add documentation to the argument (in a nutshell, adding documentation involves attaching metadata to a type then ascribing that type to the function parameter). This approach is necessary because it is not possible to attach the metadata when the function is declared.

      For table, list and record, ascribing child types (column types, etc.) does not change the mashup engine’s behavior, so the mashup engine won’t validate that child values conform to the child types. However, the ascribed child types can be used by the host application (e.g. to help Power BI set its data types when it stores data) as well as read/used by M functions, including functions from the standard library. They also may be helpful from the documentation perspective.

      Does that help?

      Ben

      Reply

Leave a Reply to Ben Gribaudo Cancel reply

Your email address will not be published. Required fields are marked *