Power Query M Primer (Part 19): Type System IV – Ascription, Conformance and Equality’s “Strange” Behaviors

, , , ,

At first glance, it seems logical to expect the mashup engine to error if you attempt to ascribe a seemingly incompatible type to a value. Surprisingly, this doesn’t always occur.

During ascription, the engine validates that the type’s base type matches the value’s base type (so, for example, you can’t ascribe type logical onto a date value because the two base types are different). However, for types with children, those child types (list’s item type, record’s field types, table’s column types and function’s argument and return type assertions) have no effect on mashup engine behavior and are not validated for compatibility. What?!

For example, you can ascribe an item type of text onto a list of numbers, and the mashup engine won’t complain. Ascribe a column type of logical onto a column containing dates and M will comply. Similar holds true for records: a field type of duration can be ascribed onto a field containing a function and no error is raised.

Value.ReplaceType({1, 2, 3 }, type { text })
Value.ReplaceType(#table({"Col1"}, {{ #date(2020, 6, 24)}}), type table [Col1 = logical])
Value.ReplaceType([FieldA = () => ...], type [FieldA = duration])

For functions, the newly ascribed type’s argument and return type assertions aren’t validated or enforced; instead, the assertions specified when the function was originally defined continue to be what the engine follows. Take a function argument originally defined as number and ascribe as text to it. Despite the ascription, the mashup engine will expect the argument’s value to be compatible with number, not text, when the function is invoked.

let
  Func = (input as number) as number => input,
  NewType = type function (input as text) as text,
  Ascribed = Value.ReplaceType(Func, NewType)
in
  Ascribed("hi") // errors
  // Ascribed(1) // works fine  

These behaviors seem strange—and they aren’t the only strangeness related to Power Query’s types. Comparing type values may also not work the way you expect. Think TypeValueA = TypeValueB will return true if the two types are identical? Maybe. Maybe not!

Fasten your seat belt. We’ll try to define and then clear up a bit of this confusion. It will be a journey! Here we go….

Series Index

Type Equality

If you compare two type values using the equality operator (=), what result do you think you’ll receive back? Why, true if the two values describe the same type and false if they do not, right?

let
  TypeValueA = type ..., // some type value
  TypeValueB = type ... // some type value
in
  TypeValueA = TypeValueB

Surprisingly, with one exception, it is valid for false to be returned any time type values are compared for equality. The M language specification does not define rules for determining type equivalence, but the equality operator (=) still needs to produce a response, so according to the language spec, it can simply always return false (again, with one exception). Yes, false—even if you would consider the values being compared to be equal.

This seems most strange. Why not require the equality operator to work the way you’d expect for type values, so that someTypeValue = anotherTypeValue works the way you’d naturally assume it would?

Let’s think through type equality.

If you were defining type equivalence comparison rules, what would you expect the below to output? Why true, of course! type number obviously should be equal to type number, so it seems logical for this expression to evaluate to true.

type number = type number

A challenge with type equality is that the correct result isn’t always unambiguously obvious, like it is above. Or, to put it in other words, what you’d expect isn’t always what another person would expect. To illustrate, let’s make the scenario a little more complex: Should these two type values be considered equal? Why or why not?

type number = Currency.Type

Some of you are probably saying, “Yes—because Currency.Type is type number decorated with a sub-type claim. Since a sub-type claim has no effect on mashup engine behavior, it should be ignored when comparing the values. If it’s ignored, the comparison becomes the equivalent of type number = type number and so should evaluate to true.” Other readers are probably thinking, “But wait—the two type values are defined differently. Even though the Currency.Type claim on the one does not affect the mashup engine, a type with that claim is still describing something different from plain type number and so comparing the two should evaluate to false.”

Since we might differ in opinion as to the expected behavior, to make equality comparisons like these practical, precise rules need to be authoritatively defined that specify the expected outcomes. The language specification could have done this—but it doesn’t.

Do It Yourself

Instead of picking winners and losers, as far as preferences on how type equality should be defined, the Power Query language stays out of the argument. Instead, it leaves it to you to compare the specific aspects of the type values you consider important in your context when checking for equality.

How do you do this? Using the standard library’s type functions!

To determine whether a type value is of a certain base type, use Type.Is, specifying the base type you’re checking for.

Type.Is(type { number}, type list) // true

While the first argument can be any type value, the second argument must be a nullable primitive type (so not a custom type). Interestingly, the function won’t necessarily error if a non-nullable primitive type value is used—but the results it returns in this case should not be considered reliable.

Type.Is(type { number}, type { number }) // don't do this

If all that is important to you is determining whether the type value is of a certain base type, you’re set!

What about more complex type details—what if you want to factor these into your comparison? You’ll need to use the appropriate standard library functions to extract the details of interest and then compare each of them. Unfortunately, the code involved can be a bit complex, in part because there is not always a succinct way to fetch and compare the attributes of interest.

Here’s an example of comparing two types to see whether they both describe tables having columns with the same names in the same order (irrespective of column types).

(typeA as type, typeB as type) as logical =>
    Type.Is(typeA, type table) 
    and Type.Is(typeB, type table) 
    and Type.TableSchema(typeA)[Name] = Type.TableSchema(typeB)[Name]

As you might imagine, the more details you want included in a comparison between type values, the more complex the comparison code gets (ouch!). Maybe one day, the standard library will be expanded with additional functions that help make comparisons like this easier to write up.

The Self Exception

We just learned that it is valid for false to be returned any time type values are compared for equality, with one exception. That exception?

If you compare a type value to itself, the equality check will evaluate to true:

let
  TypeValue = type table [ColA = number]
in
  TypeValue = TypeValue // true

Just remember that each time you use the “type” keyword, you’re creating a new type value. Below, two identically-defined type values are created. While identical in definition, they are nonetheless different values to Power Query, so the “if you compare a type value to itself, the equality check will evaluate to true” exception does not apply.

let
  TypeValueA = type table [ColA = number],
  TypeValueB = type table [ColA = number]
in
  TypeValueA = TypeValueB // not guaranteed to be true

But It Seems to Work!

“But wait,” you might say. “The equality operator actually does work for comparing different type values. I’ve used it before!”

As of this writing, in the versions of Power Query that ship with Microsoft Power BI Desktop and Excel, the following evaluates to true. This may appear to suggest that maybe type equality comparisons actually work the way you’d expect (or not—don’t stop reading).

Value.Type("abc") = type text // true

Surprised?!

Remember that the M language specification does not define rules for determining whether two type values are equivalent. Instead, it allows false to be returned any time type values are compared with the equality operator. However, it does not mandate that false is returned. The spec also gives an M implementation the option for that implementation to use its own rules to perform type equality comparisons.

That’s what’s going on here. In some cases, the Power BI/Excel versions of Power Query will compare type values “the way you’d expect.” The catch is that the rules used aren’t (fully) documented and may change at any time, so cannot be relied on.

(In fact, I’m not positive, but something I read suggested that possibly the reason some type values compare to true is a side effect of an internal optimization that this particular mashup engine implementation currently uses. If true, this doubly underscores the frailty of trying to rely on the “happenstance” that type equality comparisons sometimes “work,” as that optimization could be changed or replaced at any time.)

If a mashup engine chooses to use its own rules for type equality comparisons, the language specification imposes a single requirement: Repeatedly comparing the same type values must produce the same result. In other words, someTypeValue = anotherTypeValue should be consistent in whether it evaluates to true or to false when the same comparison is performed multiple times (assuming that someTypeValue and someTypeValue stay set to the same values).

This rule only applies to repeated comparisons of the same two type values. Each time a query is evaluated, new type values are created. In the language specification, there is no requirement that these new values produce the same equality comparison result you previously saw, even though the new values are defined identically to what was previously compared. (Maybe this time, different optimizations were dynamically applied, so the “happenstance” that resulted in the type equality comparison working last time the way you’d expect no longer applies….)

Since the Power Query language specification does not define authoritative, precise rules for comparing type values, the equality operator is generally useless, as far as comparing type values goes.

Values

In this post’s introduction, we started by encountering what likely seemed strange behavior: When a type is ascribed to a value, the mashup engine ensures that the type’s base type is the same as the value’s base type, and for values with structure, that the type’s structure matches the value’s—but no checking is done on the type’s child component types or names!

Let’s break this down.

First, the base type of the type being ascribed must be the same as the value’s base type. For example, you can ascribe type number or Decimal.Type (both of which are of base type number) onto a numeric value. However, you can’t ascribe that value with type text because it is of a different base type. Similarly, you can’t ascribe a list type onto a table value or a record type onto a date, etc. (This is probably intuitively obvious.)

Value.ReplaceType(1, type number) // ok
Value.ReplaceType(1, Decimal.Type) // ok
Value.ReplaceType(1, type text) // errors

Then, for types that can have multiple child components (records, tables and functions), the mashup engine preforms an addition check when type ascription is attempted: validating that the structure of the type being ascribed matches the structure of the value it is being ascribed to.

For records, this means that the type being ascribed has the same number of fields as the value. For tables, the type and value must have the same number of columns. For functions, the two must both have the same number of required arguments and of optional arguments. So, for example, if a record value has three fields, attempting to ascribe it with a two field (or four field) record type will fail, as will ascribing a one-argument function with a ten-argument function type.

Value.ReplaceType([A=1, B=2, C=3], type [A = number, B = number, C = number]) // no error
Value.ReplaceType([A=1, B=2, C=3], type [A = number, B = number]) // errors
Value.ReplaceType([A=1, B=2, C=3], type [A = number, B = number, C = number, D = number]) // errors

However, ascription does not validate whether the value’s child values are compatible with the new type’s child component types.

You can ascribe type { number } to a list of text values and M won’t complain. Ascribe a table containing a single numeric column with a table type whose column is type logical and M will accept the ascription. Similarly, the engine has no problem ascribing a record with a type whose field types are incompatible with the field’s values.

Value.ReplaceType({ "hi", "bye" }, type { number })
Value.ReplaceType(#table({"Col1"}, {{ 1 }}), type table [Col1 = logical])
Value.ReplaceType([FieldA = #date(2020, 6, 30)], type [FieldA = duration])

All of the above are accepted just fine by the mashup engine.

For functions, not only are the new type’s argument and return value assertions not checked for compatibility, but they also have no effect on the function’s behavior. Below, when the function is invoked, its argument will still be required to be compatible with type number even though type text was ascribed onto it. The argument’s type, as it was originally defined, is all that matters to the engine.

Value.ReplaceType((input as number) => ..., type function (input as text) as any)

Why not?

Why doesn’t the mashup engine fully check ascribed types? Why doesn’t it enforce them? What’s the point of ascribing them if the child types they specify are not checked or enforced?

To start sorting this out, let’s go back to something discussed in Part 16 (Types I – Basics):

Power Query is dynamically typed. All checks on whether a value is compatible with a given type are done by looking at the value’s type at runtime. There’s no concept of a variable having a type nor is type casting relevant—the actual value’s type is what matters.

Consider the following example.

let
  Inner = () as any => ...,
  Outer = (input as anynonnull) => ...
in
  Outer (Inner())

In a statically typed language (C#, Java, etc.), Outer(Inner()) would be invalid because Inner would return values of type any while Outer expects values of type anynonnull. Since any is incompatible with anynonnull, passing Inner‘s output into Outer would be illegal.

With PowerQuery, Inner‘s as any does not change the type of what’s returned to type any. Instead, it simply ensures that the type of whatever value is returned is compatible with any. Similarly, Outer‘s as anynonnull does not limit the inputs that function accepts to only values of type anynonnull but rather to only those values which are compatible with type anynonnull.

Say, Inner returns a value of type number. This value passes Inner‘s return assertion of as any because type number is compatible with type any. It also passes Outer‘s argument assertion of as anynonnull because type number is compatible with anynonnull. On the other hand, if Inner returned a value of type null, it would pass Inner‘s as any assertion but the attempt to pass it into Outer would fail because type null is not compatible with anynonnull.

To repeat:

Power Query is dynamically typed. All checks on whether a value is compatible with a given type are done by looking at the value’s type at runtime. There’s no concept of a variable having a type nor is type casting relevant—the actual value’s type is what matters.

With this in mind, let’s jump back to the question about why M does not validate child types during ascription. Since, in M, type compatibility (conformance) is only checked on a per individual value basis, in order to check for type compatibility, it is necessary to compute the value that needs checking.

Take a list. Say you ascribe it with an item type of number: type list { number }. If M were to ensure that the type of each item in the list is compatible with item type number, each list item’s value would need to be computed. Below, this would involve making three function calls: CallServiceA(), CallServiceB() and CallServiceC().

let
  Value = { CallServiceA(), CallServiceB(), CallServiceC() },
  Ascribed = Value.ReplaceType(Value, list { number }),
  Result = List.Count(Ascribed) 
in
  Result// returns 3

However, M is lazy. When the above expression is executed, M doesn’t bother to call the service functions to get values for the list items. It doesn’t need to. Instead, it skips that unnecessary work because it can determine the count of list items (all it was asked to do) without computing the items’ values.

Instead, say the ask was to sum values of all items except for the first. M’s laziness would lead to it skipping over the first item without evaluating its expression. Service A would not be called because the value it would provide isn’t needed; only services B and C would be queried.

Result = List.Sum(List.Skip(Ascribed, 1)) // value for the first item is not computed because it is not needed

If M were to validate the item type when ascription occurs, the ascription process would need to compute each list item’s value then check whether that value is compatible with the specified item type. All values would need to be computed, even those not otherwise needed to produce the requested output. If Power Query switched to this behavior, it couldn’t be so lazy—which would have performance ramifications.

How about an expression that only returns some rows from a table:

let
  Source = GetData(),
  Ascribed = Value.ReplaceType(Source, type table [Category=text, Code=number, Amount=Currency.Type]),
  Filtered = Table.SelectRows(Ascribed, each [Category] = "Misc")
in
  Filtered 

Say that GetData() is a SQL data source. M will probably query fold this entire expression into something like SELECT … FROM SomeTable WHERE Category = 'Misc';.

If M were to validate the ascribed table column types, it would either need to send an additional query to the database server that somehow asked the server to report whether all values in the table are compatible with the specified types (if the source even supports that kind of a query) or, like the previous example, would have to pull all rows locally and then check their values for type compatibility (in this case, even those rows excluded by the [Category] = "Misc" filter, since the ascription applied to all rows). While technically possible, doing things this way could (potentially significantly!) diminish the benefits offered by M’s query folding.

Power Query’s paradigm doesn’t align well with validating child type compatibility. M’s focus is efficient processing of data received from a plethora of potential data sources that may or may not be super strong in how they type their data. One of the tradeoffs with this approach is that it is impractical for the mashup engine to ensure child type compatibility with ascribed child types.

In some ways, child types are similar to type facets: Neither affect the behavior of the mashup engine. Standard library (and custom) functions can make use of the information they convey and the host application may enforce that whatever data it receives is compatible with thembut mashup engine behavior isn’t affected by them.

Type Assertion and Conformance Checks

All of Power Query’s type assertion and value compatibility (conformance) checking functionality—as, is, Value.As and Value.Is, as well as function argument and return type assertions—only work at a nullable primitive type level. If complex type assertions were allowed and enforced, we’d have similar downsides to those that would be present if child types were validated during ascription. Instead, the language limits its assertion and compatibility checks to nullable primitive types, which it can validate without needing to evaluate list item, record field, table row or table column value expressions.

If you want to compare a value’s type’s facets, structure, child names or child types, or validate child values against those child types, the standard library provides functions you can use to implement a solution that meets the specifics of your requirements, with you balancing the precision desired to resource usage, factoring in the particular characteristics and stability of your data source. M forces you to make conscious choices if comparing or validating these aspects is important to you.

Recap 

To summarize:

  • Ascription: M ensures that the new type’s base type matches the nullable primitive base type of the value and that the new type’s and value’s structures are the same. However, child component types are not validated. (Also, as discussed in the last part, child component details from the new type are matched to the existing value’s child components based on position, not name.)
  • Assertion/Conformance: M only checks whether the value is compatible with the given nullable primitive base type. Structure, child component names and child type(s) are not validated.

What’s the point?

Why ascribe specific child types? Since Power Query allows you to specify these types, there presumably is a reason for their existence, though as we’ve learned, that reason is not to ask M to perform validation. What then are the advantages of specifying them?

For one, how about the host application (Microsoft Power BI, Microsoft Excel, etc.)? M may only be concerned about the types on individual values, but host applications may like to know type information on a per-column basis. A tool like Power BI may decide how to store the values it receives for a particular column based on that column’s data type (and then reject any values that are incompatible with that type). Child types give you a way to communicate relevant information to the host.

Also, while the engine itself doesn’t care about child types, library functions and functions you write can factor them in. For example, below the Total column holding the sum of each group has its column type automatically set to match the type of the column it is summing (Amount). That type is read off the source column (Amount) and ascribed as the return type assertion on the aggregate function (each List.Sum([Amount])). Table.Group reads that assertion and uses it to type the column it creates to hold the aggregation’s results. Try changing Amount‘s type and notice how the aggregate column’s type automatically follows this change.

let
  Source = #table(type table [Category=text, Amount=Currency.Type], {{"Misc", 50}, {"Misc", 25}, {"Part", 2.50}}),
  Group = Table.Group(
    Source, 
    "Category",
    { "Total", Value.ReplaceType(each List.Sum([Amount]), type function (input as any) as Type.TableColumn(Value.Type(Source), "Amount")) }
  )
in
  Group

The previous example shows one case of why you might want to ascribe a new type onto a function. Another is documentation: The descriptions and examples displayed for a function in Query Editor come from metadata attached to its type and child types—and attaching metadata (something we’ll cover in a later post) is done through type ascription.

There’s a few reasons, for starts. Keep in mind that you don’t need to ascribe child types if they’re not helpful in your context. If ascribing them doesn’t give you any benefits, then no need to ascribe.

Conclusion

I hope this journey has made M’s “strange” type behaviors less confusing. Whether or not they are the behaviors you’d like, at least now you know what they are and (hopefully) have an understanding as to some of the reasons why they are what they are.

A key concept here is that child types are just claims. M hasn’t validated them and doesn’t enforce them at the mashup engine level. Even though they are unverified, other code and the host application has the option of choosing to trust them and derive value from them.

While it may not seem robust, M’s approach to type enforcement works rather well for Power Query’s purposes. In fact, increasing its robustness significantly would likely become a hindrance to M’s efficiently working with data. Mostly, the complexities around custom types stay out of the way unless and until you want to involve them and then they give you options which you must wield responsibly.

Whew…we’re done with the type system! Congratulations! Don’t worry, though, we’re not done exploring M just yet. More awaits to delve into!

Next up? Hmm…maybe metadata.

Revision History

  • 2023-01-31: Reworked Type Equality to clarify and align with recent M spec revisions.
  • 2023-02-21: Removed mentions that ascription shouldn’t be used for renaming, as technically that prohibition is incorrect.

2 thoughts on “Power Query M Primer (Part 19): Type System IV – Ascription, Conformance and Equality’s “Strange” Behaviors

  1. Joe Edwards

    Hi Ben,

    Would the following be the most efficient way of comparing all the types and facets of all columns in two tables in a single operation:

    Value.Equals(Table.Schema(Table1), Table.Schema(Table2))

    Chris Webb highlighted many years ago that you can compare the total data content of two tables using Value.Equals (https://blog.crossjoin.co.uk/2014/01/11/comparing-tables-in-power-query/), however I am unsure whether this returns a deterministic answer, or whether it would be down to the engine version to interpret how this is evaluated?

    Thanks,

    Joe

    Reply
    1. Ben Gribaudo Post author

      Hi Joe,

      Thank you for sharing. Very interesting idea!

      Per the language specification, the equality operator (=) will consider two tables equal if: they contain the same number of columns, the columns have the same names, the tables contain the same number of rows and “each row has equal values in corresponding cells.” Based on this, it sounds like Table.Schema(Table1) = Table.Schema(Table2) should work!

      Keep in mind that Table.Schema’s IsWritable and Description columns are from metadata, not type names or type facets. If desired, these two columns can be excluded from the comparison.

      Ben

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *