Type Equality

Type equality is an advanced—and also confusing—Power Query topic. Sometimes, equality comparisons between type values seem to work as intuitively expected; other times, they may not. What’s going on? Is it okay—is it safe—to use the equals operator between type values?

But first: Why the confusion around this subject? The M language specification lays out a brief set of rules for type equality which define the minimum expected behavior. These may then be augmented by additional rules specific to the Power Query implementation you are using. The full implications of the former aren’t always intuitively obvious, and there is almost nothing publicly written documenting the latter, which leaves Power Query type equality a confusing, little-understood advanced niche.

Let’s try to clear things up. Let’s explore the behaviors of Microsoft’s flagship Power Query mashup engine implementation (i.e. the Power Query that ships with Microsoft Power BI and Excel). We’ll examine both behaviors mandated by the M spec (specifically, the relevant subsections of “Types” and “Operators”) and additional ones that are specific to the flagship PQ implementation.

Let’s get started!

Self-Equality

A type value is always equal to itself, per the language spec.

For example:

let
  a = type table [Col1 = text]
in
  a = a // true

To be clear, this behavior does not mean that two equivalently-defined type values will be equal (more on this in a bit) but rather that the exact same type value compared to itself is considered equal.

How might we get back the same type value multiple times (e.g. so we can compare it to itself)? Power Query provides several ways, including:

Define an Identifier

If, say, we store a type value in an identifier, we can reference that identifier as many times as we want and each time we will get back the same value. We saw this in the above example where a = a evaluated to true. Why? Both the left and right arguments to the equality operator reference the same value (the value of identifier a), and a type value compared to itself is considered equal.

Type Context Keywords

The M language defines a set of keywords which return type values when they are used in a type context. Effectively, these type name keywords are identifiers and so, like other identifiers, return the same value each time they are evaluated.

Below, type switches the mashup engine to interpreting the next word in type context, and text in type context is a type name keyword. This keyword is effectively an identifier that returns a type value that describes values of type text.

type text

The following evaluates to true because both type null expressions produce the same value, and—as we already learned—comparing a type value to itself evaluates to true:

type null = type null // true

Predefined *.Type Identifier from the Standard Library

The standard library defines a number of identifiers with names in the form of SomeName.Type (such as Decimal.Type and Uri.Type). These are just identifiers that return type values, so we can reference a given *.Type identifier as many times as we want, and we’ll get back the same type value each time. This means that the following will evaluate to true:

Int64.Type = Int64.Type //true

Putting this knowledge into practice, say you want to determine whether a type value is the same exact value as Decimal.Type. The following comparison will give you the answer:

SomeValue = Decimal.Type // true if SomeValue is the same value as Decimal.Type; otherwise, false

For some of these *.Type identifiers, there’s an additional, special relationship….

Synonym Type Values

Each type name keyword defined by the M language specification has a *.Type identifier synonym in the standard library.

Take a type name keyword, title case it, then append “.Type”—and you have the standard library alias for it. For example:

type datetimezone -> DateTimeZone.Type

As synonyms, these standard library identifiers return the exact same values as their corresponding type name keywords. Based on this, all of the following evaluate to true because each is comparing the same type value to itself.

type duration = Duration.Type // true
type time = Time.Type // true
type logical = Logical.Type // true

Not Quite Identical, But Equal

In Power Query, a type value always equals itself, but this isn’t the only time when type values may be equal. There are several situations where type values which are not (exactly) the same are considered equal.

Metadata Ignored

Equality comparisons ignore metadata, per the M language spec.

You could, in a sense, think of metadata as being wrapped around an underlying value. This wrapper is ignored for purposes of equality comparisons (and, for that matter, when most other operators are applied). Instead, just the underlying values (the values without metadata) are compared.

In the context of type values, this means that if you compare a type value to the same type value but with metadata added, the values will equal. Similarly, if both sides of the comparison are the same underlying type value, except that each is “wrapped” with different metadata, the comparison will evaluate to true. In both cases, the values themselves are not the same (they can’t be, because they differ in metadata) but the underlying values are the same—and the latter is what the equality operation considers.

let
  a = type table [Col1 = text],
  b = Value.ReplaceMetadata(a, [Important = true])
in
  a = b // true

let
  a = Value.ReplaceMetadata(type list, [Sorted = true]),
  b = Value.ReplaceMetadata(type list, [Ordered = true])
in
  a = b // true

Type Facets Ignored

Similarly, when comparing type values, type facets are ignored. (This behavior isn’t required by the M language spec, but instead is an extra behavior added by the flagship Power Query implementation.)

let
  a = type number,
  b = Type.ReplaceFacets(a, [NativeTypeName = "NUMERIC"])
in
  a = b // true

Similar to metadata, it may help to think of type facets as a wrapper around an underlying type value. For purposes of equality comparisons, the facet “wrapper” is ignored; instead, the underlying type values are compared. Since, above, the underlying values are one and the same, the comparison evaluates to true.
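To see that the facet “wrapper” really is there even though equality ignores it, the facets of both values can be read back with Type.Facets. This is only a small sketch: the record and its field names (AreEqual, FacetsOfA, FacetsOfB) are made up for illustration.

```powerquery
let
  a = type number,
  b = Type.ReplaceFacets(a, [NativeTypeName = "NUMERIC"]),
  Result = [
    AreEqual = a = b,             // true, as in the example above
    FacetsOfA = Type.Facets(a),   // facet record of the plain type value
    FacetsOfB = Type.Facets(b)    // facet record carrying the replaced NativeTypeName
  ]
in
  Result
```

Inspecting the two facet records side by side shows what the equality operator skipped over.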

Nullable Types

Two values describing nullable types are considered equal if their underlying non-nullable type values are considered equal. (Again, this is another mashup engine-specific behavior, not one laid out in the language specification.)

let
  t = type table [Col1 = any],
  Left = type nullable t,
  Right = type nullable t
in
  Left = Right // true

Above, type values Left and Right are both built on the same underlying type value. When the equality operator is applied, the mashup engine first checks whether the two type values are nullable. As they are, it then proceeds to check whether the non-nullable type values they are based on are equal. Since those underlying values are one and the same (i.e. both are the value of t), the expression evaluates to true.
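As a rough sketch of that two-step check, the following compares a pair of nullable types built on one shared base with a pair built on two separately evaluated bases. The names t1, t2, SharedBase and SeparateBases are made up for illustration. By the rule just described, the second comparison should simply follow whatever t1 = t2 returns (the Equivalently-Defined Types section covers that case).

```powerquery
let
  t1 = type table [Col1 = any],
  t2 = type table [Col1 = any],   // same definition, evaluated separately
  // Both sides are nullable and built on the very same base value (t1)
  SharedBase = (type nullable t1) = (type nullable t1),
  // Both sides are nullable, but the bases are two separate values;
  // per the rule above, this mirrors the result of t1 = t2
  SeparateBases = (type nullable t1) = (type nullable t2)
in
  [SharedBase = SharedBase, SeparateBases = SeparateBases]
```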

Equivalently-Defined Types

So far, these type equality behaviors probably make sense. Now for one that may be surprising: Comparing two different type values that are equivalent in their definition will evaluate to false. In other words, equivalently-defined type values are not considered equal.

The examples we’ve looked at so far have focused on comparing the same value (or the same value, after ignoring any wrapping metadata and type facets; or, when both are nullable, the same underlying value) to itself. However, it is possible for different values to describe the same type.

Consider the difference between the following:

type text
type table [Col1 = text]

The first is simply a reference to type context keyword text. As it is effectively an identifier reference, each time it is evaluated, the same type value will be returned.

On the other hand, the second type context expression contains more than a simple type keyword reference; instead, it consists of a more elaborate expression. Evaluating it multiple times will return a different value each evaluation. These different values all describe the same type (in this case, a table containing a single column, named “Col1”, whose contents are compatible with type text) but each will be a distinct (different) type value.

Below, while a and b are equivalently-defined (that is, they are type values that can be used interchangeably to describe the same set of values), a and b are not the same type value.

let
  a = type table [Col1 = text],
  b = type table [Col1 = text]
…

Since the values are different, the “a type value is equal to itself” rule does not apply—so what happens when an equality comparison is performed?

Well, since neither the M language specification nor the flagship mashup engine implementation defines rules for comparing equivalent type values, the comparison returns false.

let
  a = type table [Col1 = text],
  b = type table [Col1 = text]
in 
  a = b // false

If the idea of non-identical but equivalent type values is fuzzy, perhaps a non-type value example will help: Consider the following expression which defines two different strings. One is stored in variable a and the other in b. The two strings are different values, but at the same time they are equivalently defined values. They are two separate (distinct) values but which are each defined identically—two different strings but both with identical content.

In the a = b comparison, a and b are not self-equal, as a and b are different values. However, the comparison still evaluates to true because M has rules that specify how to compare two different string values to see if they should be considered equal—and in this case, for obvious reasons, they should be.

let
  a = "hello",
  b = "hello"
in 
  a = b // true

Unlike strings, for type values, there are no rules defined for comparing two different type values to see if they are equal. After ignoring metadata and type facets, if the two arguments to the equality operator are not the same underlying type value, the comparison will evaluate to false. This is true even if the two values are identical in how they were defined.
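If what you actually care about is whether two table types describe the same columns, one possible approach is to compare something M does have equality rules for, such as records. The sketch below is an assumption-laden illustration, not an engine feature: DescribeColumns is a hypothetical helper, and it would only be expected to match when each column's type is itself the same type value on both sides (for example, a type keyword such as text). It also ignores anything about the table type beyond its columns.

```powerquery
let
  a = type table [Col1 = text],
  b = type table [Col1 = text],
  // Hypothetical helper (not part of the standard library): describe a table
  // type as a record of its columns, which M knows how to compare field by field.
  DescribeColumns = (t as type) as record => Type.RecordFields(Type.TableRow(t)),
  SameValue = a = b,   // false, as shown above
  SameColumns = DescribeColumns(a) = DescribeColumns(b)
in
  [SameValue = SameValue, SameColumns = SameColumns]
```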

Now that we understand equivalently-defined type values better, let’s review what we discussed near the start of this subsection on how we can create such values: In a type context, a singular reference to a type context keyword (e.g. type binary) always returns the same type value each time it is evaluated. On the other hand, a more complex type context expression (such as type { number }) returns a different, equivalently-defined value each time it is evaluated.

But there is an exception:

Alias Type Expressions

If a type can be described using a single type context keyword and, alternately, a more complex type context expression that strictly uses type context-specific literal syntax (e.g. an expression that does not switch out of type context, does not reference a non-type context identifier, does not define a function, etc.), the more complex expression will consistently return the same value as the singular type keyword. (This is one of those extra behaviors added by the mashup engine, not a rule mandated by the language spec.)

For example, a list type’s default item type is any, so type list and the more complex type { any } are equivalent. Since this more complex type expression can be equivalently expressed as a simple type context keyword, both expressions will evaluate to the same value.

Applying this rule, in each of the following pairings, the two expressions will evaluate to the same value. The one is, in essence, a shortcut for the other.

type list
type { any }

type record
type [...]

type table
type table [...]

This means that comparisons like the following are self-comparisons and so will evaluate to true, as both the left and right inputs to the comparison are the same value:

type list = type { any } // true
type record = type [...] // true
type table = type table [...] // true

Additionally, in some cases, this “alias” behavior holds true when the more complex expression is not defined strictly using type context-specific literal syntax—such as when function invocations and ordinary identifier references are used. In such cases, the more complex expression might still evaluate to the same value as its singular keyword cousin. For example, in each pairing below, both sides of the equality expression evaluate to the same value (and so, of course, are considered equal):

Type.OpenRecord(type []) = type record // true
Type.ForRecord([], true) = type record // true
let a = type any in type { (a) } = type list // true
let RowType = type [...] in type table RowType = type table // true

But this isn’t guaranteed to be true for all more complex expressions that go beyond strict type context-specific literal syntax. Some don’t share the above behavior but instead return a different type value each evaluation (i.e. they exhibit the same behavior as an ordinary more complex type expression).

Below, the types described on the left and right of the equality comparison are equivalent. However, the expression on the left evaluates to a different value each time it is evaluated, not the same, consistent value that type list returns.

type { Any.Type } = type list // false

Before we leave the topic, keep in mind that this behavior of a more complex expression returning the same type value as its single type keyword cousin isn’t mandated by the M language specification. Future changes to the mashup engine could result in it happening in even more cases, or not happening at all.

The Other .Type Values

A few minutes ago, we talked about how the standard library includes a number of “alias” identifiers, like Number.Type, which each return the same value as the corresponding type name keyword (e.g. Number.Type returns the same value as type number)—because, in a sense, these standard library identifiers are aliases for the corresponding type keyword.

But what about the other .Type values in the standard library—those whose names do not follow this pattern? For example, there are no types in the M language named “int64”, “currency” or “percentage”, yet the standard library includes Int64.Type, Currency.Type and Percentage.Type.

With what we’ve covered thus far, we are now ready to understand these non-alias .Type values. Let’s start by making a couple of observations and considering what they imply.

Observation #1: First, taking the three example .Type identifiers just mentioned, while their names suggest they are all number-related, their values are not the same as the value returned by the expression type number, nor is any one of their values the same as any of the other identifiers’ values.

Int64.Type = type number // false
Currency.Type = type number // false
Percentage.Type = type number // false
Int64.Type = Currency.Type // false
Currency.Type = Percentage.Type // false
Percentage.Type = Int64.Type // false

A type value compared to itself would evaluate to true, yet none of these comparisons do. This tells us that Currency.Type is a different value from Int64.Type, which isn’t the same value as Percentage.Type, etc., and that all of these are different values from what type number returns.

Observation #2: Yet at the same time, we can ascribe any of these type values to numeric values…

Value.ReplaceType(1, Int64.Type) // works
Value.ReplaceType(1, Currency.Type) // works
Value.ReplaceType(1, Percentage.Type) // works

…and all of them (doesn’t matter which) can be used to identify that a numeric value is a numeric value.

Value.Is(1, Int64.Type) // true
Value.Is(1, Currency.Type) // true
Value.Is(1, Percentage.Type) // true

The mashup engine considers all of these different type values to be functionally equivalent in terms of classifying values: each can be used interchangeably to classify numeric values.

Functionally Equivalent, But Distinguishable

Putting the preceding two observations together, we know that:

  • Functionally, these type values can all be used to describe values of type number (the concept).
  • But, at the same time, the other three type values are not the same as the value returned by the expression type number (the syntax).

Let’s think through this:

In Power Query, the value returned by type number (that literal syntax) is not the only type value that can describe values of type number (the concept). As we saw a moment ago, the other just-mentioned .Type identifiers also return values that describe values of type number (the concept).

These other values are not identical to type number (that is, not identical to the value returned when that literal syntax is evaluated) but they are bi-compatible with the value returned by that syntax. That is, they are each distinct type values—but at the same time, they all represent the same concept and so can be used interchangeably from the functionality perspective of the mashup engine. They are equivalently-defined types.

The mashup engine doesn’t care which equivalently-defined type you use in type ascription. You can interchangeably ascribe type number or Int64.Type or Decimal.Type or Double.Type (etc.) to a number value because they all describe values of type number (the concept). The same holds true when testing types for compatibility: type number, Int64.Type, Percentage.Type and any other type value describing type number (the concept) can be used interchangeably with functions like Value.Is and, I believe, Type.Is (though, arguably, the latter’s documentation is fuzzy on this point). Equivalently-defined type values are functionally equivalent from the mashup engine’s perspective: they will all be treated the same and behave the same, regardless of which equivalently-defined type value is used.

(“But,” you say, “there must be a functionality difference, because applying Table.TransformColumnTypes specifying Int64.Type or Currency.Type (etc.) validates values.” Not so fast! While there isn’t room here to delve into the details, the behavior you’re describing comes from the function Table.TransformColumnTypes, not from anything in the M language or the core mashup engine’s logic.)

At the same time, the fact that these bi-compatible type values are different matters from the equality perspective. The ability to differentiate between these distinct type values is something you (or tooling others write) can take advantage of.

Why Differentiate?

Imagine that you want to give the host application—say, Power BI—information about the kind of numbers you are placing in a particular table column. This information might be helpful to Power BI because it might store or format a number column differently if it knows it will only contain integers, or if the values in it all represent percentages, and so forth. To the mashup engine, such differentiation between sub-classifications of numbers is a foreign concept—as, to it, all numeric values are simply of type number (the concept). Yet, to a tool like Power BI, communicating such a “sub-kind” hint or “sub-type” claim could be interesting and useful. 

How might you pass such extra information to Power BI? What if you and Power BI had a mutual understanding that one equivalently-defined type value that describes numbers would be used when you want to signal it to expect that a column’s values will be integers, another equivalently-defined type value would be used to indicate that the column contains percentages, etc.?

That’s, in essence, what you are doing when you use a standard library identifier like Int64.Type or Currency.Type or Percentage.Type. All of the aforesaid return values describing type number (the concept) so can be used interchangeably from the mashup engine’s perspective. However, Power BI has an understanding that when it sees a column type identical to the value the standard library calls Int64.Type, it should expect the column to contain only integers (or, more precisely, values that fall within the range of what can be stored in a 64-bit integer); if the column type is Percentage.Type, it should expect the column to contain values that represent percentages (with 1 = 100%); and so forth.

Don’t let the names confuse you. In a name like Int64.Type, the “.Type” part does not convey that a new type is being added to the M language; rather “.Type” simply conveys that the identifier returns a type value. The “Int64” part of the name states the mutual understanding of what using that particular type value implies—the kind of sub-kind claim or sub-type hint that using the given .Type value is meant to convey. What the host application does (if anything) with that claim is up to it; these standard library identifiers simply provide a host-agnostic way to communicate such details.
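To illustrate how a consumer of a table might read such a claim back, here is a minimal sketch that pulls a column's type out of a table type and compares it with a few standard library values. The sample table, its column names and the result field names are all made up for illustration, and this is not a description of how Power BI itself inspects column types.

```powerquery
let
  // Hypothetical sample table; column names and values are invented
  Source = #table(
    type table [Quantity = Int64.Type, Share = Percentage.Type],
    {{3, 0.25}}
  ),
  // Pull the type ascribed to one column out of the table's type
  QuantityType = Type.TableColumn(Value.Type(Source), "Quantity"),
  Result = [
    ClaimsInt64 = QuantityType = Int64.Type,
    ClaimsPercentage = QuantityType = Percentage.Type,
    IsKeywordNumber = QuantityType = type number
  ]
in
  Result
```

Given the equality rules covered earlier, only the comparison against the same value that was ascribed to the column would be expected to come back true.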

Not Just for Number Values

Our examples have focused on only a handful of numeric sub-type claim values. The standard library includes a number of other claim values for numbers. It also provides sub-type claim values for a few other types, such as type text (whose available sub-type claims include Character.Type, Guid.Type, Password.Type and Uri.Type).

An exercise for the reader is to take the output of #shared, filter it down to just *.Type names, and then determine which ones are equivalent values.
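As a possible starting point for that exercise, the sketch below lists the *.Type identifiers found in #shared and flags the ones that return the very same value as Number.Type. The step names and the SameAsNumberType column are arbitrary choices, and the exact set of rows will depend on the environment the query runs in.

```powerquery
let
  // #shared is a record of everything in the global environment
  Everything = Record.ToTable(#shared),   // columns: Name, Value
  // Keep only identifiers named *.Type whose value is a type value;
  // the name test runs first, so other values are not evaluated
  TypeIdentifiers = Table.SelectRows(
    Everything,
    each Text.EndsWith([Name], ".Type") and [Value] is type
  ),
  // Flag identifiers that return the same value as Number.Type
  WithFlag = Table.AddColumn(
    TypeIdentifiers,
    "SameAsNumberType",
    each [Value] = Number.Type,
    type logical
  )
in
  WithFlag
```

Repeating the flag column for other type values is one way to work out which identifiers are aliases and which are distinct claim values.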

Conclusion

Yes, type value equality comparisons work in the flagship implementation of Power Query! They serve the purpose of allowing you to determine whether two type values are one and the same, enabling you to differentiate between identical and equivalently-defined type values. In practical terms, differentiating between such values is primarily useful in determining which sub-type claim is being conveyed.

Outside of this niche, if you find yourself tempted to apply the equality operator between type values, you might do well to pause and ask whether what you’re really wanting to do is determine type equality or type compatibility. That is, are you trying to check whether a particular type value is the same as another type value (equality), or whether what a particular type classifies is included in what the other type classifies (compatibility)? In case of the latter, instead of the equals operator, you probably should be looking at functions such as Value.Is, Value.As, Type.Is or the language keywords as and is.
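The difference between the two questions can be seen side by side in a small sketch that reuses results from earlier in this article. ColumnType and the result field names are made up for illustration.

```powerquery
let
  ColumnType = Int64.Type,
  Result = [
    // Equality: is this the very same type value as `type number`?
    SameValueAsNumber = ColumnType = type number,   // false
    // Compatibility: does this type classify the number value 1?
    ClassifiesNumbers = Value.Is(1, ColumnType),    // true
    // The `is` keyword also tests compatibility
    ViaKeyword = 1 is number                        // true
  ]
in
  Result
```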

Our discussion has been based around how Microsoft’s flagship mashup engine implementation works. However, in the hypothetical, it is possible that someone could create a mashup engine implementation that works differently. The M spec allows an implementation broad discretion in defining its own rules for type equality, so it would be valid for such a hypothetical mashup engine to consider equivalently-defined types as equal. Such a change is very, very unlikely to be made to the current flagship mashup engine anytime soon, as considering equivalently-defined type values to be equal would break how sub-type claims are currently communicated.

Much Thanks: A conversation with the Power Query team’s Eric Gorelik covering a number of type-related details about how the flagship mashup engine works greatly helped me better understand this topic. Thank you, Eric!
