Tag Archives: Microsoft Excel

Custom Folding Arbitrary Functions: OnInvoke & Table.ViewFunction

January 4, 2022

You are working happily away on a Power Query custom data connector (or maybe on a standalone Table.View). Implement OnTake and OnSkip handlers? Check. Implement OnSelectColumns? Check. And on your journey goes, adding functionality by coding up new handlers—that is, until you realize you want to handle folding for a function where there doesn’t seem to be a corresponding handler.
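For readers who have not written handlers before, here is a minimal sketch of a standalone Table.View that folds Table.FirstN through an OnTake handler. The in-memory `Base` table and the `state` record are hypothetical stand-ins for a real data source and its query-building logic; this only illustrates the general handler pattern, not any particular connector.

```powerquery
let
  // Hypothetical stand-in for a real data source.
  Base = #table(type table [ID = number], {{1}, {2}, {3}, {4}}),
  // Each handler returns a new view that carries the accumulated state.
  View = (state as record) =>
    Table.View(
      null,
      [
        GetType = () => Value.Type(Base),
        // A real connector would translate the state into a source query here.
        GetRows = () =>
          if state[Take]? <> null then Table.FirstN(Base, state[Take]) else Base,
        // Invoked when Table.FirstN is applied to the view with a row count.
        OnTake = (count as number) => @View(state & [Take = count])
      ]
    )
in
  Table.FirstN(View([]), 2)
```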

Maybe it’s a standard library function like Table.ApproximateRowCount. Your data source provides a shortcut for fetching a close-to-accurate count, and you’d like users of your custom connector to be able to retrieve that count using the standard function that already exists for this purpose. The catch? There’s no publicly documented GetApproximateRowCount handler that you can implement.

Or maybe it’s a custom function. Perhaps your data source maintains snapshots of historic data. Your connector users would like to be able to fetch data as it existed at a user-selected point in time by doing something like MySource.AsOf(someTableFromMySource, someDateTime). For this to work efficiently, that function needs to be foldable, but there’s no OnMySourceAsOf handler provided by Microsoft. Are you stuck, or is there a way to fold custom functions?

In either case, the challenge is that you want to fold something that doesn’t have a specific handler.

The solution?

Continue reading →

Resilient Relative Column Reordering

October 4, 2021

Screenshot showing the "Move > To Beginning" option in Query Editor

There’s this one column you’d always like to appear leftmost in the table. No problem! In Query Editor, you right-click on the column and choose Move > To Beginning, which generates a “Reordered Columns” step for you.

All is well, until down the road when you remove a different, seemingly unrelated column from the table. Your Power Query refreshes start failing, complaining that the removed column is not found.

You dig into the problem and find that the reordered columns step that Query Editor generated included a hard-coded reference to the now-removed column. To get things working again, you must hand-edit this step’s M expression, manually removing the problematic column reference.

Why? Why did you need to remove (by hand, no less!) a reference to a column that you didn’t consciously put there, a reference to a column whose position you didn’t ask Query Editor to reorder?

Relative-less (how sad!)

As it turns out, Table.ReorderColumns, the function that powers Query Editor’s reorder columns feature, does not support relative reordering. Conceptually, you wanted one column moved to be the table’s first column, but Table.ReorderColumns doesn’t provide a way to simply say “make this one column leftmost”. Instead, any column reordering performed in the UI generates a call to Table.ReorderColumns that is passed a list of all the columns in the table, in their desired order.

#"Reordered Columns" = Table.ReorderColumns(Source,{"ID", "FirstName", "LastName"})

If one of these columns is later removed, Query Editor doesn’t automatically update the passed-in column list, so your code breaks. Ouch! In contrast, adding a new column to the table doesn’t cause Table.ReorderColumns to fail, but this doesn’t mean the experience is painless: the presence of the new column may bump the column you wanted leftmost out of that position.

It would be nice to eliminate these pain points.

Make Do…

Table.ReorderColumns has an optional third argument which can be set to MissingField.Ignore. This suppresses the missing column name error, which keeps the function working even though the column is gone. While this works, it leaves the deleted column’s name in code (code clutter = undesirable). It also doesn’t guarantee that the column you want on the left will stay there when new columns are added to the table.
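As a small sketch of this workaround: the sample table below is hypothetical, with "FirstName" already removed and a new "Added" column sitting at the front. The call should succeed thanks to MissingField.Ignore, but the stale "FirstName" name lingers in the list, and because unlisted columns are not repositioned, "Added" should still sit to the left of "ID".

```powerquery
let
  // Hypothetical table: "FirstName" was removed and "Added" is new.
  Source = #table({"Added", "LastName", "ID"}, {{true, "Smith", 1}}),
  // The stale "FirstName" reference no longer raises an error,
  // but it remains in the code as clutter.
  Reordered = Table.ReorderColumns(
    Source,
    {"ID", "FirstName", "LastName"},
    MissingField.Ignore
  )
in
  Table.ColumnNames(Reordered)
```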

Surely there’s a better way to do relative reordering….

…Or, Do It Nice!

Let’s see. The pain point is the hard-coded column list that’s passed to Table.ReorderColumns. We’re M code writers. Why don’t we use code to dynamically compute that list and perform the reorder?! We could craft a function that takes a list of just those columns we want leftmost, which then dynamically fetches the table’s current column list and adjusts its order appropriately before passing the result to Table.ReorderColumns.

Something like the below (which includes the bonus feature of also supporting rightmost relative ordering):

let
  // Moves the named columns to the far left (and, optionally, the far right).
  // All other columns keep their existing relative order in between.
  Function = 
    (data as table, columnsToOrderLeft as list, optional columnsToOrderRight as list) as table => 
    let
      CurrentOrder = Table.ColumnNames(data),
      ReorderLeft = columnsToOrderLeft,
      // Treat an omitted right-hand list as empty.
      ReorderRight = columnsToOrderRight ?? {},
      // Whatever wasn't explicitly positioned stays in its current order.
      OrderedColumnsRemoved = List.RemoveItems(CurrentOrder, ReorderLeft & ReorderRight),
      NewOrdering = ReorderLeft & OrderedColumnsRemoved & ReorderRight,
      Reordered = Table.ReorderColumns(data, NewOrdering)
    in
      Reordered,
  // Narrows the list parameters to lists of text and attaches documentation.
  FunctionType = 
    type function 
      (
          data as table,
          columnsToOrderLeft as (type {text}), 
          optional columnsToOrderRight as (type {text})
      ) 
      as table
      meta [
        Documentation.Name = "TableRelativeReorderColumns", 
        Documentation.LongDescription = "Returns a table from the input <code>table</code>, with the columns in <code>columnsToOrderLeft</code> appearing leftmost in the order given and the columns in <code>columnsToOrderRight</code> appearing rightmost in the order given. Other columns will not be reordered."
      ],
  Ascribed = Value.ReplaceType(Function, FunctionType)
in
  Ascribed

No more need for a hard-coded list of all column names. No more of the code clutter that MissingField.Ignore leaves behind when a column is removed. Columns stay in the expected relative order even when new columns are added.

let
  TableRelativeReorderColumns = (code from above),
  Source = ...,
  Reordered = TableRelativeReorderColumns(Source, {"ID"})
in
  Reordered

Hope this helps!

Custom Navigation Property Name Generators

September 29, 2021

Several data connectors allow you to control the names assigned to relationship columns. Defining a custom relationship column naming format is easy. Ensuring that the generated names do not conflict with existing column names is trickier. Let’s look at how to do both.

Continue reading →

Relationship Columns and Their Names

September 24, 2021

Have you ever stopped to think about relationship columns: how they work, when they’re automatically added, and in particular how they’re named?

On that last point: Did you know there is a latent danger where seemingly unrelated changes can break existing M code?

What Is a Relationship Column?

In a nutshell, a relationship column is an automatically added nested join between the table you’re working with and a related table. In the relationship column, for each row, there’s a nested table containing the associated rows from the related table. Thanks to M’s laziness, if the nested join isn’t used, fetching the related table’s row data will be skipped—so the presence of a relationship column whose values are unneeded does not incur an appreciable cost.

Continue reading →

M Language Proposal: Cleaning Up Function Chains with the Pipeline Operator

September 2, 2021

Sometimes, a chain of M function calls reads as a dense blob of code, yet refactoring to the clearer structure of a let expression is overkill. Let’s look at an alternative, a new operator to consider for inclusion in the M language.

The Problem

You’d like to add a column to your customers table that holds the average amount of the given customer’s three largest completed orders. The needed data is already available in the table, thanks to a nested orders table. All that’s needed is for you to define logic that uses this data to calculate the desired average.

To pull this off, your new column’s logic needs to:

Filter the nested orders table to Status = "Completed".
Sort by Total, descending.
Take the top 3 results.
Average their Totals.
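Written as a plain chain of nested calls using today’s M (no new operator), the four steps above might look like the following sketch. The `Orders` table is hypothetical sample data standing in for one customer’s nested orders table; note how the calls read inside-out, which is the density problem being described.

```powerquery
let
  // Hypothetical nested orders table for a single customer.
  Orders = #table(
    type table [Status = text, Total = number],
    {
      {"Completed", 100},
      {"Completed", 250},
      {"Pending", 900},
      {"Completed", 50},
      {"Completed", 400}
    }
  ),
  // Read from the innermost call outward: filter, sort, take 3, average.
  Result = List.Average(
    Table.FirstN(
      Table.Sort(
        Table.SelectRows(Orders, each [Status] = "Completed"),
        {{"Total", Order.Descending}}
      ),
      3
    )[Total]
  )
in
  Result // 250: the average of 400, 250 and 100
```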

Continue reading →

Power Query M Primer (Part 22): Identifier Scope II – Controlling the Global Environment, Closures

September 1, 2021

As we learned last time, normally, M code is evaluated in a global identifier resolution scope consisting of all shared members + the standard library. Also, normally, we can’t inject additional identifiers into this global environment. Normally isn’t always. Today, we learn about the exception: where both of these normalities do not apply.
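One place where the evaluation environment is supplied explicitly is Expression.Evaluate, so it makes a convenient illustration here. In this sketch, which assumes that is the kind of exception being alluded to, the evaluated text can see only the identifiers passed in the environment record. The name `greeting` is a hypothetical identifier.

```powerquery
let
  // Only the identifiers listed in the record are visible to the expression.
  Environment = [Text.Upper = Text.Upper, greeting = "hi"],
  Result = Expression.Evaluate("Text.Upper(greeting)", Environment)
in
  Result // "HI"
```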

That’s not all: Did you know that M has a mechanism for remembering how to access variables that later go out of scope? Closures open up powerful options, particularly when generating functions…and even enable building an object-like programmatic construct that maintains internal private state and is interacted with through a public interface (kind-of, sort-of somewhat like an object from object-oriented programming!).
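A minimal sketch of a closure, using hypothetical names: the inner function keeps access to `n` even after the call to `MakeAdder` that defined it has returned.

```powerquery
let
  // Returns a function that remembers the value of n it was created with.
  MakeAdder = (n as number) as function => (x as number) as number => x + n,
  AddFive = MakeAdder(5)
in
  AddFive(10) // 15
```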

Continue reading →

M Mysteries: The Mysterious Type Action—An M-Internal Means to Write Data Modifications to External Systems

August 19, 2021

Power Query is great for reading, combining and computing data, but it’s not meant for writing data modifications—like inserts, updates or deletes—back to the source. Correct?

Yes and no.

What?!

If you are a non-internal user, then yes, Power Query is intended to be read-only: it does not expose functionality meant for inserting, updating or deleting data on remote systems. But this doesn’t mean Power Query lacks this capability: on the contrary, it has hidden, internal support for performing data modifications!

Continue reading →

Dynamic, Lazy Records

August 9, 2021

In the below record, when is Amount’s value calculated?

[
  FieldA = …,
  Amount = ExpensiveToCompute("some", "arguments"),
  …
]

Only if needed. Why? Power Query’s record field values are lazily evaluated. A field’s expression is only evaluated if its value is needed. If it’s not, the cost of computing the value isn’t expended. Nice!
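A quick way to see this laziness for yourself is to give a field an expression that raises an error and then never access it. The field names here are hypothetical.

```powerquery
let
  Sample = [
    FieldA = 1,
    // Never evaluated below, so the error is never raised.
    Amount = error "Amount was evaluated!"
  ]
in
  Sample[FieldA] // 1
```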

Let’s say, instead, you’d like to dynamically add Amount to an existing record. Is something like the following effectively equivalent to the above?

let
  SomeExistingRecord = [
    FieldA = …,
    …
  ]
in
  Record.AddField(
    SomeExistingRecord, 
    "Amount", 
    ExpensiveToCompute("some", "arguments")
  )

No! Whoa! Amount’s laziness went goodbye! Above, ExpensiveToCompute("some", "arguments") is executed whether or not Amount’s value is ever needed.
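One possible way to keep the laziness, sketched here on the assumption that Record.AddField’s optional fourth argument (delayed) behaves as documented, is to pass a zero-argument function and set delayed to true. `ExpensiveToCompute` is a hypothetical stand-in that raises an error, so any eager evaluation would be obvious.

```powerquery
let
  // Hypothetical stand-in: raising an error makes eager evaluation visible.
  ExpensiveToCompute = (a as text, b as text) as number =>
    error "ExpensiveToCompute was invoked!",
  SomeExistingRecord = [FieldA = 1],
  // With delayed = true, the value is given as a function that should
  // be invoked only when the field is actually accessed.
  WithAmount = Record.AddField(
    SomeExistingRecord,
    "Amount",
    () => ExpensiveToCompute("some", "arguments"),
    true
  )
in
  // If Amount is truly lazy, reading FieldA should not trigger the error.
  WithAmount[FieldA]
```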

Continue reading →

Equals Is Not Always Equivalent: Power Query Joins vs. SQL Joins

August 6, 2021

Take the following M expression:

Table.Join(A, "ID", B, "ID", JoinKind.LeftOuter)

Does it behave like the below SQL (which is how a join between two tables on column ID would typically be coded in the database world)?

FROM A
  LEFT JOIN B ON A.ID = B.ID

Perhaps surprisingly, no—at least, not when the simple, innocent null is involved.
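A small sketch for experimenting with the null case, using hypothetical in-memory tables. The key column in `B` is named "BID" only because Table.Join does not allow the result to contain duplicate column names. In M, null = null evaluates to true, so when this join is evaluated locally the row with the null key should pick up the "null key" note, whereas the SQL above would leave B’s columns NULL for that row.

```powerquery
let
  // Hypothetical sample data; both tables contain a null key.
  A = #table({"ID"}, {{1}, {null}}),
  B = #table({"BID", "Note"}, {{1, "one"}, {null, "null key"}}),
  Joined = Table.Join(A, "ID", B, "BID", JoinKind.LeftOuter)
in
  Joined
```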

Continue reading →

Equals Is Not Always Equivalent: When Query Folding Does Not Produce Identical Results

July 23, 2021

Query folding is supposed to be transparent, as far as results go. Whether or not a Power Query expression is folded should have no effect on the data returned. You should receive back identical results either way. At least, that’s the theory.

Unfortunately, this is not always the case!

The fact that query folding sometimes changes the results that are returned can bite unexpectedly. You have an M expression that produces exactly what you want. Then you make what should be an innocuous edit, but behind the scenes the change affects whether or how the query is folded. The results you now receive back are no longer what you expect, and puzzlingly the divergence seems to have no obvious relation to your edit. Or maybe you didn’t edit anything at all: instead, a Power Query update changed the foldability of your query without you touching it. You made no changes, yet the data returned is now different.

Continue reading →