M Language Proposal: Cleaning Up Function Chains with the Pipeline Operator


Sometimes a chain of M function calls reads as a dense blob of code, yet refactoring it into the clearer structure of a let expression is overkill. Let’s look at an alternative: a new operator to consider for inclusion in the M language.

The Problem

You’d like to add a column to your customers table that holds the average amount of the given customer’s three largest completed orders. The needed data is already available in the table, thanks to a nested orders table. All that’s needed is for you to define logic that uses this data to calculate the desired average.

To pull this off, your new column’s logic needs to:

  1. Filter the nested orders table to Status = "Completed".
  2. Sort by Total, descending.
  3. Take the top 3 results.
  4. Average their Totals.

Not too hard to pull off:

List.Average(
  Table.FirstN(
    Table.Sort(
      Table.SelectRows(_[Orders], each [Status] = "Completed"),
      { "Total", Order.Descending }
    ),
    3
  )[Total]
)

From a technical perspective, writing this logic as a chain of function calls works just fine. However, the resulting code is dense. Making sense of it takes careful reading: the reader first needs to find the innermost function call, which is where the action starts, then work outward one call at a time, being sure to mentally pair each set of arguments with the correct function invocation. The flow can be hard to follow, and the argument-to-function pairings are easy to get wrong.

Refactoring the above to use a let expression clarifies how the logic reads:

let
  CompletedOrders = Table.SelectRows(_[Orders], each [Status] = "Completed"),
  SortedByTotal = Table.Sort(CompletedOrders, { "Total", Order.Descending }),
  Top3Largest = Table.FirstN(SortedByTotal, 3)[Total],
  AverageOf3Largest = List.Average(Top3Largest)
in
  AverageOf3Largest

But is a let expression really necessary here? The variables it introduces aren’t needed for logic reuse or for immutability’s sake. In fact, using them required defining a name for each, which in a scenario like this arguably introduces its own sort of clutter.

let expressions and the variable definitions they allow are great in many circumstances. In no way am I suggesting we make it a general practice to avoid them. However, in simple function-chaining scenarios, they can sometimes be overkill.
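To try the nested-call version outside the Add Column dialog, it can be wrapped in a standalone query. The following is a minimal sketch: the Customers table, its Customer column, and all of the sample values are made up for illustration; only the Orders, Status, and Total names come from the example above.

```powerquery
let
  // Hypothetical sample data: two customers, each with a nested Orders table.
  Customers = #table(
    { "Customer", "Orders" },
    {
      { "A", #table(
          { "Status", "Total" },
          { { "Completed", 100 }, { "Completed", 50 }, { "Cancelled", 900 },
            { "Completed", 30 }, { "Completed", 20 } }
        ) },
      { "B", #table(
          { "Status", "Total" },
          { { "Completed", 10 }, { "Pending", 500 } }
        ) }
    }
  ),

  // Same logic as the nested chain above, applied once per customer row.
  // The outer `each` binds _ to the customer row; the inner `each` binds
  // its own _ to an order row.
  WithAverage = Table.AddColumn(
    Customers,
    "AverageOf3Largest",
    each List.Average(
      Table.FirstN(
        Table.Sort(
          Table.SelectRows(_[Orders], each [Status] = "Completed"),
          { "Total", Order.Descending }
        ),
        3
      )[Total]
    )
  )
in
  WithAverage
```

With this sample data, customer A’s three largest completed orders are 100, 50 and 30, so the new column holds 60; customer B has a single completed order, so it holds 10.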

An Alternative

As an alternative to the preceding examples, what do you think of the following?


_[Orders]
|> Table.SelectRows(each [Status] = "Completed")
|> Table.Sort({ "Total", Order.Descending })
|> Table.FirstN(3)[Total]
|> List.Average()

In this not-currently-valid M code, we “borrow” the idea of reverse function application, specifically F#’s pipeline operator, “|>” (which is approximately equivalent to Haskell’s Data.Function operator “&”).

In short, |> takes the output from what comes before it and passes it in as the first argument to the function that comes after it.

So

_[Orders]
|> Table.SelectRows(each [Status] = "Completed")

is equivalent to:

Table.SelectRows(_[Orders], each [Status] = "Completed")

The difference is that, thanks to |>, we can write our chain of function calls in linear, first-to-last order, instead of as a nested chain of invocations or using a let expression!

Variation

Instead of the proposed M pipeline operator passing whatever is on its left as the first argument to the function on its right, it could instead be defined to take whatever is on its left and assign it to a special variable which the expression on the right can reference, if and where it chooses.

If we used “!” as that variable (strictly for illustrative purposes; I’m not married to it being the variable of choice), this would look like:

_[Orders]
|> Table.SelectRows(!, each [Status] = "Completed")
|> Table.Sort(!, { "Total", Order.Descending })
|> Table.FirstN(!, 3)[Total]
|> List.Average(!)
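The explicit-placeholder form is close to something that can already be written in today’s M by folding a list of single-argument functions over a starting value. The sketch below assumes a helper named Pipe, which is a hypothetical name and not part of the standard library, and uses made-up sample data in place of a customer’s nested Orders table.

```powerquery
let
  // Hypothetical helper, not a standard library function: passes `start`
  // through each single-argument function in `steps`, in list order.
  Pipe = (start as any, steps as list) as any =>
    List.Accumulate(steps, start, (state, step) => step(state)),

  // Made-up sample data standing in for one customer's nested Orders table.
  Orders = #table(
    { "Status", "Total" },
    { { "Completed", 100 }, { "Cancelled", 900 }, { "Completed", 50 },
      { "Completed", 30 }, { "Completed", 20 } }
  ),

  // Each step names its input explicitly, much like the "!" placeholder.
  AverageOf3Largest = Pipe(
    Orders,
    {
      (t) => Table.SelectRows(t, each [Status] = "Completed"),
      (t) => Table.Sort(t, { "Total", Order.Descending }),
      (t) => Table.FirstN(t, 3)[Total],
      List.Average
    }
  )
in
  AverageOf3Largest  // 60 for the sample data above
```

The (t) => wrappers do the job “!” would do, but they are noisier than an operator would be, which is part of the case for having one built into the language.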

A Penny for Your Thoughts

What do you think? Would you like the option to use the pipeline operator in M? Again, I’m not suggesting that its use should replace let expressions in general; rather, in the case of simple function chains, it could be a useful construct to have available for crafting easy-to-read code.

5 thoughts on “M Language Proposal: Cleaning Up Function Chains with the Pipeline Operator”

  1. Ignacio

    It would be amazing to have the pipe operator. I love both ideas, with or without “!”. The code is so much cleaner and more readable. I wish Microsoft would check this post out and think about it 😛
    Thanks for sharing.

  2. David

    Compared to other languages, M is very difficult to read. This would help immensely in making it easier to parse. The only thing missing then would be regex support!

  3. Kris

    Good call!
    I like the elegance of the R Pipes; %>% in R with magrittr. Changes like this will make me less stubborn to learn another language like M.
