The on-premises data gateway isn’t just for enabling Power BI Service cloud-hosted components to pull data from private local network resources.
True, Microsoft’s short description of the gateway says that it is “a bridge” which “provides quick and secure data transfer between on-premise data, which is data that isn’t in the cloud, and several Microsoft cloud services.” However, beyond that description, there are at least three additional key uses for this gateway which are of particular pertinence in the Power BI world.
Lazy evaluation, streaming, and immutability are key Power Query concepts that must be understood to truly grasp how Power Query “thinks.” Want to test your understanding in these foundational areas, and maybe grow it, by tackling an assignment?
Your Task
Code up a single-row table containing the following columns:
A column of hard-coded data.
A couple of columns whose data is fetched from one API.
A couple of columns whose data is fetched from a second API.
Where:
Neither API is called if the table’s rows are simply counted.
Only the first API is called if the columns containing data from the second API are removed from the table, and vice versa.
Each API is called at most once, even when multiple columns that contain data from that API are output.
Without:
Using any standard library table functions to build the table (i.e. no “Table.*” functions may be used).
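As a warm-up for the lazy-evaluation aspect of this challenge, here is a minimal illustration (not a solution!) of how M defers work: a record field’s expression is only evaluated if and when that field is accessed.

```powerquery
let
    Demo = [
        Cheap = 1,
        // never evaluated below, because this field is never accessed
        Expensive = error "this error is never raised"
    ],
    Result = Demo[Cheap]
in
    Result // 1
```

The same on-demand behavior underpins how a well-built table can avoid calling an API whose columns are never output.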
If you work in the advanced realm of ascribing types, there are a couple of interesting behaviors to be aware of related to table column renames (including a bug!).
Positional, Not Name-Based
Imagine you want to set column type claims on a table, so you create a query that uses Value.ReplaceType to ascribe an appropriate new table type.
// BaseDataSet
let
    // in real life, comes from a database
    Data = #table(
        {"Amount", "TransactionID"},
        {
            {100.25, 1},
            {32.99, 2}
        }
    )
in
    Data
// MyNicerTable
let
    Source = BaseDataSet,
    Result = Value.ReplaceType(Source, type table [Amount = Currency.Type, TransactionID = Int64.Type])
in
    Result
So far, so good.
Later on, someone decides that the ID column should be moved to be leftmost, so they reorder columns by editing BaseDataSet. However, they don’t touch your MyNicerTable query with its Value.ReplaceType code. Look closely at what that expression now outputs:
The column that contains transaction IDs is now named “Amount” and typed Currency.Type. Similarly, “Amount” values now show up under the column name “TransactionID” which is typed as whole number. Ouch!
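One defensive option (a sketch; adapt to your real queries): explicitly reorder the columns by name immediately before ascribing the type, so the positional match performed by Value.ReplaceType lines up with the names in the ascribed type even if the upstream column order changes.

```powerquery
// MyNicerTable, made resilient to upstream column reordering
let
    Source = BaseDataSet,
    // force a known column order so positions match the ascribed type below
    Reordered = Table.ReorderColumns(Source, {"Amount", "TransactionID"}),
    Result = Value.ReplaceType(
        Reordered,
        type table [Amount = Currency.Type, TransactionID = Int64.Type]
    )
in
    Result
```

This does not guard against renames upstream (Table.ReorderColumns will error if a listed column no longer exists), but failing loudly is arguably better than silently mislabeling data.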
Numbers and dates are formatted differently in different parts of the world. How are these cultural differences handled in the realm of Power Query? Turns out, arguably, there can be not just one—but two—sets of rules in play.
In the M language, numbers and date/time-based values are natively stored in culture-agnostic formats. It doesn’t matter what part of the world you’re in or how it formats values: when #date(2023, 1, 23) is evaluated, Power Query understands that the referenced year is 2023, the month is 1 and the day is the 23rd, and it maintains this understanding throughout the value’s lifetime. The same holds true for the other date/time types, as well as for numbers.
On the other hand, when converting values of these types to or from text, culture does come into play—but which culture?
Number.ToText(123456.78, "n")
// outputs:
// 123,456.78 (if the culture is en-US)
// 123.456,78 (if the culture is es-ES)
// 123 456,78 (if the culture is sv-SE)
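Number.ToText accepts an optional culture argument; passing it explicitly removes any ambiguity about which culture’s rules apply, regardless of where the query happens to be evaluated. A quick sketch:

```powerquery
Number.ToText(123456.78, "n", "en-US") // 123,456.78
Number.ToText(123456.78, "n", "es-ES") // 123.456,78
```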
The task seemed simple enough: Determine the earliest date between two columns, so that this value could be used when generating a date dimension.
A little M code that took the minimum of each column, then the minimum of those minimums seemed to meet the need until—ouch—I discovered it wasn’t finding the earliest date. Why?
let
    OrderCreatedMin = List.Min(SomeTable[OrderCreated]),
    TransactionReceivedMin = List.Min(SomeTable[TransactionReceived]),
    OverallMin = Date.From(List.Min({ OrderCreatedMin, TransactionReceivedMin }))
in
    OverallMin
What do you think? Do you see any problems with the above?
During 2022, the Power Query M Language Specification received seven substantive revisions (beyond typo fixes, formatting tweaks, and the like). Each brought clarification to ambiguous points or corrected cases where the specification did not align with actual mashup engine behavior. None of the revisions added new language functionality or otherwise resulted in the mashup engine changing.
Interestingly, while the M language gained two new features last year (try’s catch and structured error messages), neither has yet made it into the language spec.
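For those who haven’t met it yet, here’s a quick sketch of the first of those two features, the catch form of try (syntax per the current mashup engine, not the spec):

```powerquery
let
    // catch receives the error as a record, with fields such as Reason, Message and Detail
    Result = try error "something broke" catch (e) => e[Message]
in
    Result // "something broke"
```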
To the average Power Query user, how the standard library and data connectors end up in the global environment may be irrelevant. What matters is that, however they get there, they’re there and they work! But in the world of advanced M development, how identifiers come to be injected directly into the global environment becomes interesting. Of particular pertinence is the extension/module system that plays a pivotal role in part of this process.
Welcome to a new world: extending the global environment, here we come!
Ever had a syntax error in a .pqm file? Likely, the error’s message described the problem but did not tell you which .pqm file contained the error. Instead, you may have had to peruse the various recently-edited .pqm files in your connector to figure out which one the problem originated from.
Thankfully, there is a way to improve this experience, so that extension module load errors identify the problematic source file!
A challenge with the current Power Query PostgreSQL data connector is that it does not understand how to work with PostgreSQL’s complex typed columns.
Background
Some examples of PostgreSQL’s complex typed columns:
Scenario: PostgreSQL allows a single column to be defined as containing a specific complex type. Example: Column phone_number could be defined as being of type telephone_number (which has fields “areacode”, “number” and “extension”). M Equivalent: Column contains a record.
Scenario: A single column can also be defined to contain an array of values of a specific scalar type. Example: Column visit_dates could be defined as an array of date values. M Equivalent: Column contains a list.
Scenario: A column can be configured to contain an array of a given complex type. Example: Column order_lines could be defined as an array of order_line values. M Equivalent: Column contains a nested table.
Challenge
Power Query does not understand any of these more advanced column setups. Depending on the exact scenario, when PQ encounters one of the above, it either renders the raw textual equivalent of the entire column (like “{}” in column project_contingency_items, below) or an error (such as the below complaint that “We don’t support CLR type ‘System.Dynamic.ExpandoObject’.”).
The net effect is that, for a table or view where these more complex structures are used, hand-written SQL (e.g. Value.NativeQuery) may be necessary, as Power Query may be unable to make “sense” of the relevant columns on its own. This makes the level of effort involved in ingesting this data into PQ much higher.
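As a sketch of that workaround (server, database and table names below are illustrative; the parenthesized field-access syntax is PostgreSQL’s): hand-written SQL can flatten a composite-typed column into plain scalar columns before Power Query ever sees it.

```powerquery
let
    Source = PostgreSQL.Database("localhost", "mydb"),
    // expand the composite-typed phone_number column into scalar columns,
    // so Power Query only receives simple types it understands
    Customers = Value.NativeQuery(
        Source,
        "SELECT id, (phone_number).areacode, (phone_number).number, (phone_number).extension FROM customers"
    )
in
    Customers
```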
You’re authoring in Power Query. You decide that rows with 0001-01-01 in a certain column should be removed, so you filter on the column, excluding 0001-01-01 values. After you apply your filter, nothing changes: the 0001-01-01 rows are still present. What is going on?
This issue bit me recently. Turns out, it’s due to a translation loss when bridging between the worlds of MySQL and Power Query.