Staging data is common in the world of Microsoft Power BI dataflows. Computed entities, linked entities and Microsoft Fabric’s “enable staging” options all result in intermediate output being staged to disk and/or a database.
Does this staging affect privacy levels? Yes! It can change which privacy level is being applied.
If you’ve built a Power BI dataflow that combines between sources, most likely you’ve been stopped by a prompt asking if you want to “continue” because there is a risk that data could be revealed from one source to another.
The prompt’s wording makes it sound like you must choose “continue” in order to be able to use dataflows to output data derived from more than one data source—but is continuing truly mandatory?
The seeming necessity of enabling this option is reinforced by how the corresponding setting appears in the dataflow’s Options dialog. Clicking “continue” in the above prompt sets this checkbox. Its wording implies that it must be checked in order for Power Query to be able to combine between multiple data sources: If you don’t check it, you won’t be allowed to combine data from more than one source—or so it (incorrectly) seems.
Thankfully, in most cases, you do no need to enable this option in order to combine between sources.
Power Query’s RowExpression.From/ItemExpression.From function (both names reference the same underlying function) provides a way to learn about what a single-parameter function does by outputting an abstract syntax tree (AST) describing it.
Why might you want to use a programmatic structure like an AST to analyze the logic of a function’s body instead of simply invoking the function?
Well, one reason may be that you are implementing query folding in a custom connector. You might want to translate the filter predicate function passed to Table.SelectRows, or the generator function passed to Table.AddColumn, into the upstream data source’s native query/language. In either case, you don’t want to invoke the passed-in function; instead, you want to understand its behavior so that you can factor it in as you build an equivalent native request/query. RowExpression.From/ItemExpression.From is tailored for this purpose.
Unfortunately, this function is little documented—but it is time for that to change!
(Note: For simplicity, the below will refer to this function by the name RowExpression.From. However, ItemExpression.From is an equally valid way to reference the function.)
The on-premises data gateway isn’t just for enabling Power BI Service cloud-hosted components to pull data from private local network resources.
True, Microsoft’s short description of the gateway says that it is “a bridge” which “provides quick and secure data transfer between on-premise data, which is data that isn’t in the cloud, and several Microsoft cloud services.” However, beyond that description, there are at least three additional key uses for this gateway which are of particular pertinence in the Power BI world.
Lazy evaluation, streaming and immutability are key Power Query concepts which must be understood to truly grasp how Power Query “thinks.” Want to test your understanding in these foundational areas—and try to grow it—by tackling an assignment?!
Your Task
Code up a single row table containing the following columns:
A column of hard-coded data.
A couple columns whose data is fetched from an API.
A couple columns whose data is fetched from another API.
Where:
Neither API is called if the table’s rows are simply counted.
Only the first API is called if the columns containing data from the second API are removed from the table, and vice versa.
Each API is called at most once, even when multiple columns that contain data from that API are output.
Without:
Using any standard library table functions to build the table (i.e. no “Table.*” functions may be used).
If you work in the advanced realm of ascribing types, there are a couple interesting behaviors to be aware of related to table column renames (including a bug!).
Positional, Not Name-Based
Imagine you want to set column type claims on a table, so you create a query that uses Value.ReplaceType to ascribe an appropriate new table type.
// BaseDataSet
let
// in real life, comes from a database
Data = #table(
{"Amount", "TransactionID"},
{
{100.25, 1},
{32.99, 2}
}
)
in
Data
// MyNicerTable
let
Source = BaseDataSet,
Result = Value.ReplaceType(Source, type table [Amount = Currency.Type, TransactionID = Int64.Type])
in
Result
So far, so good.
Later on, someone decides that the ID column should be moved to be leftmost, so they reorder columns by editing BaseDataSet. However, they don’t touch your MyNicerTable query with its Value.ReplaceType code. Look closely at what that expression now outputs:
The column that contains transaction IDs is now named “Amount” and typed Currency.Type. Similarly, “Amount” values now show up under the column name “TransactionID” which is typed as whole number. Ouch!
Numbers and dates are formatted differently in different parts of the world. How are these cultural differences handled in the realm of Power Query? Turns out, arguably, there can be not just one—but two—sets of rules in play.
In the M language, numbers and date/time-based values are natively stored in culture-agnostic formats. It doesn’t matter what part of the world you’re in or how it formats values, when #date(2023, 1, 23) is evaluated, Power Query understands that the referenced year is 2023, the month is 1 and the day is the 23rd—and it maintains this understanding throughout the value’s lifetime. Similar holds true with the other date/time types, as well as with numbers.
On the other hand, when converting values of these types to or from text, culture does come into play—but which culture?
Number.ToText(123456.78, "n")
// outputs:
// 123,456.78 (if the culture is en-US)
// 123.456,78 (if the culture is es-ES)
// 123 456,78 (if the culture is se-SE)
The task seemed simple enough: Determine the earliest date between two columns, so that this value could be used when generating a date dimension.
A little M code that took the minimum of each column, then the minimum of those minimums seemed to meet the need until—ouch—I discovered it wasn’t finding the earliest date. Why?
let
OrderCreatedMin = List.Min(SomeTable[OrderCreated]),
TransactionReceivedMin = List.Min(SomeTable[TransactionReceived]),
OverallMin = Date.From(List.Min({ OrderCreatedMin, TransactionReceivedMin }))
in
OverallMin
What do you think? Do you see any problems with the above?
During 2022, the Power QueryM Language Specification received seven substantiative revisions (beyond typo fixes, formatting tweaks, and the such). Each brought clarification to ambiguous points or corrected cases where the specification did not align with actual mashup engine behavior. None of the revisions added new language functionality or otherwise resulted in the mashup engine changing.
Interestingly, while the M language gained two new features last year (try‘s catch and structured error messages), neither of these has yet to make it into the language spec.
To the average Power Query user, how the standard library and data connectors end up in the global environment may be irrelevant. What matters is that, however they get there, they’re there and they work! But in the world of advanced M development, how identifiers come to be injected directly into the global environment becomes interesting. Of particular pertinence is the extension/module system that plays a pivotal role in part of this process.
Welcome to a new world: extending the global environment, here we come!