Exploring Power Query Buffering: How Table.Buffer and List.Buffer Work

, , ,

Table.Buffer and List.Buffer buffer data—but how do they work? How deeply do they buffer? How do they handle errors? And, for that matter, when do they populate?

In a nutshell: Table.Buffer creates a stable copy* of a table’s rows. These rows are fetched only once from the source, regardless of how many times they are read from the buffer. Each time the buffer is accessed, the same* rows are returned in the same order. With List.Buffer, the behavior is identical, except it is list items that are buffered instead of table rows. (*But this may not mean what you think, so keep reading.)

The details have some nuances to them. Let’s explore them, as well as what happens when errors are encountered.

Continue reading

Power BI’s Pseudo-Time & Pseudo-Date Types

, ,

Once upon a time, I was bit by a time column which straightforward equality-comparison DAX wouldn’t filter as expected. From this saga, I learned a couple important lessons about Power BI’s data types that are not intuitively obvious from the user interface.

Imagine some simple DAX like the below. When evaluated, the result I’d receive back was blank—even though rows with the matching time could be plainly seen in Power BI’s table view.

CALCULATE (
  COUNTROWS ( Data ),
  Data[Timestamp] = TIME (12, 30, 0)
)
Table view showing value "12:30:00 PM" twice.

Why this strange behavior? Turns out, Power BI’s UI was not displaying the full data values in the column. When viewed in the UI’s table view, it looked like matching values existed in the column, but actually they didn’t. Comparisons were being performed correctly, just not against the time-only values I thought were there.

Continue reading

Time Values from Dataflow to Power BI via DirectLake

, ,

Seems simple enough: The table, produced by your Power BI gen2 dataflow, contains a time column. You want to output this table to OneLake, then have Power BI read from it using a DirectLake-powered semantic model.

Sounds easy, but not so fast! When you try to set up the dataflow’s output mappings, columns of type time are not supported for OneLake destinations. Ouch!

So, what do you do?

Try leveraging the fact that Power BI’s time values are, behind the scenes, actually datetime values. For proper “type time” behavior, Power BI expects that the date part of each datetime value is set to Power BI’s epoch date (the date it considers day 0, which is 1899-12-30) and it also needs to know that the column is expected to have time behaviors.

Continue reading

Render Tables, Lists, Records -> Text

, ,

Power Query’s simple scalar values—like date, number and logical—can easily be converted to strings. Each has a corresponding type-specific “ToText” method (like Date.ToText or Number.ToText). The generic Text.From can also be used.

But what if you want to render a table, list or record textually? There is no built-in way to convert values of these types directly to text.

However, you can convert them to JSON…and then render that JSON as text!

(input as any) as nullable text =>
  if input = null
  then null
  else Text.FromBinary(Json.FromValue(input))

Handy to render out a complex, nested structure so that you can see all of it at once!

Continue reading

Privacy Levels in Dataflows: Impacted by Staging?

, , ,

Staging data is common in the world of Microsoft Power BI dataflows. Computed entities, linked entities and Microsoft Fabric’s “enable staging” options all result in intermediate output being staged to disk and/or a database.

Does this staging affect privacy levels? Yes! It can change which privacy level is being applied.

Continue reading

Privacy Levels in Dataflows: Click to Continue? (Or Not!)

, , ,

If you’ve built a Power BI dataflow that combines between sources, most likely you’ve been stopped by a prompt asking if you want to “continue” because there is a risk that data could be revealed from one source to another.

Screenshot of prompt:
The evaluation was canceled because combining data from multiple sources may reveal data from one source to another. Click Continue if the possibility of revealing data is okay. 

[Continue button]

The prompt’s wording makes it sound like you must choose “continue” in order to be able to use dataflows to output data derived from more than one data source—but is continuing truly mandatory?

Dataflow's options dialog showing privacy setting "Allow combing data from multiple sources. This could expose sensitive or confidential data to an unauthorized person" checked.

The seeming necessity of enabling this option is reinforced by how the corresponding setting appears in the dataflow’s Options dialog. Clicking “continue” in the above prompt sets this checkbox. Its wording implies that it must be checked in order for Power Query to be able to combine between multiple data sources: If you don’t check it, you won’t be allowed to combine data from more than one source—or so it (incorrectly) seems.

Thankfully, in most cases, you do no need to enable this option in order to combine between sources.  

Continue reading

RowExpression.From/ItemExpression.From

, , , ,

Power Query’s RowExpression.From/ItemExpression.From function (both names reference the same underlying function) provides a way to learn about what a single-parameter function does by outputting an abstract syntax tree (AST) describing it.

Why might you want to use a programmatic structure like an AST to analyze the logic of a function’s body instead of simply invoking the function?

Well, one reason may be that you are implementing query folding in a custom connector. You might want to translate the filter predicate function passed to Table.SelectRows, or the generator function passed to Table.AddColumn, into the upstream data source’s native query/language. In either case, you don’t want to invoke the passed-in function; instead, you want to understand its behavior so that you can factor it in as you build an equivalent native request/query. RowExpression.From/ItemExpression.From is tailored for this purpose.

Unfortunately, this function is little documented—but it is time for that to change!

(Note: For simplicity, the below will refer to this function by the name RowExpression.From. However, ItemExpression.From is an equally valid way to reference the function.)

Continue reading

On-Premises Data Gateway: Not Just for Private Resource Access

, ,

The on-premises data gateway isn’t just for enabling Power BI Service cloud-hosted components to pull data from private local network resources.

True, Microsoft’s short description of the gateway says that it is “a bridge” which “provides quick and secure data transfer between on-premise data, which is data that isn’t in the cloud, and several Microsoft cloud services.” However, beyond that description, there are at least three additional key uses for this gateway which are of particular pertinence in the Power BI world.

Continue reading

Lazy, Streamed, Immutable: Try Building a Table

, , , , ,

Lazy evaluation, streaming and immutability are key Power Query concepts which must be understood to truly grasp how Power Query “thinks.” Want to test your understanding in these foundational areas—and try to grow it—by tackling an assignment?!

Your Task

Code up a single row table containing the following columns:

  • A column of hard-coded data.
  • A couple columns whose data is fetched from an API.
  • A couple columns whose data is fetched from another API.
Example of expected table

Where:

  • Neither API is called if the table’s rows are simply counted.
  • Only the first API is called if the columns containing data from the second API are removed from the table, and vice versa.
  • Each API is called at most once, even when multiple columns that contain data from that API are output.

Without:

  • Using any standard library table functions to build the table (i.e. no “Table.*” functions may be used).
Continue reading