Data Mashup Explorer allows you to explore the Power Query mashups and supporting details contained in a Microsoft Excel or Microsoft Power BI file. Primarily, this tool is useful for learning about the little-known but non-secret file format used by these Microsoft applications to save mashups.
A notification pops up on your system: you have a new, red exclamation point, high-priority email. Before you can even read it, your phone rings. It’s your boss. “I just emailed you. We just discovered that column X in our database contains restricted PII that we can’t be storing. The DBAs are deleting the column as we speak. Please be sure it is gone from reports and your system immediately!” You open Microsoft Power BI Desktop, hit refresh, and the column disappears from the dataset because it has been removed from the server; then you save the file. But is the column really gone, or could there be remnants of its data left on your system?
A prospective client asks you to look at a PBIX file. When you’re done, to minimize liability you don’t want someone else’s data or credentials left sitting on your system, so you delete the file. Are its contents and associated credentials really all gone?
A coworker needs advice on a report and hands it to you on a thumb drive, with the file marked read-only. You open the PBIX file directly from the external drive, give your thoughts, then close the file without saving it. Since the drive was external and the file read-only, the file was contained exclusively on the flash drive. Or was it?
Whether for security purposes, regulatory compliance, good business practice or liability management, it is good to have a sense of whether, and if so how widely, Power BI spreads report data across your system.
In the query editor, ever notice that the column data type menu includes four options for numbers: Decimal Number, Currency, Whole Number and Percentage? In this series, we’ve only talked about one numeric type: type number. Are there types we’ve missed?
Behind the scenes, menu item Decimal Number maps to type number, Currency to Currency.Type, Whole Number to Int64.Type and Percentage to Percentage.Type. If you look at the names defined in your Power Query environment, you’ll likely see a host of other “Type” names, including Int8.Type, Int16.Type, Int32.Type, Single.Type and Double.Type. What are all these “Type” names—even more types we have yet to cover?!
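To make the menu-to-name mapping concrete, here is a minimal sketch of the kind of step that applies these “Type” names to columns. The table and its column names (Quantity, Price, Discount) are made up for illustration; only the type names come from the mapping described above.

```powerquery
let
    // Hypothetical sample table; column names are assumptions for this sketch
    Source = #table(
        {"Quantity", "Price", "Discount"},
        {{3, 19.99, 0.1}, {5, 4.5, 0.25}}
    ),
    // Whole Number -> Int64.Type, Currency -> Currency.Type, Percentage -> Percentage.Type
    Typed = Table.TransformColumnTypes(
        Source,
        {
            {"Quantity", Int64.Type},
            {"Price", Currency.Type},
            {"Discount", Percentage.Type}
        }
    )
in
    Typed
```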
Power Query’s type system helps us understand the values we work with, allows us to describe the kinds of data expected by functions we write, offers a means to share documentation (which IntelliSense can display) and provides a mechanism to inform consumers about the structure of the data we’re handing them (which the host environment can use, for example, to guide Power BI to correctly set column types).
To sum up M’s approach to types at a high level:
Every value has a type. A type is itself a value. Types are used to classify values.
A simple statement, but with a lot packed into it and behind it. So much, in fact, that we’ll explore Power Query’s type system over several posts. Today, we’ll start with the basics, centered around what are known as nullable primitive types. Later, as we get deeper in, hold on to your hat: you might find a major puzzling surprise, where the type system doesn’t work the way you’d expect.
Let’s start delving into the type system by examining the summary statement we read a moment ago.
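As a small illustration of that summary statement, the sketch below asks a value for its type, asks that type for its own type, and uses types to classify values. The step names are arbitrary choices for this example.

```powerquery
let
    Sample = 42,
    // Every value has a type: ask the value for it
    TypeOfSample = Value.Type(Sample),
    // A type is itself a value, so it has a type too
    TypeOfType = Value.Type(TypeOfSample),
    // Types classify values: "is" tests a value against a nullable primitive type
    SampleIsNumber = Sample is number,
    SampleIsText = Sample is text,
    // A nullable primitive type also accepts null
    NullFitsNullableNumber = null is nullable number,
    // The type we got back is classified by type type
    TypeIsAType = Value.Is(TypeOfType, type type)
in
    [
        SampleIsNumber = SampleIsNumber,
        SampleIsText = SampleIsText,
        NullFitsNullableNumber = NullFitsNullableNumber,
        TypeIsAType = TypeIsAType
    ]
```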
Your Power Query is skipping merrily along its mashup way. Then, bam! Something bad happens! Uh oh! What do you do when an error raises its ugly head? Or, for that matter, what if code you write detects an anomaly and you want to announce this fact in an informative manner?
Thankfully, M has error handling capabilities, allowing you to both raise and handle runtime errors. We’ll learn how to do both.
Important: If you’re familiar with the idea of an exception from other programming languages, Power Query’s error handling is different in at least one significant respect from what you may be familiar with.
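For a first taste of both halves, here is a minimal sketch that raises an error and then handles it. SafeDivide is a hypothetical helper invented for this example, as are the reason and message strings.

```powerquery
let
    // Hypothetical helper: raises an error instead of dividing by zero
    SafeDivide = (numerator as number, denominator as number) as number =>
        if denominator = 0 then
            error Error.Record(
                "DivideByZero",
                "Denominator must not be zero",
                [Numerator = numerator]
            )
        else
            numerator / denominator,
    // Handle the error by substituting a fallback value
    WithFallback = try SafeDivide(10, 0) otherwise null,
    // Or capture the outcome as a record and inspect it
    Attempt = try SafeDivide(10, 0)
in
    [
        WithFallback = WithFallback,
        HadError = Attempt[HasError],
        Reason = Attempt[Error][Reason]
    ]
```

Without `otherwise`, `try` returns a record describing the outcome, which lets you decide what to do based on the details of the error.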
Nope. That’s not a typo in the title. In the Power Query world, there aren’t control structures (plural); there’s just one control structure (singular). We’re about to examine its simplicity. As to the “missing” control structures (which you may be used to from other programming languages), we’ll explore ways of implementing similar functionality the M way.
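As a rough sketch of the idea, the single control structure is `if ... then ... else`, and loop-like work can be expressed with standard library functions. The functions chosen below (List.Transform and List.Accumulate) are just two possible stand-ins, and Classify is a hypothetical name.

```powerquery
let
    // The one control structure: if ... then ... else (which can be chained)
    Classify = (n as number) as text =>
        if n < 0 then "negative"
        else if n = 0 then "zero"
        else "positive",
    // In place of a "for each" loop: apply a function to every list item
    Labels = List.Transform({-1, 0, 5}, Classify),
    // In place of a loop with a running total: fold the list into one value
    Total = List.Accumulate({1, 2, 3, 4}, 0, (state, current) => state + current)
in
    [Labels = Labels, Total = Total]
```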
Last time, we began exploring how Power Query “thinks” about tables, delving more deeply into streaming and query folding. This time, we’ll continue building our understanding of how tables are processed by learning about keys, native query result caching and the data protection layer (firewall). We’ll also explore why native queries may be executed more times than you might expect.
The goal of these two posts is to equip you with a better understanding of the context in which your mashups are executed: knowledge you can use to author more efficient M queries, avoid unexpected data changes during processing and keep the data protection layer (firewall) happy.
Why should you concern yourself with how Power Query “thinks” about tables? After all, you write an expression that outputs the table you want, the mashup engine executes it and everyone is happy without you having to think about how the engine does its thing…right? Yes—at least until you encounter performance problems, values change during processing or a firewall error bites—then what do you do?
Understanding how M processes tables is an important asset in developing efficient mashups, avoiding unexpected data variability and keeping the data privacy layer happy. Streaming, query folding, buffering, table keys, native query caching and the firewall—all of these relate to how the interpreter thinks about/processes/handles tables.
There’s so much to cover, we’ll split the list in two. Let’s tackle the first half (streaming, query folding and buffering) in this post and save the remainder (table keys, native query caching and the firewall) for next time.
If you’re like most users, tables are the main reason you use Power Query. You ask Power Query to pull data from one or more sources, mash it up and then return the results in a table which is then handed off to the host application (Microsoft Excel, Power BI, SSAS, SSIS, etc.).
As you might expect, with tables being of such fundamental importance, there’s plenty we could explore. A whole series could be written about the standard library’s table functions. While that would be fun to do, the focus of this series is the M language, not the library, so things related to the language are what we’ll delve into here.
The previous post in this series concluded with the thought that tables are similar in behavior to lists and records but go beyond what we’d get if we tried to build our own simulated tables using those two types. Let’s start by looking at a way in which table and list are similar….
You might be inclined to skip learning about list and record and jump directly to table. After all, as far as types that hold other values go, you mostly work with table. A table’s data is made up of simple values, so it makes sense that you learned about them (parts 6, 7, 8 and 9). However, list and record might feel like types you won’t normally use, so why not skip them?
Turns out, a table behaves in ways like both a list and a record. If you want to leverage all that table offers, learning about these types is foundational. Besides, as you work with M, you just might find that list and record are more useful in their own right than you first thought.
Oh, did I mention that there’s a surprising relationship between let expressions and type record? All the more reason not to skip over it!
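To hint at both points, here is a small sketch. It shows a let expression next to a record that computes the same thing, then a table being accessed with list-style and record-style syntax. The Sales table and its columns are made up for illustration.

```powerquery
let
    // A let expression and a record with the same fields yield the same result
    ViaLet = let a = 1, b = a + 1 in b,
    ViaRecord = [a = 1, b = a + 1][b],
    // Hypothetical sample table
    Sales = #table({"Region", "Amount"}, {{"East", 100}, {"West", 250}}),
    // List-like: item access by position returns a row (as a record)
    FirstRow = Sales{0},
    // Record-like: field access by name returns a column (as a list)
    Amounts = Sales[Amount]
in
    [ViaLet = ViaLet, ViaRecord = ViaRecord, FirstRow = FirstRow, Amounts = Amounts]
```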