Power Query M Primer (Part 25): Extending the Global Environment


To the average Power Query user, how the standard library and data connectors end up in the global environment may be irrelevant. What matters is that, however they get there, they’re there and they work! But in the world of advanced M development, how identifiers come to be injected directly into the global environment becomes interesting. Of particular pertinence is the extension/module system that plays a pivotal role in part of this process.

Welcome to a new world: extending the global environment, here we come!


The Global Environment

First, on the global environment…a recap of some salient facts:

Screenshot showing "queries" list in Query Editor

In Query Editor, the top-level expressions that are defined in the pane to the left are called “queries”—at least, that’s what the UI calls them. Behind the scenes, in the technical realm, each of these “queries” is actually a section member. All of these section members are contained in a section document that Microsoft arbitrarily names “Section1” (for more on sections, see part 21 from this series).

Section members can be marked as “shared”. Each shared section member is added to M’s global environment, so appears in #shared. (In Query Editor, the option to share is not exposed but instead sharing is managed automatically, so whether or not a section member is shared is generally transparent to the average M code author.)

In addition to shared section members, the global environment also can contain identifiers that are directly injected into it (so not shared into the global environment from section documents, but rather directly placed in it by the mashup engine). A principal example is the Power Query standard library, which Microsoft directly adds into M’s global environment.
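To see this for yourself, here is a minimal sketch that peeks into the global environment from consumer code. It relies only on #shared being a record whose field names are the global identifiers; the step names are purely illustrative.

```powerquery
// In a "blank query" in Query Editor
let
  GlobalNames = Record.FieldNames(#shared),
  // Standard library functions sit alongside shared section members.
  HasTextProper = List.Contains(GlobalNames, "Text.Proper"),
  TextFunctionNames = List.Select(GlobalNames, each Text.StartsWith(_, "Text."))
in
  [
    HasTextProper = HasTextProper,
    TextFunctionCount = List.Count(TextFunctionNames)
  ]
```

The count will likely vary with the version of Power Query in play.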

While the M language specification allows the mashup engine to add identifiers directly into the global environment, it does not lay out the specifics on how these identifiers are to be defined and registered in order for that to happen. Instead, these details are left to the creativity of the mashup engine author.

Your Very Own Mashup Engine

To help us understand how extensions work (and some of the whys behind the hows), let’s pretend that you are building your own mashup engine from scratch. That is, using the M language specification as your guide, you are writing code in another programming language (say, C#—but ultimately, the choice is up to you) that will parse and process M code.

As part of your implementation, you’ll need to decide how you will get identifiers directly into the global environment—because the language spec. leaves this “how” up to you to determine.

Hmm…how will you make this happen?

Methods from Another Language?!

For starters, you might create a mechanism enabling methods written in your mashup engine’s implementation language to be registered so that they are available as functions in M’s global environment.

In the global environment (for example, in #shared), these methods will appear indistinguishable from the other methods in that environment. Users can invoke these functions from the M expressions they write without being conscious that they are written in another language. When invoked, these methods will run in-process with the mashup engine (and so avoid the significant performance cost involved with “jumping” to an external language environment—which occurs when Power Query’s Python and R integrations are used).

But why…why might you want to author M-invokable global methods using a language other than M?

For one, the M language does not directly include a mechanism to communicate with things external to the mashup engine. Using only what the language specification defines, you can’t write code that reads from disk or makes network calls because M doesn’t provide any built-in way to perform these activities. (Imagine how limited Power Query’s usefulness would be if it couldn’t call to external sources!) Instead, the ability to communicate with the outside world must come from functionality that is made available to the M world by code written outside of M.

You also might want to use the implementation’s programming language to leverage existing libraries that are available for it. For example, say you want to enable users of your M world to connect to Microsoft SQL Server. It would vastly simplify your work if your SQL data connector could be built on top of the existing .NET Microsoft.Data.SqlClient NuGet package instead of you needing to recreate that library’s functionality in M code.

There’s also performance: The compilers, interpreters and libraries in the ecosystem of a language like C# have years of effort invested in making them fast. By writing performance-critical parts of your core standard library in such a language, you can leverage these optimization advantages without paying the cost of implementing these tunings internally in your mashup engine. (While building them into your engine might be nice, it may be impractical, at least at first, to do so. Imagine how large a mashup engine development team you’d need, and how many years would be required, to reimplement what C# or a similar ecosystem offers you out of the box.)

Now, to the real world: Microsoft has written part of their Power Query standard library in C#, for reasons such as the above. To you and me, looking in from the outside, the C#-powered methods in the global environment are indistinguishable from those written natively in M.

Exactly how Microsoft’s implementation interfaces between M’s world and C# is an internal detail. Microsoft might register a list of C# methods directly with the mashup engine (like we imagined above). Alternatively, they could achieve a similar net effect using different means, such as by adding an internal-only M language feature which allows M code to call out and invoke methods in C# libraries (imagine M code like #invokeexternal("SqlClient.Dll", SqlCommand.Construct, args)). With this, Microsoft could then define all their standard library methods using M, but with some of those M methods simply being lightweight “shims” that use this “invoke external” feature to run functionality contained in external, implementation-language libraries.

While the behind-the-scenes details may be interesting, knowing their specifics doesn’t have much practical value to those of us on the outside, as we aren’t allowed to use C# functions directly from M.

Nevertheless, if you were building your own mashup engine implementation, you’d need to provision some way for M expressions to invoke logic written in another language (either your implementation language or another language that can run in-process with your mashup engine).

Sharing M Methods

Back to your hypothetical mashup engine…back to imagining how your global environment wire-up would work….

While a means to bring methods written in another language into the global environment is necessary and important, you almost certainly will want to define other global functions using M expressions. How might you make this possible?

In short, you’d be looking to define a set of names (i.e. global identifier names) paired with M expressions. Name + expression pairs…hum…that sounds like the essence of what constitutes a section document. Why not use a section document as the mechanism for defining your global M expressions?! (After all, your mashup engine already needs to know how to process section documents, so reusing that concept here, for the global environment, would save you implementation work.)

As you define the M code portion of your standard library, you might want to group related definitions together into separate files for organizational purposes. Instead of having everything in one massively long file, putting text manipulation functions in one file, number-related functions in another, and so forth, would help keep things neat and tidy. This would make it easier to version control code, make it easier for diverse teams to independently work on different portions of the library, etc., etc.

To this end, you decide to allow the M portion of your global environment to be defined using a collection of files, each containing a section document.

The real world: To external audiences, Microsoft provides a single mechanism for adding identifiers directly into the global environment: Power Query modules, which contain M code that extends the global environment. These extensions are defined using section documents.

Simply create a section document containing the logic you want included in your extension, with the section members that should be added to the global environment marked as shared. Place this section document in the correct location (optionally, after packaging it into a special container file). If necessary, configure the host application appropriately. Voilà! You have an extension that is recognized by Power Query!

// NeatStuff.pq
section NeatStuff;

shared NeatStuff.SayHi = (firstName as text) as text => 
  "Hi " & firstName;

Screenshot showing data extension security options

(The specifics of extension packaging and installation options are beyond the scope of this Primer. For a simple way to follow the examples given here using Microsoft Power BI Desktop: Save your extension’s section document in a file whose name ends with .pq [like NeatStuff.pq], then place this file in %userprofile%\Documents\Power BI Desktop\Custom Connectors. Next, in Power BI’s options, under Security > Data Extensions, enable, at least temporarily, the option: “(Not Recommended) Allow any extension to load without validation or warning.”)

With the extension in place, all shared section members defined in it are automatically added to the M global environment (the module you just defined extended the global environment!). You can reference these by name from M code you write in Query Editor—just like you would reference the functions that Microsoft ships with their standard library.

// In a "blank query" in Query Editor
let
  Name = "Joe",
  Result = NeatStuff.SayHi(Name)
in
  Result // returns "Hi Joe"

You may have noticed that the extension you created a moment ago prefixed its method’s name with the section document’s name followed by a period: the section was named NeatStuff; the method was named NeatStuff.SayHi, not simply SayHi.

There is no technical mandate requiring this prefixing, but it is a common extension convention. Judicious adherence to this practice vastly reduces the chances of a name conflict occurring between shared section members from different extensions.

Isolated Evaluation

Back to designing your imaginary mashup engine: You’re using section documents for the M-coded contributions to your global environment. Following Microsoft’s example, only shared section members from extension section documents go into the global environment. This means that non-shared section members in an extension’s section document are kind of like private methods.

Great! Extension authors can use non-shared section members to allow different pieces of an extension to share “internal only” logic without exposing these helper bits to consumer code (i.e. code written by end users in Query Editor).

At least, that sounds like a good theory. True, based on what we’ve designed so far, only shared section members end up in the global environment—but remember that, in M, non-shared section members can still be accessed using #sections and via SectionName!MemberName-style references. So, simply not sharing a section member doesn’t make it truly private.
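As a quick reminder of what those two access routes look like, here is a small sketch written from consumer context. It assumes another query named Query1 exists in Section1; that name is hypothetical, so substitute one of your own.

```powerquery
// In a "blank query" in Query Editor
// Assumes a sibling query named Query1 exists (hypothetical name).
let
  // Route 1: drill into the record of sections, then into the section's members.
  ViaSectionsRecord = #sections[Section1][Query1],
  // Route 2: a section-access expression.
  ViaSectionAccess = Section1!Query1
in
  [ViaSectionsRecord = ViaSectionsRecord, ViaSectionAccess = ViaSectionAccess]
```

Neither route checks whether the member is marked shared, which is why leaving off the shared keyword does not, by itself, hide anything.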

To achieve true privacy, why not have your mashup engine hide the sections that define extensions from consumer M code? The shared section members from extensions should be added to the global environment, but that’s it—the section documents they come from shouldn’t show up in #sections nor should those sections be referenceable by name from consumer M code (so no ExtensionSectionName!MemberName-style references allowed!).

In fact, taking it a step further, why not altogether isolate each extension? After all, extensions shouldn’t have the right to intrude on the private affairs of other extensions (so shouldn’t be able to see other extensions’ sections) and probably shouldn’t be allowed to take dependencies on consumer code (so shouldn’t see the consumer Section1).

Real world: This is exactly how Power Query works.

Shared section members from extensions are added into the consumer global environment, but the section documents defining those extensions are not made visible to that environment.

// NeatStuff.pq
section NeatStuff;

shared NeatStuff.SayHi = (firstName as text) as text => 
  "Hi " & FormatName(firstName);

FormatName = (name as text) as text =>
  Text.Proper(name);

When the above extension is in place, code written in Query Editor can see NeatStuff.SayHi (for example, code end users write can call this method; it also appears in #shared). However, FormatName is invisible to the consumer global environment, as is the section itself (so NeatStuff does not show up in #sections, and NeatStuff!Something-style references won’t work).

Query Editor screenshot showing error raised in response to attempting 'NeatStuff!FormatName("Joe")'
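If you would rather check visibility without triggering an error, a sketch along these lines should do it. It assumes the NeatStuff extension above is installed, and simply asks whether particular names are present in the records that consumer code can see.

```powerquery
// In a "blank query" in Query Editor
let
  // Is the shared member visible as a global identifier?
  SayHiIsGlobal = Record.HasFields(#shared, "NeatStuff.SayHi"),
  // Is the non-shared helper visible as a global identifier?
  FormatNameIsGlobal = Record.HasFields(#shared, "FormatName"),
  // Is the extension's section visible in the consumer's set of sections?
  SectionIsVisible = Record.HasFields(#sections, "NeatStuff")
in
  [
    SayHiIsGlobal = SayHiIsGlobal,
    FormatNameIsGlobal = FormatNameIsGlobal,
    SectionIsVisible = SectionIsVisible
  ]
```

If things work as described here, only the first of these three should come back true.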

From inside the extension, the extension’s own section document is visible (e.g. if code inside the extension evaluates #sections, it will see its own section listed), as well as any special sections the mashup engine decided to make visible to extensions (like maybe a special Extensibility section that isn’t available to consumer code). However, the extension cannot see the consumer code Section1 or the sections of other extensions.

Does this “hide the sections” rule deviate from the M language specification? Not at all. The specification allows the mashup environment to directly inject identifiers into the global environment. How the mashup engine comes up with those identifiers is its business. To do so, if it evaluates some section documents somewhere in isolation, that’s the mashup engine’s internal business. These separate extension section documents are not a part of the section document set the end user is asking the mashup engine to evaluate, so there is no need for the extension sections to be exposed to the consumer environment. Where the global environment’s contents come from is irrelevant to the consumer.

Extension Only Global Methods

Continuing your imaginary design: In it, you are now at the point where you have the ability to define extensions (i.e. modules) written in M using section documents, which are evaluated in isolation, with only their shared members, well, #shared to the consumer M global environment.

What you have could be considered a complete solution…but it might be useful to give extensions access to extra functionality beyond what the regular consumer standard library provides.

For example, an extension defining a custom data connector might need a way to fetch the raw password that the user configured for it. You wouldn’t want normal consumer M code to be able to read out raw passwords (imagine the security risk!), so there’s no need for consumer M code to even know that a method for doing this exists. However, having an Extension.CurrentCredential() function available inside extensions could be quite handy.

How could you pull this off? Why not have the mashup engine add a set of extension-specific functions (and, possibly, other identifiers) to the global environment that extensions see—but that aren’t present in the global environment for consumer M code?
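The idea of two views onto the global environment can be modeled, very loosely, in ordinary M by using Expression.Evaluate with two different environment records. This is only a toy illustration, not a claim about how any real mashup engine does it, and Demo.ExtensionOnly is a made-up identifier.

```powerquery
// In a "blank query" in Query Editor (a toy model only)
let
  // What consumer code sees.
  ConsumerView = #shared,
  // What an extension sees: everything above, plus one extra (hypothetical) identifier.
  ExtensionView = ConsumerView & [Demo.ExtensionOnly = () => "for extensions only"],
  Code = "Demo.ExtensionOnly()",
  // Same code, evaluated against each view.
  ExtensionResult = Expression.Evaluate(Code, ExtensionView),
  ConsumerAttempt = try Expression.Evaluate(Code, ConsumerView)
in
  [
    ExtensionResult = ExtensionResult,
    ConsumerFailed = ConsumerAttempt[HasError]
  ]
```

The same expression resolves in one environment and fails in the other, simply because the name is absent from the record it was evaluated against.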

Real world: You’ve probably already guessed that Microsoft does what was just described.

You can see the extra identifiers (mostly methods) that Microsoft exposes to extensions by defining an extension with a method that returns the #shared it sees from its perspective:

// ExtensionEnvironmentViewer.pq
section ExtensionEnvironmentViewer;

shared ExtensionEnvironmentViewer.Shared = () => 
  #shared;

Next, from consumer Power Query (i.e. from Query Editor), run #shared in that context, then compare its output to the #shared that the extension sees.

// In a "blank query" in Query Editor
let
  PublicShared = #shared,
  ExtensionShared = ExtensionEnvironmentViewer.Shared(),
  IdentifiersJustInExtensionShared = List.RemoveItems(
      Record.FieldNames(ExtensionShared), Record.FieldNames(PublicShared)
    ),
  Sorted = List.Sort(IdentifiersJustInExtensionShared)
in
  Sorted

First part of extension-context-only identifiers list

The output may vary based on the version of Power Query in play, but will consist of the list of global identifiers that are only available for use from extensions.

Thanks to this extension-specific “perspective” on the global environment, Microsoft can give extensions access to functionality that it wouldn’t make sense for consumer code to touch.

Multi-Persona

Back to the drawing board for your hypothetical mashup engine: The design you have come up with is almost a complete solution—but there is an additional nuance to consider factoring in….

You might want code in an extension to behave differently depending on the fine details of its context.

Imagine that you frequently find yourself calling a certain web API. This involves building an appropriate query string, making the API call, then converting the response from JSON into a table. You’ve built a helper function to handle these steps.

// In a "blank query" in Query Editor
(entityName as text) as table =>
let
  BaseUrl = "https://api.example/v1/",
  Response = Web.Contents(BaseUrl, [Query = [entity = entityName]]),
  AsJson = Json.Document(Response)
in
  Table.FromRecords(AsJson)

While you can copy-and-paste this function into the various Power BI projects where it is needed, this duplicates code. To avoid this repetition, you decide instead to package your GetData function into an extension. This way, your helper function’s definition lives in one place, where it can be referenced from your various Power Query projects.

// MyApi.pq
section MyApi;

shared MyApi.GetData = (entityName as text) as table =>
  let
    BaseUrl = "https://api.example/v1/",
    Response = Web.Contents(BaseUrl, [Query = [entity = entityName]]),
    AsJson = Json.Document(Response)
  in
    Table.FromRecords(AsJson);

While this “get data” function lives in, shall we call it, extension context (literally, in the context of an extension), it behaves exactly the same as it would if you instead had defined it directly in the consumer context of Query Editor.

“Behaves exactly the same” holds true not just for this example function, but (to my knowledge) in general for all code that is simply put into an extension. True, each extension is evaluated in isolation and the standard library visible in its context may not contain exactly the same identifiers as the consumer context (remember those extension-only global methods?). However, to my knowledge, the functions in the standard library that are visible in both consumer context and in plain extension context behave the same, regardless of which context they are used from.

But is this “works exactly the same” behavior always what you want? Perhaps some code in extensions should run differently or have special privileges. Maybe you don’t want MyApi.GetData to behave the same as it did when it lived in consumer context. Instead, maybe you want it to take on special powers!

Take the workhorse Web.Contents function: In real-life Power Query, code running in consumer context cannot override Web.Contents’ default behavior of raising an error when an HTTP 401 (Unauthorized) or 403 (Forbidden) response is encountered. In contrast, when invoked by a data connector function in an extension, Web.Contents’ behavior changes to allow this error raising to be suppressed by setting its ManualStatusHandling option.

In your hypothetical design, to mimic this sort of behavior, you’d need code in extension context to work normally by default—but you’d also need a way to signal that certain extension code should run with elevated privileges. You might even have different sets of special privileges available for different situations.

To be clear, simply putting code in an extension shouldn’t cause its behaviors to change. In the case of MyApi.GetData, if your motivation for placing it in an extension is simply distribution convenience, then you (almost certainly) want it to continue to behave exactly as it did when its definition lived in the consumer context of Query Editor.

On the other hand, if you want it to take on the “super powers” of a data connector, how would you signal the mashup engine to elevate the method into that enhanced context, and how would other methods figure out when “super powers” apply so they can adapt their behaviors accordingly?

Let’s see if we can glean insight into this by examining what Microsoft does….

Real World: To those of us on the outside, Microsoft exposes a single entry point to the world of enhanced extension contexts: Methods inside an extension may be registered as being part of a “data connector kind”. This grants them the associated special privileges of being data source methods.

To so register an extension method, tag it with the literal attribute DataSource.Kind. Set this attribute’s value to the name of a section member containing a record that describes the data source kind.

With the changes below (the DataSource.Kind attribute and the MyApi record), MyApi.GetData operates with the super powers of a data connector method!

// MyApi.pq
section MyApi;

[DataSource.Kind="MyApi"]
shared MyApi.GetData = (entityName as text) as table =>
  let
    BaseUrl = "https://api.example/v1/",
    Response = Web.Contents(BaseUrl, [Query = [entity = entityName]]),
    AsJson = Json.Document(Response)
  in
    Table.FromRecords(AsJson);
	
MyApi = [Authentication = [Implicit = []]];

(Above, the data source kind is defined simply as using anonymous authentication. As you might guess, more data source definition options are available. However, exploring these is beyond the scope of this Primer. For more on data source kinds and custom connectors in general, see Microsoft’s custom connector documentation.)
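To put one of those super powers to use, a data source method could ask Web.Contents to hand back 401 and 403 responses instead of raising an error, then decide for itself what to do. The sketch below assumes the API and data source kind from above; the method name MyApi.GetDataChecked and the error reason MyApi.AccessDenied are invented for illustration.

```powerquery
// MyApi.pq (sketch)
section MyApi;

[DataSource.Kind="MyApi"]
shared MyApi.GetDataChecked = (entityName as text) as table =>
  let
    BaseUrl = "https://api.example/v1/",
    Response = Web.Contents(BaseUrl, [
      Query = [entity = entityName],
      // Ask Web.Contents not to raise errors for these statuses.
      ManualStatusHandling = {401, 403}
    ]),
    // The HTTP status code is exposed as metadata on the response.
    Status = Value.Metadata(Response)[Response.Status]
  in
    if Status = 401 or Status = 403
    then error Error.Record("MyApi.AccessDenied", "The API refused the request.", [Status = Status])
    else Table.FromRecords(Json.Document(Response));

MyApi = [Authentication = [Implicit = []]];
```

Here the custom error simply replaces the default one, but the same pattern would let a connector retry, refresh a token, or return an empty table instead.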

How does Microsoft implement these “super powers” behind the scenes? I don’t know. To my knowledge, they haven’t publicly shared these details. I do have a guess based on some hints gleaned from how extensions work; how far or close this is to the truth remains to be seen.

I imagine that perhaps the mashup engine internally holds a list of currently applicable permissions. When evaluation starts for a “should have special permissions” method (like the data source method MyApi.GetData), the special permissions that are relevant to it are added to this internal list and remain there until the method returns.

As MyApi.GetData’s evaluation continues, if and when code that should vary its behavior based on effective permissions is run, that code checks whether or not the permission of interest is currently applied and then adapts its behavior accordingly.

If this hypothesis is true, then when evaluation of MyApi.GetData begins, the “data connector” permission is added to the currently in effect permissions list. When the method invokes Web.Contents, internally Web.Contents checks (if relevant) the current permissions to determine whether or not manual handling of HTTP statuses 401 and 403 should be allowed—something like:

if HasPermission(Permissions.DataConnector)
then /* allow manual status handling of 401 & 403 */ 
else /* do not allow 401 & 403 to be manually status handled */

I also suspect that multiple permissions can be simultaneously applicable. For example, a special data connector function might be given data connector as well as native query privileges or maybe type action privileges.

Again, the above is just a slightly educated guess. The details of how super powers are implemented are internal to Microsoft.

To emphasize the fact that special privileges stay in effect until the privilege-attributed method returns, and so apply not just to that specific method, but also to the methods that it invokes, consider this variation of the above extension:

// MyApi.pq
section MyApi;

[DataSource.Kind="MyApi"]
shared MyApi.GetData = (entityName as text) as table =>
  MyApi.NotADataSourceMethod(entityName);
	
shared MyApi.NotADataSourceMethod = (entityName as text) as table =>
  let
    BaseUrl = "https://api.example/v1/",
    Response = Web.Contents(BaseUrl, [Query = [entity = entityName]]),
    AsJson = Json.Document(Response)
  in
    Table.FromRecords(AsJson);
	
MyApi = [Authentication = [Implicit = []]];

MyApi.NotADataSourceMethod is not a data connector function, so if consumer code invokes it directly, its Web.Contents call will not be run with data connector powers.

However, if instead consumer code invokes MyApi.GetData, data connector privileges will be applied. When that method invokes MyApi.NotADataSourceMethod, those special privileges will still be in effect, and so will apply to that method’s invocation of Web.Contents.
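The same flow can be mimicked in a toy model written in plain consumer M, continuing the guess from earlier. Here the "permissions in effect" are passed along explicitly as a list, which a real engine presumably would track internally; every name below is a stand-in, and nothing here touches the real Web.Contents.

```powerquery
// In a "blank query" in Query Editor (a toy model, not real engine behavior)
let
  // Stand-in for a library function that adapts to the permissions in effect.
  FakeWebContents = (permissions as list) as text =>
    if List.Contains(permissions, "DataConnector")
    then "401/403 may be handled manually"
    else "401/403 always raise an error",

  // Plain helper: passes along whatever permissions are already in effect.
  NotADataSourceMethod = (permissions as list) as text =>
    FakeWebContents(permissions),

  // Models the attributed method: adds a permission for the duration of its call.
  GetData = (permissions as list) as text =>
    NotADataSourceMethod(permissions & {"DataConnector"}),

  NoPermissions = {}
in
  [
    DirectCall = NotADataSourceMethod(NoPermissions),
    ViaDataSourceMethod = GetData(NoPermissions)
  ]
```

DirectCall comes back as "401/403 always raise an error" while ViaDataSourceMethod comes back as "401/403 may be handled manually", even though both paths end in the very same helper.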

To my knowledge, there aren’t very many methods that change behavior based on current permissions. Web.Contents is the main example from the standard library. Also, a number of the special “only available in extension context” methods (like Extension.CurrentCredential()) adapt their behavior based on permissions: typically, these methods will raise an error if invoked from plain extension context, only producing useful output if an appropriate permissions-enhanced context is in play.

Conclusion

We’ve encountered extensions/modules, played with permissions and gained a bit of knowledge about how global identifiers end up in the consumer global environment.

Much of what we’ve explored is undocumented. As such, this Primer part’s contents were primarily derived from research, so should be taken with the disclaimer that there could be things I missed or misinterpreted. Also, I can’t say whether the terminology I used aligns with how Microsoft internally refers to the equivalent concepts (for example, officially is it “enhanced extension context,” “augmented extension context,” or something else?). If someone from Microsoft would like to offer insight or clarifications in any of these areas, I’d be most grateful.

At any rate, hopefully you now have a much better understanding of how the global environment is put together, and have learned a bit about extensions, as well.

Next time, it is probably time to wrap this series up. Maybe we should do an in-depth walk through of what happens when you ask Power Query to evaluate an expression? We’ll see. Until then, happy M coding!
