“Handling joins doesn’t look too bad,” you think as you examine the handler’s signature. You’re building out a Table.View
(perhaps as part of a Power Query custom data connector), expanding its query folding capabilities by implementing additional handlers. OnJoin is up next.
OnJoin = (joinSide, leftTable, rightTable, joinKeys, joinKind) => ...
Its arguments look straightforward, mostly a mirror of what’s passed to Table.Join
. Only one parameter, joinSide, is unfamiliar, but is easy to figure out.
All is well, that is, until you start writing code. You’re stopped by a brick wall.
Imagine two tables from your data source being joined together. To fold the join, Power Query invokes OnJoin on one of the tables’ views (say, it chooses the view for table A). The arguments passed to handler OnJoin give it access to the other table (in this case, a reference to table B), but what’s needed is something more: access to the internal state of the other table’s view (the state stored inside table B’s view).
Why?
When GetRows is invoked on the view that handles the OnJoin, you want a single query sent to the external source telling it to join the two tables together. In our example, this query will come from view A, since it is handling the join. B’s view won’t be used to fetch B’s data, as view A has assumed this responsibility.
As part of fulfilling this responsibility, the native query generated by A needs to appropriately represent any folding that’s been configured on B. Assuming that view B stores its folding details in an internal state record (a typical pattern), this means that view A needs to factor in B’s internal state when generating the native query.
A code example might help.
let
Source = MySource.Connect(),
TableA = Source{[Name="A"]}[Data],
TableB = Source{[Name="B"]}[Data],
TopB = Table.FirstN(TableB, 100),
Result = Table.Join(TableA, "A.ID", TopB, "B.ID")
in
Result
Prior to the join, the internal state for B’s view (i.e. TopB) might look something like: [Take = 100]
This internal state needs to be factored into the native query generated by view A. However, when Power Query calls A’s OnJoin handler, it passes it a reference to table B, not the internal state record from B’s view. How does view A get access to view B’s state so it can factor it into the query it generates?
View Functions to the Rescue
Remember view functions? Using Table.ViewFunction
, a function can be registered as a view function, making it eligible for query folding by table views. When Power Query encounters a view function being applied to a table and deems the situation appropriate for folding, it invokes the associated view’s OnInvoke handler. In turn, this handler can take appropriate action, such as returning a new view or data of some kind.
View functions can be used to provide custom foldable methods that you share with others (like MyData.AsOf
from the linked article’s example). However, view functions are not limited to shared functions: non-shared functions can also be registered.
What if:
- We create a function which views A and B can see but which consumer code can’t see (no reason for others to see it, as it is intended for internal use only)…
- …and we register it as view function…
- …then build table B’s view so that its OnInvoke returns B’s internal state record when called to handle our view function…
- …then have view A’s OnJoin call this function on the reference it receives to table B?
With this function, view A can use the table B reference its OnJoin handler receives to fetch view B’s internal state. At the same time, access to this state isn’t exposed to normal consumers, preserving the state’s privacy.
A view function is being used to privately tunnel state between views. Yay!
Example
Below shows an internal view function, its use by view A and its handling by B’s OnInvoke.
To illustrate the inter-view interactions more clearly, the two views are created by separate functions instead of using a single generalized function (as would be preferable in real life).
section MySource;
MySource = ...;
MySource.Publish = ...;
[DataSource.Kind="MySource", Publish="MySource.Publish"]
shared MySource.Connect = () =>
let
BaseNavTable =
#table(
{"Name", "Data", "ItemKind", "IsLeaf" },
{
{ "A", GetViewA([]), "Table", true },
{ "B", GetViewB([]), "Table", true }
}
)
in
Table.ToNavigationTable(BaseNavTable, {"Name"}, "Name", "Data", "ItemKind", "ItemKind", "IsLeaf");
GetViewA = (state as record) =>
Table.View(
null,
[
GetType = () => Value.Type(GetRows()),
GetRows = () =>
if state[Join]? <> null
then // this view has handled a join, so return simulated joined data
#table(
{"A.ID", "A.Value", "B.ID", "B.Value"},
{{1, "Joe", 1, "hello"}, {2, "State Details", 2, Helpers[ValueToText](state)}}
)
else // this view has *not* handled a join, so just return table A data
#table(
{"A.ID", "A.Value"},
{{1, "Joe"}, {2, "Paul"}}
),
OnJoin = (joinSide, leftTable, rightTable, joinKeys, joinKind) =>
let
OtherTable =
if joinSide = JoinSide.Left then rightTable
else if joinSide = JoinSide.Right then leftTable
else error "Unexpected joinSide value",
OtherTableState = try InternalState(OtherTable)
in
if OtherTableState[HasError]
then error OtherTableState[Error]
else @GetViewA(state & [Join = [OtherVieweState = OtherTableState[Value]]])
]
);
GetViewB = (state as record) =>
Table.View(
null,
[
GetType = () => Value.Type(GetRows()),
GetRows = () =>
#table(
{"B.ID", "B.Value"},
{{1, "hello"}, {2, "world"}}
),
OnTake = (count as number) => @GetViewB(state & [Take = count]),
OnInvoke = (func, args, index) =>
if func = InternalState
then state
else ...
]
);
InternalState =
Table.ViewFunction((input as table) => error "Called when not expected.");
/*
Assumes that:
* https://raw.githubusercontent.com/Microsoft/DataConnectors/master/samples/TripPin/8-Diagnostics/Diagnostics.pqm is included as section member "Helpers"
* Table.ToNavigationTable is present (from https://docs.microsoft.com/en-us/power-query/helperfunctions#tabletonavigationtable)
*/
When used with our introductory example, the following is output:
Observations
- B’s GetRows is ignored when A handles the join between A and B. Notice that the two views are hard-coded to return different values for B’s data. Since A accepted the OnJoin folding request, Power Query relies on A’s view to return the appropriate data for both tables A and B when performing the join.
- InternalState raises an error if its actual method body is run. This will occur if it is called on a table other than one directly powered by B’s view, as only that view’s OnInvoke handles this method (and so bypasses the method’s normal behavior). Also, if A’s OnJoin encounters an error when it makes its call to InternalState, it will re-raise it—which signals Power Query that A declines the folding request, which will lead to Power Query internally performing the join. The net effect is that A will only take responsibility for a join if it is to table B.