Tag Archives: Microsoft Excel

Power Query PostgreSQL Data Connector Feature Request: Support “Complex” Columns


A challenge with the current Power Query PostgreSQL data connector is that it does not understand how to work with PostgreSQL’s complex typed columns.

Background

Some examples of PostgreSQL’s complex typed columns (a hypothetical M sketch of each follows the list):

Scenario: PostgreSQL allows a single column to be defined as containing a specific complex type.
Example: Column phone_number could be defined as being of type telephone_number (which has fields “areacode”, “number” and “extension”).
M Equivalent: Column contains a record.

Scenario: A single column can also be defined to contain an array of values of a specific scalar type.
Example: Column visit_dates could be defined as an array of date values.
M Equivalent: Column contains a list.

Scenario: A column can be configured to contain an array of a given complex type.
Example: Column order_lines could be defined as an array of order_line values.
M Equivalent: Column contains a nested table.
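
To make those M equivalents concrete, here is a hypothetical sketch of how a single row containing all three kinds of columns might surface in M if the connector understood them (field values and shapes are invented for illustration):

// hypothetical illustration only: one row's complex columns as M values
[
    phone_number = [areacode = "555", number = "867-5309", extension = null], // record
    visit_dates = {#date(2023, 1, 15), #date(2023, 3, 2)}, // list
    order_lines = #table({"sku", "qty"}, {{"A-100", 2}, {"B-200", 1}}) // nested table
]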

Challenge

Power Query does not understand any of these more advanced column setups. Depending on the exact scenario, when PQ encounters one of the above, it either renders out the raw textual equivalent of the entire column (like “{}” in column project_contingency_items, below) or an error (such as the below complaint that “We don’t support CLR type ‘System.Dynamic.ExpandoObject’.”).

Query Editor output showcasing a lack of support for PostgreSQL's complex data types

The net effect is that, for a table/view where these more complex structures are used, hand-written SQL (e.g. Value.NativeQuery) may be necessary as Power Query may be unable to make “sense” out of the relevant columns on its own. This makes the level of effort involved with ingesting this data into PQ much higher.
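
For example, a query along these lines (a sketch; the server, database, table and column names are hypothetical) might flatten a complex column server-side before Power Query sees it:

let
    Source = PostgreSQL.Database("some-server", "some-database"),
    // hand-written SQL does the flattening that Power Query can't work out on its own
    Flattened = Value.NativeQuery(
        Source,
        "SELECT id, (phone_number).areacode, (phone_number).number FROM customers"
    )
in
    Flattened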

Continue reading

MySQL’s Invalid Dates + Power Query = No Effect Query Folding?!

Results display showing 0001-01-01 value

You’re authoring in Power Query. You decide that rows with 0001-01-01 in a certain column should be removed, so you filter on the column, excluding 0001-01-01 values. After you apply your filter, nothing changes: the 0001-01-01 rows are still present. What is going on?
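
The no-effect filter step might resemble this minimal sketch (table and column names are hypothetical):

// keep only rows whose date isn't 0001-01-01
FilteredRows = Table.SelectRows(Source, each [visit_date] <> #date(1, 1, 1))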

This issue bit me recently. Turns out, it’s due to a translation loss when bridging between the worlds of MySQL and Power Query.

Continue reading

No Built-In Power Query Connector: Am I Stuck?


Uh oh! You’re using Microsoft Power BI or Excel and you’ve discovered that Power Query does not have a built-in connector for the data source you’re interested in consuming.

Question: What does this mean for you?

Answer:
(pick which of the below you think is most likely correct)

  • No go. You cannot access this data source from Power Query.
  • Custom connector. “Built-in connector” and “custom connector” sound like opposites. Either fork out a bit of $$ to have a custom connector created, or give up.
  • Something else. Hmm…what are the other options?

I’d suggest starting with “something else.” When Power Query doesn’t ship with a connector for your source of interest, there are a number of other options to consider.

Let’s take a journey through the main other options for directly connecting to a source from Power Query. We’ll order these into “stages” based on an approximation of the level of effort involved. Then, we’ll expand our scope and touch on several possibilities that involve using external tools or languages to provide the needed data to Power Query.

But first, the problem scenario….

Continue reading

An Error’s Expression Stack: A “Journal” of the Locations It Propagates Through


A Power Query mashup expression dies with an error. As the error propagates through your code, did you know that it sometimes collects a “journal” of the locations (e.g. line numbers) it passes through? This “expression stack,” as it’s called, can be used to help identify the troublemaking line of code.

An error’s expression stack is not automatically exposed in any user interface. Its hidden presence suggests that it may be a component supporting some past, present or planned UI functionality (perhaps it’s part of powering the “Go to Error” button?). Even though it is hidden, there may be cases where you find the location details it contains useful when debugging. Even if not, knowing about it is interesting Power Query trivia. 🙂

Continue reading

Zero Rows Can Bite (part 2): The Mysterious All-Null Row


The table you fetch from a web API mysteriously contains an unexpected row with a null value in each column. You manually try the API using a tool like Postman or Insomnia and don’t find any all-null objects in the raw response. Where is this null table row coming from?

Zero rows (again!).

Previously, we dug into refreshes mysteriously dying with the complaint that “column ‘Column1’ of the table wasn’t found” even though no M code or data source schema changes had occurred. Upon investigation, we learned that insufficiently robust fetch-data M code outputs a zero-column table when the web API returns no rows, which breaks later code that expects the presence of specific columns.

This time, zero rows is again the trigger condition, though it’s not zero rows altogether. Instead, it’s when a web API that returns paged responses returns a page containing no rows. Receiving back an empty page is a real-world possibility. For example, the last page of a response might contain zero rows because the rows that were to have been in it were deleted just moments ago, after the preceding page was fetched.

A common pattern for processing paged responses is to read the various pages into a list, then turn that list into a table, which is then expanded out into the appropriate rows and columns. However, implementations of this flow sometimes leave a corner case unaccounted for, which leads to the all-null row being present. Unfortunately, such an oversight is present in Table.GenerateByPage (a function commonly used by custom connectors).
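
As a rough illustration of how an empty page becomes an all-null row (a simplified sketch, not Table.GenerateByPage’s actual code): when a table column is expanded and one of the nested tables contains zero rows, the outer row can survive with null in every expanded column.

let
    Pages = {
        #table({"id", "name"}, {{1, "A"}, {2, "B"}}),
        #table({"id", "name"}, {}) // an empty page
    },
    PagesAsTable = Table.FromList(Pages, Splitter.SplitByNothing(), {"Column1"}),
    Expanded = Table.ExpandTableColumn(PagesAsTable, "Column1", {"id", "name"})
in
    Expanded // two data rows plus one all-null row from the empty page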

Continue reading

New M Feature: Structured Error Messages


Why Structured Error Messages?

In the real world, errors are a part of life. If you access and read data from real, in-production systems, sooner or later you will almost certainly encounter errors. While you may be unable to escape their unfortunate reality, at least in the Power Query world, they’re rendered out in an easy-to-read format:

Expression.Error: Bad code 'ABC', problem 'too short'

Easy to read, that is, if you are a human, reading just one error all by itself.

But what if you’re trying to analyze a collection of error messages? Imagine a set of errors like the above, but which are for a variety of different codes and problems (e.g. bad code ‘A235’, problem ‘must contain at least 2 letters’, bad code ’15WA’, problem ‘cannot start with a number’, etc.).

Let’s say you want to summarize these errors, reporting out the count of errors per problem, per bad code. Manually reading errors one at a time no longer cuts it. Instead, you could write code that parses each error message, extracting the text between the phrase Bad code ‘ and the following quote character, and between problem ‘ and the following quote character. With the code and problem now separately captured, you can use their values to group by or otherwise compute the desired summaries.
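
For instance, a naive parser along these lines (a sketch built with Text.BetweenDelimiters, assuming messages shaped exactly like the example above) could pull out the two values:

ParseError = (message as text) as record =>
    [
        Code = Text.BetweenDelimiters(message, "Bad code '", "'"),
        Problem = Text.BetweenDelimiters(message, "problem '", "'")
    ]
// ParseError("Bad code 'ABC', problem 'too short'")
//   outputs [Code = "ABC", Problem = "too short"]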

Parsing log messages like this involves coding work. Not only does it take effort on your part, but it is also tricky to get right. For example, the logic described above finds the end of each string it matches by looking for the next quote character. What if a bad code or problem description includes a quote character? Say the message starts with Bad code ‘ABC’DEF’. The above logic will miss the second portion of the code (capturing only ABC, not the full ABC’DEF) because it incorrectly assumes that a bad code will never contain a quote. You could address this by writing more robust parsing code, but that’s more work, and this is only one example of the corner cases you may need to handle to accurately parse a family of log messages.

On the other hand, maybe your interest is not analyzing log message parameters, but rather removing them altogether. For data privacy or security reasons, you want sanitized log messages, where parameter values have been stripped out and replaced with generic placeholders. This way, “clean” log messages can be aggregated or retained long-term without the complications that accompany storing PII or other confidential information that may have found its way into error message parameters. While this may be the opposite of our first scenario (extracting message parameters for analysis purposes), implementing it still requires a technical means to differentiate between the base log message pattern (or template) and the parameters that have been filled into it. If you’re implementing this yourself, you’re looking at some form of log message parsing.

In either case, if only there was a way to avoid the effort and complexity associated with writing log message parsing code….

Introducing M’s Structured Error Messages

Meet M’s new structured error message capabilities!

M’s error functionality has recently been expanded to offer a new way of defining error messages, splitting message definition between a template and a list of parameter values. These components are preserved with first class representation in the error after it is raised, enabling error handling code (and, potentially by extension, external logging mechanisms and log analytics tools) to separately work with these components without the need for custom text parsing. This style of error message is known as a structured error message and is key to making structured logging possible.
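
For a taste of the idea, here’s a minimal sketch reusing the example message from earlier: an error is raised from a record that keeps the template and its parameters separate, then caught and inspected component by component.

let
    Result = try error [
        Reason = "Expression.Error",
        Message.Format = "Bad code '#{0}', problem '#{1}'",
        Message.Parameters = {"ABC", "too short"}
    ]
in
    [
        Rendered = Result[Error][Message], // Bad code 'ABC', problem 'too short'
        Template = Result[Error][Message.Format], // Bad code '#{0}', problem '#{1}'
        Parameters = Result[Error][Message.Parameters] // {"ABC", "too short"}
    ]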

Continue reading

The Elusive, Uncatchable Error?


The error seems to escape catching. When the expression is evaluated, Query Editor displays the error. Try loading the query’s output into Microsoft Power BI or Excel and the operation dies with an error. Clearly, there’s an error—but if you wrap the expression with a try, the error isn’t caught! To the contrary, the record output by try reports HasError = false, even though if you access that record’s Value field, Query Editor again shows the error.
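
Here’s a contrived sketch that reproduces the symptom:

let
    Result = try {error "help!"}, // Result[HasError] = false: the list itself builds fine
    Problem = Result[Value]{0} // evaluating the deferred item is what raises the error
in
    Problem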

What’s going on?! Have you discovered an uncatchable error? Is this a Power Query bug?

Continue reading

Zero Rows Can Bite (part 1): The Mysterious Missing Column


Your Power Query expression is happily skipping along, fetching data from a web API. Happily, that is, until one day its refreshes mysteriously die with the complaint that “column 'Column1' of the table wasn't found”. You haven’t changed any M code. You’ve verified that no schema changes occurred on the external web service. How, then, could a column come up missing?

Power Query error message - Expression.Error: The column 'Column1' of the table wasn't found. Details: Column1

Or, maybe it’s not a missing column, but rather your fetch data code starts outputting an unexpected table row with a null value in each column. You manually try the web API using a tool like Postman or Insomnia and don’t find any all-null objects in the API’s response. Where is this all-null table row coming from?

Both of these unexpected occurrences potentially stem from the same underlying cause. As common M code patterns tend not to handle this situation properly, it is possible (even probable!) that M code you use may leave you susceptible to being bitten by one or the other of these bugs.

Continue reading

Combining Query String Parameters (a.k.a. Inclusive Record Field Combining)


Your expression is building a query string for use in a Web.Contents call. Different parts of your code each separately compute parameters that should be included in the final query string. These are provided as “query fragment” records, which all need to be combined into a single, consolidated record that’s then passed to Web.Contents:

ProductCodeFragment = GetProductCodeQueryFragment(), // might return [productId = 123]
LimitFragment = GetLimitQueryFragment(), // might return [limit = 100]
FinalQueryParams = ..., // single, consolidated record containing query parameters from all fragment records, such as ProductCodeFragment and LimitFragment
Result = Web.Contents("some-url", [Query = FinalQueryParams])

Using Power Query’s built-in functionality, combining the fragment records into a consolidated record is easy, so long as the fragment records each have different fields. If they do, a record merge can be performed using the combination operator (&) or the records can be combined using Record.Combine, both of which produce the same output.

[productId = 123] & [limit = 100] // outputs [productId = 123, limit = 100]
Record.Combine({[productId = 123], [limit = 100]}) // outputs [productId = 123, limit = 100]

The challenge comes if multiple records contain the same field: say, one fragment record contains [productId = 123] while another contains [productId = 456]. Record.Combine and & are exclusive in how they compute the field values they output. When the same field name is present in multiple input records (e.g. both input records contain field productId), the value output for that field will be the value from the last/right-most input record (in this case, productId = 456). The other input record value(s) for that field will be ignored (so in this case, productId = 123 is ignored).

// notice that productId 123 is *not* included in the outputs
[productId = 123] & [productId = 456] // outputs [productId = 456]
Record.Combine({[productId = 123], [productId = 456]}) // outputs [productId = 456]
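
What would inclusive combining look like? As one possible sketch (invented for illustration; not necessarily the approach explored in the full post), values for a duplicated field could be gathered into a list instead of discarded:

CombineInclusive = (fragments as list) as record =>
    let
        names = List.Distinct(List.Combine(List.Transform(fragments, Record.FieldNames))),
        valuesFor = (name as text) => List.Transform(
            List.Select(fragments, each Record.HasFields(_, name)),
            each Record.Field(_, name)
        ),
        values = List.Transform(
            names,
            (name) => let v = valuesFor(name) in if List.Count(v) = 1 then v{0} else v
        )
    in
        Record.FromList(values, names)
// CombineInclusive({[productId = 123], [productId = 456], [limit = 100]})
//   outputs [productId = {123, 456}, limit = 100]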
Continue reading