Category Archives: Programming

Lazy, Streamed, Immutable: Try Building a Table

, , , , ,

Lazy evaluation, streaming and immutability are key Power Query concepts which must be understood to truly grasp how Power Query “thinks.” Want to test your understanding in these foundational areas—and try to grow it—by tackling an assignment?!

Your Task

Code up a single row table containing the following columns:

  • A column of hard-coded data.
  • A couple columns whose data is fetched from an API.
  • A couple columns whose data is fetched from another API.
Example of expected table

Where:

  • Neither API is called if the table’s rows are simply counted.
  • Only the first API is called if the columns containing data from the second API are removed from the table, and vice versa.
  • Each API is called at most once, even when multiple columns that contain data from that API are output.

Without:

  • Using any standard library table functions to build the table (i.e. no “Table.*” functions may be used).
Continue reading

Recursive Polymorphic Deserialization with System.Text.Json

,

Goal: Deserialize a nested JSON structure, where the objects instantiated are not instances of the given abstract base type but rather are of the appropriate derived types.

In other words, if the specified destination type is AbstractBaseTpe, which should be returned is not an instance of that class but rather an instance of ChildTypeA or ChildTypeB or GrandchildType1, etc., as appropriate. This polymorphic deserialization behavior should hold true at the root level (that is, for the type directly returned by JsonSerializer.Deserialize<AbstractBaseType>(…)), as well as recursively for nested properties (that is, anytime the destination property is of type AbstractBaseType, the object instantiated for the property should be of the appropriate derived type, like ChildTypeA, etc.).

Hopefully, one day support for something along these lines will be built into System.Text.Json. However, for now, when we need it, we have craft our own solution (or borrow one from someone else).

In my case, the other day I needed a simple solution for polymorphic deserialization. Performance wasn’t critical. Just something simple that worked.

Continue reading

Always Identical Outputs? Switch Statement vs. Switch Expression

, ,

Are these two statements identical in what they return?

public object Switcher(bool flag) 
{
  switch (flag) {
    case true:
      return 1;
    case false:
     return 10.5;
  }
}
public object Switcher(bool flag) 
{
  return flag switch {
    true => 1,
    false => 10.5
  };
}

No.

TestResult –
Switch Statement
Result –
Switch Expression
Assert.Equal(1, x.Switcher(true));PassFail – Error Message:
Assert.Equal() Failure
Expected: 1 (System.Int32)
Actual: 1 (System.Double)
Assert.Equal(10.5, x.Switcher(false));PassPass

Scratching your head trying to find the difference? The logic in both looks identical. Arguably, it is.

Continue reading

Power Query M Primer (Part 21): Identifier Scope & Sections

, , , ,

The same identifier name (think: variable name, field name, etc.) can be defined more than once in the same set of Power Query expressions. If you reference an identifier name that’s been defined in multiple places, which of those definitions will your reference point to?

In this post, let’s learn how M sorts this out. We’ll also explore sections—the usually hidden “frame” at the core of organizing the different expressions that make up a Power Query program.

Let’s get to it…and have fun while we’re at it!

Continue reading

Migration Thoughts: Raw Data vs. Real Data

, ,

A challenge when migrating data to a new system is that the legacy database may only contain a representation of the old system’s data. Arguably, the actual to-be-migrated data is what a user of the old system sees when using its user interface, reports or APIs. This data isn’t always exactly what is present in the database. Whether purposefully or inadvertently (think: bugs), the legacy system may apply transforms, filtering and such to the database’s raw data before displaying or outputting it. In order to create an accurate export for migration, these data manipulations need to be factored in.

From one migration, I remember an email address table along the lines of the following. This table’s design allows a customer to have multiple email addresses, including multiple primary email addresses (as the example below shows).

EmailId  CustomerId  EmailAddress       IsPrimary
-------  ----------  -----------------  ---------
1        15835       joe@example.com    1
2        15835       jsmith@work.com    1
3        15835       jsmith@home.com    0
4        321163      bob@elsewhere.com  1

If this system’s database were the exhaustive, authoritative source for its data, it would seem reasonable to conclude that all of the table’s email addresses should be included in the dataset to be migrated.

However, in the old system’s user interface, a customer could only have two email addresses: one primary and one secondary. Due to a bug, some email address edits resulted in new address rows being created instead of existing rows being updated.  The system, perhaps inadvertently, was smart enough to ignore these old rows, so their presence didn’t cause it problems. Even though these rows were physically present in the database, they weren’t a part of the system’s authoritative data because from the user’s perspective they did not exist; instead, they were data corruptions which needed to be ignored during the export extraction process.

Continue reading

Asserting on IEnumerable<T> + Race Conditions = Danger

, , ,

Ran into this xUnit test failure the other day:

Assert.Equal() Failure
Expected: Boolean[] [True, False, False]
Actual: SelectListIterator<Process, Boolean> [True, False, False]

Why the failure?

The values displayed for expected and actual match. The only difference visible in the error message are the collection types…but that can’t be the source of failure because xUnit’s Assert.Equal<T>(IEnumerable<T> expected, IEnumerable<T> actual) doesn’t compare the collection types of its arguments, just elements and their positions.

So why the failure? Continue reading

Automating Column Name Renames

, , ,

Ever rename a table’s column names to transform their style from what makes sense in the database world to what looks attractive in a reporting environment?

Perhaps a data source returns field names like:

user_id, username, first_name, last_name, is_active_user

In your report, you’d like these columns to show up as:

User ID, Username, First Name, Last Name, Is Active User

The obvious way to achieve this is to use the query editor to manually rename each column. For five columns in one table (such as the above example), hand-transforming their names isn’t too tedious. However, imagine a report that uses ten or twenty tables, each with twenty or thirty columns—and it’s a different story.

Continue reading

SQL for C# Developers

, ,

You’re a developer. You know how to write code that interacts with databases. Using tools like Entity Framework and LINQ, you can push and pull data to and from data sources. But how comfortable are you directly talking to a relational database in its native dialect—SQL? Maybe you can cobble together a basic SELECT statement—but when you do, do you feel like a traveler in a strange land where you barely know the language?


Continue reading

Understanding Cancellation Callbacks

, , , ,

In the ideal world, all .Net asynchronous methods support cancellation tokens: When invoking a method, simply pass it a cancellation token. Then, at the appropriate time, cancel the token and the asynchronous operation terminates.

Alas! We don’t live in the ideal world. Not every method we might asynchronously invoke works with cancellation tokens. When faced with an asynchronous operation we want to be able to cancel that doesn’t support cancellation tokens, one option is to implement our own cancellation logic by registering a callback with the token. When the token is cancelled, callbacks registered with it are executed.

Introducing callback into the picture raises questions around if/when, where and how those callbacks are executed. For example:

  • Will the callback be invoked via the synchronization context that was current when it was registered?
  • If multiple callbacks are registered, are they run synchronously or in parallel?
  • If a callback raises an exception, can that exception be caught?

In the case of .Net’s Task Parallel Library and its CancellationToken and CancellationTokenSource, the answers to these questions revolve around when cancellation occurs and how it is triggered.
Continue reading