Adding an Error Details Column

, , ,

Errors here and there, scattered throughout a table—unfortunately, sometimes this is the reality, especially when source data quality is low. To help monitor issues, you decide you want an error details column. At minimum, it should show the name of each column in the row with an error. This will allow you to, say, filter to all rows where column “Item ID” is in an error state. Taking it a step further, it would be really nice if the error column also contained the error message associated with each erroneous column. With this, you could compute statistics like “how many times did we struggle to convert strings into dates”—information which could help identify the most frequently occurring data quality issues and so guide you to which you might want to consider addressing first.

Screenshot of table with error details column added

Here is one way to pull this off:

let
    Source = #table({"SomeColumn", "AnotherColumn" }, { {1, error "help"}, { 2, "B"}, { error "bad", "C"} }),
    RowErrorsToTable = 
        (row as record) as nullable table => let
            RecordToTable = Record.ToTable(row),
            AddedErrorsColumn = Table.AddColumn(RecordToTable, "Errors", each let Test = try [Value] in if Test[HasError] = true then Test[Error] else null),
            RemovedDataColumn = Table.RemoveColumns(AddedErrorsColumn, {"Value"}),
            FilteredToErrors = Table.SelectRows(RemovedDataColumn, each [Errors] <> null)
        in
            if Table.IsEmpty(FilteredToErrors) then null else FilteredToErrors,
    AddedErrorsColumn = Table.AddColumn(Source, "Errors", RowErrorsToTable)
in
    AddedErrorsColumn

How It Works

  1. It all starts with a source table (line 2). Here, pretend data is used with errors hard-coded in. In real life, Source would be a real table from a real source, with any errors being present because of real issues with the data and the transforms previously applied to it.
  2. Column Errors is added (line 11), with its value computed by calling RowErrorsToTable (line 3) for each row in the table.
  3. RowErrorsToTable is passed each table row as a record. It uses Record.ToTable (line 5) to unpivot this record into a new table, containing one row per record field, with the field’s name in column Name and its value in column Value.
    Record.ToTable's output
  4. Any errors present in column Value are caught and put in a new column named Errors (line 6). If Value is error-free for a particular row, its corresponding Errors value is set to null.
  5. Column Value is removed from the row’s table (line 7), as it is no longer needed, and that table is filtered (line 8) to exclude rows which don’t have an error.
    Final contents of row's error table
  6. The resulting row errors table (or null) is returned (line 10) as the value to be stored in the original table’s new Errors column for the given row.

The net effect is that the original table’s new Errors column contains a nested table holding error details for each row that has errors in any of its other columns. If, instead, the row’s other columns are all error free, the new column’s value is set to null. You can use this new error details column’s contents to filter and analyze the original table’s errors, as desired.

Leave a Reply

Your email address will not be published.