Blog Post

Power BI: Dynamically Removing Errors From Columns in M

,

If you have excel data, or user-entered data, you have likely experienced frustration from repeatedly seeing the error message below upon hitting “Close & Apply” or refreshing your data model. It can be extremely frustrating to see failed refresh emails come in every time a user types in a text value in a number field or adds a formula to the report that results in #N/A.

End users that are not trained in data governance but are actively involved in maintaining a data set can easily make data entry mistakes or create inconsistent data types within columns. For example, you may have a column in the dataset called “Sales” and instead of 0, someone may type “None” or “NA”. When this gets loaded into Power BI, Power BI will not know how to convert the text value “None” to a number, and it will throw an error on the refresh of the report.

One way to mitigate the impact of user-entered data is to replace errors with null values. This is not ideal since it doesn’t fix the data entry issues, but it does enable reports to still be refreshed and used while the data issues are addressed. In Power Query, you can manually replace the errors with null by going to the “Transform” tab then selecting the drop down for “Replace Values” and choosing “Replace Errors”.

After selecting “Replace Errors”, type null (all lowercase) into the value field to replace your errors with a null value that will allow the report to refresh no matter the data type.

As you can imagine, it can get quite tedious to apply this step to every column in every query in your report. Good news! I have a query that you can add to the end of your applied steps that will make sure every single column in your query will replace errors with nulls. Let’s walk through it.

Final Query:

let
    Source = Excel.Workbook(File.Contents("C:UsersKristynaHughesOneDrive - DataOnWheelsGitHubAdventureWorks Sales w Errors.xlsx"), null, true),
    Sales_data_Sheet = Source{[Item="Sales_data",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Sales_data_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"SalesOrderLineKey", Int64.Type}, {"ResellerKey", Int64.Type}, {"CustomerKey", Int64.Type}, {"ProductKey", Int64.Type}, {"OrderDateKey", Int64.Type}, {"DueDateKey", Int64.Type}, {"ShipDateKey", Int64.Type}, {"SalesTerritoryKey", Int64.Type}, {"Order Quantity", Int64.Type}, {"Unit Price", type number}, {"Extended Amount", type number}, {"Unit Price Discount Pct", Int64.Type}, {"Product Standard Cost", type number}, {"Total Product Cost", type number}, {"Sales Amount", type number}, {"Large Sales", type number}}),
    //ColumnsInTable grabs all the column names in your existing table dynamically
    ColumnsInTable = Table.ColumnNames(#"Changed Type"), 
    //Converting this list of columns to a table lets us add a column for what we will replace errors with
    #"Converted to Table" = Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    //This adds a column to show what we will replace errors with
    #"Added ReplacementColumn" = Table.AddColumn(#"Converted to Table", "Replacement", each null),
    #"Transposed Table" = Table.Transpose(#"Added ReplacementColumn"),
    //Transforms the table back into a list so we can use it in our replace error step
    ColumnList = Table.ToColumns(#"Transposed Table"),
    //Takes our last step in our original table called Changed Type and replaces all errors in all our columns with null
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type",ColumnList)
in
    #"Replaced Errors"

In the example above, #”Changed Type” is the last step in my original query but this resulted in some errors in my table (see screenshot below).

In order to apply a replaced error step to all the columns in the table, we need to understand a bit more about the Table.ReplaceErrorValues function. This function is comprised of two arguments: (table as table, errorReplacement as list). The table portion will be the name of our last step because that is the final table that we want to manipulate (#”Changed Type” in this example). The error replacement list will require two values – the column name and the value we want to replace errors with. To build the list, follow the steps below.

  1. Table.ColumnNames(#”Changed Type”) : this function will look at the last step in your query (#”Changed Type” in this example) and generate a list of the columns

2. Table.FromList(ColumnsInTable, Splitter.SplitByNothing(), null, null, ExtraValues.Error) : this function converts our list of columns into a table. The reason we are converting a list to a table then back to a list is so we can easily add in a column of nulls to our list.

3. Table.AddColumn(#”Converted to Table”, “Replacement”, each null) : this function adds a column called “Replacement” that contains the null values that we want to replace our errors with.

4. Table.Transpose(#”Added ReplacementColumn”) : transposing the table will make it easier to put our values into a list appropriately. We need the final result to be a comma separated record that has the column name then the replacement null value.

5. Table.ToColumns(#”Transposed Table”) : this function flips our transposed table back into a table full of lists that we can use in our final function. By clicking on the cell in our list of lists, you can see that the list is comprised of our column name and our replacement value. Pretty handy for our replaced error function.

6. Table.ReplaceErrorValues(#”Changed Type”,ColumnList) : time to replace our errors. Notice that we have #”Changed Type” as our first argument. This will force Power Query to grab our original last step and use that version of our table for the function. ColumnList is the name of the step that contained our list of lists with our column names and null values inside.

And there ya have it! To use the query above, you’ll need to swap #”Changed Type” with whatever the name is of your final step in your original query. Pretty handy when dealing with user-entered data, but keep in mind the ideal solution would be to put more data governance around the original data entry. For example, a simple application that forces people to enter the correct data type or else it throws an error or a drop down selection with valid options (this would be a great option for zip codes or states) would prevent data entry issues in the first place and raise the quality of your data.

Additional Resources

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating