Search code examples
htmlpowerbidata-conversion

Convert HTML column to plain text in Power BI


Power Bi imports (e.g. from Outlook calendar) some data with HTML tags.

I need to remove these and the simplest method is to create a custom column using Html.Table

Html.Table([HTML text],{{"Plain text",":root"}})

where "Plain text" is the new column.

This works fine, except if there are empty cells in the "HTML text" column. The first empty cell (null) causes an error:

"We cannot convert the value null to type Text"

And the conversion stops.

Is there a condition that can be applied "If null then null else" to solve this error?

Many thanks!

P.S. I have seen @Alexis Olson thread Convert HTML Table to plain Text in Power BI but the Html.Table solution is simpler.


Solution

  • Assuming the HTML is in a column titled Description, you can use the FieldValuesAsText column to convert the column to text:

    1. Open the query editor (Transform data).
    2. Remove the Description column.
    3. Scroll right and select the expander icon for the “FieldValuesAsText” column.
    4. Deselect all available fields except the “Description” column.
    5. Deselect the “Use original column name as prefix” option.

    screenshot

    1. Select OK.

    You now have a Description column free of all HTML formatting tags.

    Consider duplicating the “FieldValuesAsText” column before using the only one in the dataset!

    Solution by John White – full description here: https://whitepages.unlimitedviz.com/2018/04/power-bi-report-rich-text-sharepoint/#comment-55053