linux, sed, special-characters

How to remove degree symbol (M-0 aka superscript zero?) with sed


I have a file that includes temperatures along with a degree symbol that I want to remove. It looks like this in Notepad++:

40238230,194°,47136

The symbol does not print with a plain cat:

40238230,194,47136

But cat -e shows M-0 where the symbol is:

40238230,194M-0,47136

How can I get rid of that symbol? I thought the following sed would do it (by keeping only digits and commas), but it doesn't:

sed -r 's/[^0-9\,]//g'

Solution

  • Could it be that you have not set up your console to use Unicode?

    The degree sign is Unicode U+00B0. In UTF-8 it is encoded as the two bytes \xc2\xb0. So if your console is not using Unicode, you will have to replace those two bytes.

    The M- notation is described in: What is the "M- notation" and where is it documented?

    M-0 is 0xb0
    

    On a console with Unicode enabled I get:

    $ cat foo
    122 °C
    $ cat -e foo
    122 M-BM-0C$
    

    For removing it with sed, see: Remove unicode characters from textfiles - sed , other bash/shell methods
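
As a rough sketch of the removal step: a lone M-0 in the question's cat -e output suggests that the file stores the degree sign as the single byte 0xb0 (a Latin-1 style encoding) rather than as the UTF-8 pair. In a UTF-8 locale, GNU sed generally treats such a stray byte as an invalid character that a bracket expression like [^0-9,] does not match, which would explain why the original command left it in place. Forcing the C locale makes the tools work byte by byte. The temporary file and the variable names below are made up for this illustration, and the sample line is rebuilt from the question.

```shell
#!/bin/sh
# Hypothetical scratch file holding the sample line from the question.
# \260 is octal for 0xb0, the single-byte form of the degree sign.
sample=$(mktemp)
printf '40238230,194\260,47136\n' > "$sample"

# Confirm what is stored: cat -v renders the byte as M-0.
visible=$(cat -v "$sample")
printf '%s\n' "$visible"        # 40238230,194M-0,47136

# Option 1: sed in the C locale, where every byte is one character,
# deleting anything that is not a digit or a comma.
cleaned_sed=$(LC_ALL=C sed 's/[^0-9,]//g' "$sample")
printf '%s\n' "$cleaned_sed"    # 40238230,194,47136

# Option 2: tr, keeping only digits, commas and newlines.
cleaned_tr=$(LC_ALL=C tr -cd '0-9,\n' < "$sample")
printf '%s\n' "$cleaned_tr"     # 40238230,194,47136

rm -f "$sample"
```

Both commands should also strip the two-byte UTF-8 form (0xc2 0xb0), since neither byte is a digit or a comma. Note that they drop every other character as well, so they only fit files that are meant to contain nothing but digits, commas and newlines.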