I have a file that includes temperatures along with a degree symbol that I want to remove. It looks like this in Notepad++:
40238230,194°,47136
The symbol does not print with a plain cat:
40238230,194,47136
But cat -e shows M-0 where the symbol is:
40238230,194M-0,47136
How can I get rid of that symbol? I thought the following sed would do it (by keeping only digits and commas), but it doesn't:
sed -r 's/[^0-9\,]//g'
Could it be that you have not set up your console to use Unicode?
The degree sign is Unicode ° (U+00B0). In UTF-8 this is the two bytes \xc2\xb0. So if your console is not using Unicode, you will have to replace those two bytes.
The M- notation is described here: What is the "M- notation" and where is it documented?
M-0 is the byte 0xb0: the M- prefix means the high bit (0x80) is set, and the rest, 0x30, is the character 0.
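As a quick check of the notation, here is a small sketch that feeds raw bytes to cat -v (supported by GNU and BSD cat; the sample strings are made up for illustration). A byte with the high bit set is printed as M- followed by the character for that byte minus 0x80, so 0xb0 shows up as M-0 and 0xc2 as M-B.

```shell
# Latin-1 style: the degree sign is the single byte 0xb0 (octal 260).
printf '194\260\n' | cat -v      # prints 194M-0

# UTF-8: the degree sign is the two bytes 0xc2 0xb0 (octal 302 260).
printf '194\302\260\n' | cat -v  # prints 194M-BM-0
```

Since the output in the question shows M-0 rather than M-BM-0, the file there appears to contain the single byte 0xb0, not the UTF-8 pair.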
On a console with Unicode enabled I get:
$ cat foo
122 °C
$ cat -e foo
122 M-BM-0C$
For removing it with sed, read: Remove unicode characters from textfiles - sed , other bash/shell methods
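For the single-byte case shown in the question, a minimal sketch could look like the following. The function names are hypothetical, and it assumes the file really contains a lone 0xb0 byte. Forcing the C locale makes the tools treat every byte as one character. The original sed probably failed because, in a UTF-8 locale, a stray 0xb0 is not a valid character and so the bracket expression may not match it.

```shell
# Hypothetical helper: delete every 0xb0 byte (octal 260) from stdin.
strip_degree_byte() {
    LC_ALL=C tr -d '\260'
}

# Hypothetical helper: the sed from the question, run in the C locale
# so that each byte counts as one character.
keep_digits_and_commas() {
    LC_ALL=C sed 's/[^0-9,]//g'
}

# Sample line built with printf so the raw byte is explicit.
printf '40238230,194\260,47136\n' | strip_degree_byte
printf '40238230,194\260,47136\n' | keep_digits_and_commas
```

Both commands print 40238230,194,47136 for this sample. If the file were UTF-8 instead, tr -d '\302\260' would drop both bytes of the degree sign, but it would also delete those byte values wherever else they occur, which is only safe for data like this that is otherwise plain ASCII.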