I tried the following:
sed -e 's/ü/\\"u/g' filename.tex>filename2.tex
but my terminal doesn't recognise the umlaut, so it replaces every u with \"u. I know that TeX has packages and what-nots that might solve this problem, but I am interested in a sed way for the moment.
The fundamental problem is that there is a complex interaction between sed, your locale, your terminal, your shell, and the file you are operating on. Here is a list of things to try.
If you are lucky, your shell, sed, and the file you are working on have complete agreement on what the character you are trying to replace should be represented as. In your case, you already tried that, and it failed.
sed 's/ü/\\"u/g' filename.tex
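To see where the disagreement lies, it can help to compare what the locale expects, what the shell actually sends for a typed ü, and what is in the file. The following is only a sketch: the sample file is a hypothetical stand-in for filename.tex, and it assumes od, grep, and mktemp are available (locale may be missing on minimal systems).

```shell
# Hypothetical sample standing in for filename.tex: "für" as UTF-8 bytes
sample=$(mktemp)
printf 'f\303\274r\n' > "$sample"

# Character encoding the current locale expects, e.g. UTF-8 or ISO-8859-1
locale charmap

# Bytes the shell hands to sed for a typed ü:
# "c3bc" is UTF-8, "fc" is Latin-1 or Latin-9
typed=$(printf '%s' 'ü' | od -An -tx1 | tr -d ' \n')
echo "typed ü: $typed"

# Number of lines in the file containing exactly those bytes;
# 0 means the shell and the file disagree about the encoding
matches=$(LC_ALL=C grep -c -F 'ü' "$sample")
echo "matching lines: $matches"
```

If the count is 0 although the file visibly contains the character, the shell and the file are most likely using different encodings.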
If you are only slightly less lucky, the other parts are fine, and it's just that your sed is not modern enough to grok the character sequence you are trying to replace. A trivial sed script like yours can be simply passed to perl instead, which usually is more up to date when it comes to character encodings.
perl -pe 's/ü/\\"u/g' filename.tex
If the character encoding is UTF-8, you may need to pass a -CSD option to Perl, and/or express the character you wish to replace with an escape of some sort. You can say \xfc for a raw hex code (that happens to be ü in Latin-1 and Latin-9) or \x{00fc} for a Unicode character, or even \N{LATIN SMALL LETTER U WITH DIAERESIS}; but notice that Unicode has several representations for this glyph (the precomposed U+00FC, or a plain u followed by the combining diaeresis U+0308, depending on how the text was normalized). See also http://perldoc.perl.org/perlunicode.html
(For in-place editing, perhaps you want to add the -i option, too.)
Finally, you may need to break down and simply figure out the raw bytes of the character code you want to replace. A few lines of hex dump of the problematic file should be helpful. After that, Perl should be able to cope, but you need to figure out how to disable character set encoding and decoding etc. If, say, you find out that the problematic sequence is 0xFF 0x03, then
perl -pe 's/\xff\x03/\\"u/g' filename.tex
should work.