My sed attempts on RHEL 6.3:
$ export LC_ALL=fr_FR.utf-8
$ sed 's/ \([a-zA-Zé]\)\([^ ]*\) /[\u\1\L\2\E] /g' <<< " hélène NOËL étienne "
hélène NOËL étienne
$ export LC_ALL=C
$ sed 's/ \([a-zA-Zé]\)\([^ ]*\) /[\u\1\L\2\E] /g' <<< " hélène NOËL étienne "
[Hÿlÿne] [Noÿl] [ÿtienne]
$ sed --version
GNU sed version 4.2.1
[...]
Is sed able to output the following?
[Hélène] [Noël] [Étienne]
Does this work for you?
kent$ echo " hélène NOËL étienne "|sed -r 's/(\S)(\S+)/[\U\1\L\2]/g'
[Hélène] [Noël] [Étienne]
My sed version is a bit different from yours, but I think the line should run there too:
kent$ sed --version |head -1
sed (GNU sed) 4.2.2
I have also added my locale settings, in case they matter:
kent$ echo $LANG
en_US.utf8
kent$ locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
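
For reference, here is a rough breakdown of the substitution used above. It is only a sketch: the function name `capitalize_words` is made up for illustration, and it assumes GNU sed, since `\U`, `\L` and `\S` are GNU extensions. The input is plain ASCII on purpose, so the result should not depend on the locale; the accented characters in the question additionally need a UTF-8 locale such as the one listed above.

```shell
# Hypothetical helper: capitalize each whitespace-delimited word and wrap it
# in brackets, using GNU sed's case-conversion escapes.
capitalize_words() {
    # -r     : extended regular expressions, so ( ) group without backslashes
    # (\S)   : group 1, the first non-whitespace character of a word
    # (\S+)  : group 2, the rest of the word (one or more non-whitespace chars)
    # \U\1   : emit group 1 converted to upper case
    # \L\2   : switch to lower case and emit group 2
    sed -r 's/(\S)(\S+)/[\U\1\L\2]/g'
}

# ASCII-only input, so this does not rely on any particular locale.
echo "helene NOEL etienne" | capitalize_words
# prints: [Helene] [Noel] [Etienne]
```

Note that `(\S)(\S+)` needs at least two characters, so a one-letter word would be left untouched by this pattern.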