I would like to run a bash command from Haskell that involves Unicode file paths.
Haskell shows non-ASCII characters in strings as \escapes, e.g.
"beißen" -> "bei\223en"
Bash seems to accept the following formats:
$'bei\xC3\x9Fen.avi'
and 'beißen.avi'
runCommand from System.Process has the type
runCommand :: String -> IO System.Process.Internals.ProcessHandle
How do I encode the Haskell String to one of the formats that Bash accepts?
I am using Mac OS X 10.8.4, which has bash 3.2.
EDIT
My problem seems to be with bash escaping.
I am using Text.ShellEscape (http://hackage.haskell.org/packages/archive/shell-escape/0.1.2/doc/html/Text-ShellEscape.html) to escape the characters that need to be escaped for bash, e.g.
import qualified Data.ByteString.Char8 as B
import qualified Text.ShellEscape as Esc
let cmd = B.unpack $ Esc.bytes $ Esc.bash . B.pack $ "beißen.txt"
which gives me "$'bei\\xDFen.txt'"
When I run runCommand $ "ls " ++ cmd, it gives me
ls: bei�en.txt: No such file or directory
Is there a better way to escape strings for bash?
Data.ByteString.Char8 is almost never the right choice if you want to deal with non-ASCII text. It truncates every Char to 8 bits, which mangles your data. In your case you should probably use Data.ByteString.UTF8 instead (provided you use a UTF-8 locale, which is the case for most modern desktop Unix-like OSes).
Example of Data.ByteString.Char8 mangling data:
Prelude Data.ByteString.Char8> "été"
"e\769te\769"
Prelude Data.ByteString.Char8> unpack $ pack "été"
"e\SOHte\SOH"
Prelude Data.ByteString.Char8> Prelude.putStrLn "été"
été
Prelude Data.ByteString.Char8> Prelude.putStrLn $ unpack $ pack "été"
ete
Use Data.ByteString.UTF8.toString and not Data.ByteString.Char8.unpack.
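To see why the escape in the question came out as \xDF rather than the \xC3\x9F that bash expects, it may help to compare the raw bytes directly. This is only a sketch: it uses Data.Text.Encoding.encodeUtf8 from the text package as a stand-in for Data.ByteString.UTF8.fromString (an assumption made so that the example needs no extra package), and the names truncatedBytes and utf8Bytes are made up for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- Char8.pack keeps only the low 8 bits of each code point.
truncatedBytes :: String -> [Word8]
truncatedBytes = BS.unpack . C8.pack

-- UTF-8 encoding, here via the text package (stand-in for
-- Data.ByteString.UTF8.fromString).
utf8Bytes :: String -> [Word8]
utf8Bytes = BS.unpack . TE.encodeUtf8 . T.pack

main :: IO ()
main = do
  -- "bei\223en" is "beißen"; ß is U+00DF.
  print (truncatedBytes "bei\223en")  -- [98,101,105,223,101,110]
  print (utf8Bytes "bei\223en")       -- [98,101,105,195,159,101,110]
```

The single byte 223 (0xDF) is what ends up in the escaped command when Char8 is used, while the file name on a UTF-8 system contains the two bytes 195 159 (0xC3 0x9F), so ls cannot find the file.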
These invocations
let s = toString $ bytes $ bash $ fromString "мама.sh"
runCommand s
runCommand $ "ls -l " ++ s
work for me from within ghci ("мама.sh" is a shell script with some Cyrillic characters in the name).
Of course, if you escape the entire command, the whitespace between words gets escaped too and the command will not work. Escape each word of the command individually.
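Escaping word by word could look roughly like the sketch below. It does not use Text.ShellEscape; instead it hand-rolls plain single-quote quoting (the 'beißen.avi' form that the question says bash accepts), so shellQuote and buildCommand are hypothetical helpers, not part of any library. It also assumes that the resulting String reaches the shell in the locale's encoding, which I have not checked for every GHC version.

```haskell
-- Wrap one word in single quotes. Inside single quotes bash treats
-- everything literally, so only the quote character itself needs care:
-- close the quote, emit an escaped quote, reopen the quote.
shellQuote :: String -> String
shellQuote s = "'" ++ concatMap escapeChar s ++ "'"
  where
    escapeChar '\'' = "'\\''"
    escapeChar c    = [c]

-- Quote every argument separately and join with unquoted spaces,
-- so the shell still sees word boundaries between them.
buildCommand :: String -> [String] -> String
buildCommand prog args = unwords (prog : map shellQuote args)

main :: IO ()
main = do
  putStrLn (buildCommand "ls" ["-l", "my file's.txt"])
  -- prints: ls '-l' 'my file'\''s.txt'
  -- The result could then be handed to runCommand, e.g.
  --   runCommand (buildCommand "ls" ["-l", "мама.sh"])
```

Non-ASCII characters pass through shellQuote unchanged, so no byte-level escaping is involved; only the quoting of spaces and apostrophes is handled here.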