Tags: haskell, unicode, bytestring, aeson

Read unicode from JSON to String field using aeson


I receive JSON data using httpLbs and read it:

import qualified Data.ByteString.Lazy.UTF8 as LB

sendSimpleRequest :: Credentials -> IO LB.ByteString
sendSimpleRequest creds = do
    <...>
    let request = applyBasicAuth user pass $ fromJust $ parseUrl url
    manager <- newManager tlsManagerSettings
    response <- httpLbs request manager
    return $ responseBody response

After that, I can print the result with putStr . LB.toString and get "summary":"Обсуждение рабочих вопросов".

However, when I use aeson's decode to load this value into a record and print it:

data Fields = Fields
    { fi_summary :: String
    } deriving (Show, Generic)

instance FromJSON Fields where parseJSON = genericParseJSON parseOptions
instance ToJSON Fields where toJSON = genericToJSON parseOptions

parseOptions :: Options
parseOptions = defaultOptions { fieldLabelModifier = drop 3 }

parseAndShow = putStr . show . fromJust . decode

I get escaped characters: Fields {fi_summary = "\1054\1073\1089\1091\1078\1076\1077\1085\1080\1077 \1088\1072\1073\1086\1095\1080\1093 \1074\1086\1087\1088\1086\1089\1086\1074"}

It seems I need to configure aeson to decode the ByteString into a String correctly, but I don't want to write the FromJSON instances by hand because I have a dozen more structures like Fields. Changing the type of fi_summary is also an option, but nothing I have tried has worked so far.


Solution

  • If you're seeing escaped characters, then the data is in the string just fine. The default Show instance for String escapes every non-ASCII character as its decimal code point, which is what \1054\1073... is. So you've got the data; it's just a matter of outputting it appropriately.

    You might try using putStrLn on the string itself (fi_summary of the decoded value) rather than on the output of show, or maybe write it to a text file. (I know putStrLn sometimes does strange things if the locale is set wrong...)
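
To make the difference concrete, here is a minimal sketch that needs only base. The sample value is a hypothetical stand-in for the decoded record (aeson is left out so the snippet runs on its own), and the hSetEncoding call is an assumption added here to sidestep the locale issue mentioned above; it is not part of the original question.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- Same shape as the record in the question.
data Fields = Fields
    { fi_summary :: String
    } deriving (Show)

-- Hypothetical value standing in for the result of `decode`.
sample :: Fields
sample = Fields { fi_summary = "Обсуждение" }

main :: IO ()
main = do
    -- Assumption: force UTF-8 on stdout so the output does not depend on the locale.
    hSetEncoding stdout utf8
    -- show escapes each non-ASCII character as a decimal code point.
    putStrLn (show sample)
    -- Printing the String directly writes the characters themselves.
    putStrLn (fi_summary sample)
```

The first line of output is the escaped form from the question; the second is the readable Cyrillic text. The String held in the record is identical in both cases, only the rendering differs.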