text translation decode encode

What encoding is this and how can I decode it?


I've got an old project file with translations to Portuguese where special characters are broken:

error.text.required=\u00C9 necess\u00E1rio o texto.
error.categoryid.required=\u00C9 necess\u00E1ria a categoria.
error.email.required=\u00C9 necess\u00E1rio o e-mail. 
error.email.invalid=O e-mail \u00E9 inv\u00E1lido.
error.fuel.invalid=\u00C9 necess\u00E1rio o tipo de combust\u00EDvel.
error.regdate.invalid=\u00C9 necess\u00E1rio ano de fabrica\u00E7\u00E3o.
error.mileage.invalid=\u00C9 necess\u00E1ria escolher a quilometragem.
error.color.invalid=\u00C9 necess\u00E1ria a cor.

Can you tell me how to decode the file to use the common Portuguese letters?

Thanks


Solution

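  • The "\u" prefix marks a Unicode escape: "\u" followed by four hexadecimal digits stands for the character with that code point, so "\u00C9" is "É". The text is not corrupted. You can use the strings as they are, and the diacritics will show in the output once the escapes are interpreted. In Python 3, for example:

    print("\u00C9 necess\u00E1rio o texto.")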
    

    which outputs:

    É necessário o texto.

    Otherwise, you need to convert them to their ASCII equivalents (for example, "É" to "E"). A simple find/replace will do. I ended up writing a function like that for converting Romanian diacritics a while ago, but I had dynamic strings coming in...
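
Both options above can be sketched in a few lines of Python 3. This is only a minimal sketch, not a full parser for the file format: the helper names `decode_escapes` and `strip_diacritics` are made up for illustration, and the regex assumes every escape is a plain `\u` plus four hex digits (it does not treat a doubled backslash specially or combine surrogate pairs).

```python
import re
import unicodedata

# Matches escapes such as \u00C9: a backslash, 'u', then four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def decode_escapes(text):
    """Replace each \\uXXXX escape with the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def strip_diacritics(text):
    """Drop accents by decomposing characters and removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# One line from the question, as it would be read raw from the file.
line = r"error.fuel.invalid=\u00C9 necess\u00E1rio o tipo de combust\u00EDvel."
decoded = decode_escapes(line)
print(decoded)                    # error.fuel.invalid=É necessário o tipo de combustível.
print(strip_diacritics(decoded))  # error.fuel.invalid=E necessario o tipo de combustivel.
```

To convert the whole file, apply `decode_escapes` to each line and write the result out as UTF-8. If the file is a Java `.properties` file, which the `key=value` layout suggests but the question does not confirm, Java already interprets these escapes when it loads the file, so converting may not be needed at all.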