Handling à when generating code for a string literal with Python 3.5's ast module: the source must be opened with the right encoding


In the Transcrypt Python-to-JavaScript compiler, Python 3.5's ast module is used to generate JavaScript from Python, in combination with the following code:

class Generator (ast.NodeVisitor):
    ...
    ...

    def visit_Str (self, node):
        self.emit (repr (node.s))  # Simplified to need less context on StackOverflow

    ...
    ...

This works fine e.g. for the following line of Python:

test = "âäéèêëiîïoôöùüû"

which is correctly translated to:

var test = 'âäéèêëiîïoôöùüû';

Only the character à causes problems:

test = "àâäéèêëiîïoôöùüû"

is translated to:

var test = 'Ã\xa0âäéèêëiîïoôöùüû';

Is there any way to have the ast module read the source file respecting coding directives like:

# coding=<encoding name>

Solution

  • To open a Python file for parsing, use tokenize.open rather than the ordinary open function.

    It opens the file, reads the PEP 263 coding hint, and returns the file object as if it had been opened by the ordinary open with the right encoding.

    This was quite hard to find, and it is not currently in the Green Tree Snakes doc. I actually found it by searching for 'coding' in the CPython sources on GitHub.

    I have created an issue asking for this to be added to the Green Tree Snakes doc.
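
A minimal sketch of the approach described above, not Transcrypt's actual code. The helper names parse_python_file, string_literals and demo are hypothetical, and the latin-1 test file is only an assumption chosen to make the coding line matter. On Python 3.5 string literals appear as ast.Str nodes with an s attribute, as in the question; the sketch checks for ast.Constant instead, which is what the parser produces from Python 3.8 onwards.

```python
import ast
import os
import tempfile
import tokenize


def parse_python_file(path):
    """Parse a Python source file, decoding it according to its coding line."""
    # tokenize.open detects the encoding from a BOM or a PEP 263 coding line
    # (falling back to UTF-8) and returns a read-only text-mode file object.
    with tokenize.open(path) as source_file:
        return ast.parse(source_file.read(), filename=path)


def string_literals(tree):
    """Collect the values of all string constants in a parsed tree."""
    return [
        node.value
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant) and isinstance(node.value, str)
    ]


def demo():
    """Write a latin-1 encoded source file, then parse it back."""
    source = '# coding=latin-1\ntest = "àâäéèêëiîïoôöùüû"\n'
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as handle:
        handle.write(source.encode("latin-1"))
        path = handle.name
    try:
        tree = parse_python_file(path)
    finally:
        os.remove(path)
    return string_literals(tree)


if __name__ == "__main__":
    print(demo())
```

Because the file is decoded before ast.parse sees it, the string literal reaches the visitor as the intended characters, whatever the platform's default encoding is.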