
Handling comments in code when doing i18n


I'm in the process of translating an open-source project from Chinese to English, and I've used i18n (in this case Babel) to separate the code from both the English and Chinese translations.

Everything's done, except for a rather large number of inline comments in the code.

Obviously, Babel can't translate comments inline (and it would be rather obnoxious if it did anyway, since the code would then differ between languages and be harder to verify).

The way I see it, there are a number of options:

  1. Leave comments in -

    Pro: Helps original author, etc.

    Con: Makes it distracting for ongoing translation and anyone who doesn't speak the language

  2. Strip out all the comments -

    Pro: Code is native-language-agnostic, so it makes sense. Who needs comments anyway? Use the source, Luke!

    Con: Goes against SE principles. Could lose something important in understanding how the code works - maybe something's been done to avoid a security risk, etc.

  3. Place English comments near Chinese comments

    (Possibly moved to lines above and prefixed with "EN" and "ZH", for example).

    Pro: Best of both worlds, comments kept close to code

    Con: Not conducive to dictionary-style translation. Can get bulky with more languages.

  4. Create a comment dictionary / notes

    Pro: Keeps the comments in a separate file for easy translation.

    Con: Difficult to keep synced with code. It's easy to forget to update the matching comments when changing code.

  5. Use a different preprocessor for i18n before/after each development cycle.

    Pro: Comments et al would be in your language. Could link this to git pull/push so you only ever see the code in your language.

    Con: Bulky, non-obvious process. Could result in code-verification or even compilation errors.

None of these seems like a really great solution.

If you do a lot of this, and the code is shared publicly between developers who don't share a native tongue, is there a recommended way to handle translating (or not translating) comments in the code itself?


Solution

  • Short Answer

    It seems to be a mixture of:

    1. Strip out all the comments, and
    2. Place English comments near Chinese comments.

    Inline comments are almost always trivial - Strip them

    Function-level comments are not as intrusive - Translate them (possibly with an i18n prefix, e.g. "[cn]:" or "[en]:").
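For the "strip the inline ones" half, here is a minimal sketch, assuming the code base is Python (the question doesn't say). The function name `strip_inline_comments` is my own invention; it relies only on the standard-library `tokenize` module. It removes comments that trail code on the same line and leaves full-line comments and docstrings alone, so those can be translated instead.

```python
import io
import tokenize


def strip_inline_comments(source: str) -> str:
    """Drop comments that share a line with code; keep full-line comments.

    Hypothetical helper for illustration, not part of Babel or any i18n tool.
    """
    # Read the physical lines the same way the tokenizer will see them.
    lines = io.StringIO(source).readlines()
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start  # row is 1-based, col is 0-based
        line = lines[row - 1]
        if line[:col].strip() == "":
            continue  # full-line comment: leave it for translation
        # Keep the code before the comment and the original line ending.
        ending = line[len(line.rstrip("\r\n")):]
        lines[row - 1] = line[:col].rstrip() + ending
    return "".join(lines)
```

Because it works on tokens rather than a regex, a `#` inside a string literal is left untouched. One caveat: it also removes trailing directives such as `# noqa` or `# type: ignore`, so a real tool would probably need an allow-list.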

    Explanation

    My limited research suggests that larger projects try hard to minimize comments and let the code explain itself, focusing instead on code quality, which reduces the need for comments.

    e.g. from the Linux kernel coding guidelines:

    "NEVER try to explain HOW your code works in a comment: it's much better to write the code so that the working is obvious, and it's a waste of time to explain badly written code."

    ...and from the MySQL coding standards:

    "Comment your code when you do something that someone else may think is not trivial."

    Both of these standards (and others) also recommend minimal function descriptions, so they don't get in the way of understanding the code. And since function descriptions are generally multi-line and sit above the code itself, multiple languages can be included as necessary.

    Maybe someone, somewhere, has built an interface that can show only the comments in the reader's language, but I couldn't (yet) find one that does.
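That said, if comments carry a prefix like "[en]:" or "[cn]:", a crude version of such a filter is not hard to imagine. The following is only a sketch under that assumption: `view_in_language` is a hypothetical name, the code again assumes Python sources, and it is line-based, so it would also match a line inside a multi-line string that happens to look like a tagged comment.

```python
import re

# Matches a full-line comment tagged with a language, e.g. "# [en]: ..."
TAGGED = re.compile(r"^\s*#\s*\[(?P<lang>[A-Za-z-]+)\]:")


def view_in_language(source: str, lang: str) -> str:
    """Hide full-line comments tagged with a language other than `lang`.

    Untagged comments and all code lines pass through unchanged.
    """
    kept = []
    for line in source.splitlines(keepends=True):
        match = TAGGED.match(line)
        if match and match.group("lang").lower() != lang.lower():
            continue  # tagged for a different language: hide it
        kept.append(line)
    return "".join(kept)
```

The stored file keeps every language; only the reader's view is filtered, so the code itself stays identical for everyone.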