
LLVM, Parrot, JVM, PyPy + python


What are the problems in implementing a language such as Python, together with optimization techniques, on top of LLVM or Parrot?

PyPy, LLVM and Parrot are the main technologies for common platform development.
I see it like this:

  • PyPy - a framework for building VMs, with a built-in optimized VM for Python.
    So it is quite a general solution. The process goes as follows:
    1. dynamic_language_code ->
    2. PyPy frontend ->
    3. PyPy internal code - bytecode ->
    4. PyPy optimization ->
    5. leaving PyPy code and:
      a. PyPy backend for some VM (like the JVM)
      b. some kit to make your own VM
      c. processing/running PyPy internal code

Am I right about this process? Is there an optimized VM for Python? In particular, is there by default a built-in VM for optimized PyPy code (step 5.c), which is meant for Python but at which the processing of any language can stop and be run?

  • Parrot - much like PyPy, but without 5.a and 5.b? It has some internal improvements for dynamic processing (Parrot Magic Cookies).

Both Parrot and PyPy are designed to create a platform that provides a common runtime for dynamic languages, but PyPy wants more: it also wants to create more VMs.
What is the sense of PyPy? Why do we need to create more VMs? Wouldn't it be better to focus on one VM (as Parrot does), because there is one common code level, either PyPy internal bytecode or Parrot's? I think we gain nothing by translating PyPy bytecode to VMs newly created with PyPy.

  • LLVM - I see this as very similar to PyPy but without the VM generator.
    It is a mature, well-designed environment with goals similar to PyPy's, but it works on low-level structures and has great optimization/JIT techniques implemented.

I see it like this: LLVM is general purpose, but Parrot and PyPy are designed for dynamic languages. In PyPy / Parrot it is easier to introduce some complicated techniques, because they are more high level than LLVM, like a sophisticated compiler which can better understand high-level code and produce better assembler code (which humans can't write in reasonable time) than the LLVM one?

Questions:

  1. Am I right? Is there any reason why porting some dynamic language to LLVM would be better than porting it to, for example, Parrot?

  2. I haven't seen any activity on developing Python on Parrot. Is it because Python C extensions don't work on Parrot? PyPy has the same problem.

  3. Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under heavy development and has big companies investing in it.

  4. I know the problem might be in the resources needed to recompile, if the bytecode needs to change. But that is not obligatory, as we can try to port the old bytecode to the new one while new compilers produce the new bytecode (nevertheless, Java still needs to interpret its own bytecode, so the frontend can check it and translate it to the new bytecode)?

  5. What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?

  6. Can you correct me if I'm wrong somewhere?

Some additions:

=============

CLARIFICATION

I want to figure out how all this software fits together, and what the problems are in porting one to another.


Solution

  • That's not something anybody can possibly answer in a Stack Overflow question, but I'll give it a minimal shot.

    First, what problems do the three projects solve?

    1. PyPy allows you to implement an interpreter in a high-level language, and you get a generated JIT for free. The good thing about this is that you don't have a dependency mismatch between the language and the platform. That's the reason why pypy-clr is faster than IronPython. More info here: http://codespeak.net/pypy/dist/pypy/doc/extradoc.html --> High performance implementation of Python for CLI/.NET with JIT compiler generation for dynamic)

    2. LLVM is a low-level infrastructure for compilers. The general idea is to have one "high-level assembly". All the optimizations work on that language. Then there is tons of infrastructure around it to help you build compilers (JIT or AOT). Implementing a dynamic language on LLVM is possible but needs more work than implementing it on PyPy or Parrot. You can't, for example, get a GC for free (there are GCs you can use together with LLVM; see http://llvm.org/devmtg/2009-10/ --> the vmkit video). There are attempts to build a platform better suited to dynamic languages based on LLVM: http://www.ffconsultancy.com/ocaml/hlvm/

    3. I don't know that much about Parrot, but as I understand it they want to build one standard VM specialized for dynamic languages (Perl, PHP, Python ...). The problem here is the same as with compiling to the JVM/CLR: there is a dependency mismatch, just a much smaller one. The VM still does not know the semantics of your language. As I understand it, Parrot is still pretty slow for user code. (http://confreaks.net/videos/118-elcamp2010-parrot)
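
    To make point 1 more concrete, here is a minimal sketch of the kind of program you hand to such a toolchain: an interpreter for a tiny stack language, written in a high-level language. The opcode names and the program encoding are made up for this illustration. PyPy's toolchain additionally requires the interpreter to be written in a restricted subset of Python (RPython) and annotated with JIT hints; neither is shown here, so this is plain Python only.

```python
# Toy stack-machine interpreter. The opcodes and the (opcode, argument)
# encoding are hypothetical and exist only for this illustration.
PUSH, ADD, MUL, HALT = range(4)


def run(program):
    """Execute a list of (opcode, argument) pairs and return the final stack."""
    stack = []
    pc = 0  # index of the next instruction to execute
    while True:
        op, arg = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return stack
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# (2 + 3) * 4
PROGRAM = [
    (PUSH, 2),
    (PUSH, 3),
    (ADD, None),
    (PUSH, 4),
    (MUL, None),
    (HALT, None),
]

result = run(PROGRAM)
```

    The interpreter itself encodes the semantics of the language, which is why a JIT derived from it does not suffer from the mismatch described in point 3.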

    The answers to your questions:

    Am I right? Is there any reason why porting some dynamic language to LLVM would be better than porting it to, for example, Parrot?

    That's a question of effort. Building everything yourself, specialized for your language, will eventually be faster, but it's a LOT more effort.

    I haven't seen any activity on developing Python on Parrot. Is it because Python C extensions don't work on Parrot? PyPy has the same problem.

    Targeting Parrot would (at this point) not likely have a benefit over PyPy. Why nobody else does it, I don't know.

    Why don't other VMs want to move to LLVM / Parrot, e.g. Ruby -> Parrot, CLR / JVM -> LLVM? Wouldn't it be better for them to move to a more sophisticated solution? LLVM is under heavy development and has big companies investing in it.

    OK, there is a lot of stuff in that question.

    • Like I said, LLVM is hard to move to and Parrot is not that fast (correct me if I'm wrong).
    • Ruby has Rubinius, which tries to do a lot in Ruby and JITs to LLVM (http://llvm.org/devmtg/2009-10/ --> Accelerating Ruby with LLVM).
    • There are implementations of the CLR/JVM on LLVM, but both already have very mature implementations that have big investments behind them.
    • LLVM is not high level.
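
    To illustrate what "not high level" means in practice, here is a rough sketch that lowers an arithmetic expression to three-address instructions. The output only imitates the look of LLVM IR and is not checked against LLVM; the function name `lower` and the fixed `i64` type are assumptions made for this example. The fixed type is exactly the kind of decision a dynamic language frontend cannot make without extra work.

```python
import ast

# Mapping from Python AST operator nodes to instruction mnemonics.
OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def lower(expr_source):
    """Lower an arithmetic expression to a list of three-address instructions.

    Every value is assumed to be a 64-bit integer, which a real frontend for
    a dynamic language could not assume in general.
    """
    instructions = []

    def emit(node):
        # Return the operand text for node, appending instructions as needed.
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.Name):
            return "%" + node.id
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = emit(node.left)
            right = emit(node.right)
            temp = "%t" + str(len(instructions))
            instructions.append(
                "%s = %s i64 %s, %s" % (temp, OPS[type(node.op)], left, right)
            )
            return temp
        raise NotImplementedError(type(node).__name__)

    result = emit(ast.parse(expr_source, mode="eval").body)
    instructions.append("ret i64 " + result)
    return instructions


lowered = lower("a + b * 2")
for line in lowered:
    print(line)
```

    At this level there are no objects, no method lookup and no garbage collector; a frontend for a dynamic language has to express all of that in terms of such low-level instructions.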

    I know the problem might be in the resources needed to recompile, if the bytecode needs to change. But that is not obligatory, as we can try to port the old bytecode to the new one while new compilers produce the new bytecode (nevertheless, Java still needs to interpret its own bytecode, so the frontend can check it and translate it to the new bytecode)?

    I have no idea what the question is.

    What are the problems with linking, for example, JVM libraries inside LLVM (if we somehow port Java/JVM/Scala to LLVM)?

    Watch the VMKit video I linked above; it shows how far they got and what the problem is (and how they solved it).

    Can you correct me if I'm wrong somewhere?

    Lots of the stuff you wrote is wrong, or I just don't understand what you mean, but the stuff I linked should make a lot of it clearer.


    Some examples:

    Clojure

    The creator didn't want all the work of implementing his own VM and all the libraries. So where to go? Since Clojure is a new language, you can build it in a way that works well on a platform like the JVM by restricting a lot of the dynamic stuff a language like Python or Ruby would have.

    Python

    The language can't (practically) be changed to work well on the JVM/CLR, so implementing Python on those won't bring massive speedups. A static compiler won't work very well either, because there are not many static guarantees. Writing a JIT in C will be fast but very hard to change (see the Psyco project). Using the LLVM JIT could work and is being explored by the Unladen Swallow project (again http://llvm.org/devmtg/2009-10/ --> Unladen Swallow: Python on LLVM). Some people wanted to have Python in Python, so they started PyPy, and their idea seems to work really well (see above). Parrot could work as well, but I have not seen anybody try (feel free).
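
    A small illustration of the "not many static guarantees" point, with names (`add`, `Point`) made up for the example: the same function body means something different for every argument type, and the meaning of `+` for a class can even change while the program runs.

```python
def add(a, b):
    # Nothing here tells a compiler what "+" will do at run time.
    return a + b


class Point:
    def __init__(self, x):
        self.x = x


# One function, three unrelated operations: integer addition,
# string concatenation and list concatenation.
results = [add(1, 2), add("py", "py"), add([1], [2])]

# Point does not support "+" yet, so this call fails at run time.
try:
    add(Point(1), Point(2))
    supported_before = True
except TypeError:
    supported_before = False

# The class is changed while the program runs; from now on the same
# call succeeds without add() having been touched.
Point.__add__ = lambda self, other: Point(self.x + other.x)
total = add(Point(1), Point(2))
```

    A static compiler has to assume any of this can happen, whereas a JIT can observe which types actually show up and specialize for them.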


    On everything:

    I think you're confused, and I can understand that. Take your time and read, listen to and watch everything you can get. Don't stress yourself. There are a lot of parts to this; eventually you will see how they fit together and what makes sense, and even when you know a lot there is still a lot to discuss. The question of where to implement a new language or how to speed up an old language has many answers, and if you ask three people you're likely to get three different answers.