python macos lxml pypy

How can I set up lxml and pypy on Yosemite?


I wanted to do some learning with lxml and pypy, so I decided to get it set up on my Yosemite Mac. But after three days of trying, I still haven't been able to try lxml, because I can't get my setup right.

Here's what I've done:

  1. Did a clean install of Homebrew and the Xcode command-line tools (xcode-select --install):

    proix:~ user$ brew --version
    0.9.5
    
    proix:~ user$ gcc --version
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
    Target: x86_64-apple-darwin14.0.0
    Thread model: posix
    
  2. Brewed libxml2 and libxslt (libxml2 2.9.2 and libxslt 1.1.28); this worked fine. The libs were built and installed.

    proix:~ user$ brew list
    libxml2 libxslt
    
    proix:~ user$ brew info
    2 kegs, 409 files, 14M
    
    proix:~ user$ ll /usr/local/Cellar/libxml2/2.9.2/lib/
    total 6096
    drwxr-xr-x   8 user  admin      272 27 Dez 11:46 .
    drwxr-xr-x  13 user  admin      442 27 Dez 11:46 ..
    drwxr-xr-x   3 user  admin      102 27 Dez 11:46 cmake
    -r--r--r--   1 user  admin  1184284 27 Dez 11:46 libxml2.2.dylib
    -r--r--r--   1 user  admin  1922024 27 Dez 11:46 libxml2.a
    lrwxr-xr-x   1 user  admin       15 27 Dez 11:46 libxml2.dylib -> libxml2.2.dylib
    drwxr-xr-x   3 user  admin      102 27 Dez 11:46 pkgconfig
    -r--r--r--   1 user  admin      269 27 Dez 11:46 xml2Conf.sh
    
    proix:~ user$ ll /usr/local/Cellar/libxslt/1.1.28/lib/
    total 1440
    drwxr-xr-x  10 user  admin     340 27 Dez 12:10 .
    drwxr-xr-x  13 user  admin     442 27 Dez 12:10 ..
    -r--r--r--   1 user  admin   76728 27 Dez 12:10 libexslt.0.dylib
    -r--r--r--   1 user  admin  101832 27 Dez 12:10 libexslt.a
    lrwxr-xr-x   1 user  admin      16 27 Dez 12:10 libexslt.dylib -> libexslt.0.dylib
    -r--r--r--   1 user  admin  214344 27 Dez 12:10 libxslt.1.dylib
    -r--r--r--   1 user  admin  326040 27 Dez 12:10 libxslt.a
    lrwxr-xr-x   1 user  admin      15 27 Dez 12:10 libxslt.dylib -> libxslt.1.dylib
    drwxr-xr-x   4 user  admin     136 27 Dez 12:10 pkgconfig
    -r--r--r--   1 user  admin     288 27 Dez 12:10 xsltConf.sh
    
  3. But these new versions aren't being used:

    $ xmllint --version
    xmllint: using libxml version 20900
    
  4. So I switched the libs under /usr/lib via the Recovery console (cmd+R during boot). After rebooting I get the expected result:

    $ xmllint --version
    xmllint: using libxml version 20902
    

    A word of warning: do not attempt this during a normal login session. It renders your system unusable if the OS can no longer find libxml2.dylib.

  5. Create a virtualenv for testing:

    virtualenv lxmllab
    source lxmllab/bin/activate
    
  6. Install lxml with STATIC_DEPS=true sudo pip install lxml. Worked fine as well:

    (lxmllab)proix:~ user$ pip list
    backports.ssl-match-hostname (3.4.0.2)
    certifi (14.5.14)
    cffi (0.6)
    docutils (0.12)
    ipython (2.3.1)
    Jinja2 (2.7.3)
    lxml (3.4.1)
    MarkupSafe (0.23)
    nose (1.3.4)
    numpydoc (0.5)
    pip (6.0.3)
    py (1.4.26)
    Pygments (2.0.1)
    pyzmq (14.4.1)
    setuptools (8.2.1)
    Sphinx (1.2.3)
    tornado (4.0.2)
    
  7. Test it:

    (lxmllab)proix:~ user$ pypy -c 'from lxml import etree'
    Unknown libxml2 version: 20902
    Traceback (most recent call last):
      File "app_main.py", line 72, in run_toplevel
      File "app_main.py", line 562, in run_it
      File "<string>", line 1, in <module>
      File "lxml.etree.pyx", line 270, in init lxml.etree (src/lxml/lxml.etree.c:199039)
      File "lxml.etree.pyx", line 235, in lxml.etree.__unpackDottedVersion (src/lxml/lxml.etree.c:9383)
    TypeError: unsupported operand type for int(): 'unicode'
    
    (lxmllab)proix:~ user$ pypy
    Python 2.7.3 (5acfe049a5b0, May 21 2013, 13:47:22)
    [PyPy 2.0.2 with GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``redefining yellow seems like a
    better idea''
    >>>> from lxml import etree
    Unknown libxml2 version: 20902
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lxml.etree.pyx", line 270, in init lxml.etree (src/lxml/lxml.etree.c:199039)
      File "lxml.etree.pyx", line 235, in lxml.etree.__unpackDottedVersion (src/lxml/lxml.etree.c:9383)
    TypeError: unsupported operand type for int(): 'unicode'
    >>>> 
    

That's where I got stuck. I tried a couple of fixes, to no avail:

  • Setting LD_LIBRARY_PATH and/or DYLD_LIBRARY_PATH to the locations of libxml2.
  • Copying the libxml2 dylibs to virtualenv site-packages/lxml folder.

Does anybody know what I should do to get this to work, or what the correct way of getting lxml working under Yosemite is?
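
As a reading aid for the version numbers quoted in steps 3 and 4: libxml2 reports its version as a single integer, major * 10000 + minor * 100 + micro, so 20900 is 2.9.0 and 20902 is 2.9.2. A minimal Python sketch of that decoding follows; the function names are hypothetical and not part of lxml or libxml2.

```python
import re


def decode_libxml_version(number):
    """Split an integer such as 20902 into (major, minor, micro)."""
    major, rest = divmod(int(number), 10000)
    minor, micro = divmod(rest, 100)
    return (major, minor, micro)


def parse_xmllint_banner(line):
    """Pull the version tuple out of a line shaped like the xmllint output above."""
    match = re.search(r"libxml version (\d+)", line)
    if match is None:
        raise ValueError("no libxml version found in: %r" % line)
    return decode_libxml_version(match.group(1))


if __name__ == "__main__":
    # The two banners quoted in the question.
    print(parse_xmllint_banner("xmllint: using libxml version 20900"))
    print(parse_xmllint_banner("xmllint: using libxml version 20902"))
```

This only interprets the banner text; it does not tell you which libxml2 a given Python extension was linked against.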


Solution

  • PyPy does not work well with lxml, even in the cases where the import happens to succeed, because lxml is built with Cython, which generates code against the CPython C API. Consider using lxml-cffi instead: https://github.com/amauryfa/lxml/tree/cffi
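
Before debugging library paths, it can help to confirm which interpreter is actually running and which libxml2 lxml sees. The following is a minimal sketch, not a fix: describe_environment is a hypothetical helper name, while platform.python_implementation() and the etree version attributes (LXML_VERSION, LIBXML_VERSION, LIBXML_COMPILED_VERSION) are existing APIs. Any failure during the import, such as the TypeError in the traceback above, is recorded rather than raised.

```python
import platform


def describe_environment():
    """Report the interpreter and, if lxml imports, its libxml2 versions."""
    info = {
        "implementation": platform.python_implementation(),  # e.g. 'CPython' or 'PyPy'
        "python_version": platform.python_version(),
    }
    try:
        from lxml import etree
    except Exception as exc:
        # Covers a missing build (ImportError) as well as errors raised
        # while the extension module initialises.
        info["lxml_error"] = "%s: %s" % (type(exc).__name__, exc)
    else:
        info["lxml_version"] = etree.LXML_VERSION
        info["libxml_runtime"] = etree.LIBXML_VERSION            # library loaded at run time
        info["libxml_compiled"] = etree.LIBXML_COMPILED_VERSION  # headers used at build time
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s = %s" % (key, value))
```

Run it with both `python` and `pypy` inside the virtualenv: if the implementation is reported as PyPy and the import fails, the problem is the interpreter rather than the libxml2 install.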