HTML spidering to ePub conversion tool


It appears that there is currently no tool available for spidering a site and converting its contents to ePub format. I suppose there are legal implications of doing this to a site without express consent from the site owner.

The reason I ask is that I would like to convert the Doctrine 2 reference guide into ePub format for my Kindle.


Solution

  • It's often better to find out what the documentation sources are and use those. In the case of Doctrine 2, they are RST (reStructuredText) text files. The docutils tools (written in Python) convert these to various output formats, such as the HTML of the website you see. The Sphinx documentation builder builds on docutils, and that appears to be what the Doctrine project uses. It so happens that Sphinx has an ePub builder.

    Since it's an open source project, the sources are readily available. You can get the latest from their git repository: git clone git://github.com/doctrine/doctrine2.git doctrine2-orm

    This is the easiest, most direct route to getting the documentation on your Kindle.
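
The route described above (clone the sources, then run Sphinx's ePub builder) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and the location of the Sphinx sources inside the checkout ("docs/en" here) and the output directory name are assumptions, so check where conf.py actually lives before running it. It also assumes git and Sphinx (the sphinx-build command) are installed.

```python
import subprocess
from pathlib import Path


def epub_build_commands(repo_url, checkout_dir, docs_subdir, out_dir):
    """Return the clone command and the Sphinx ePub build command as argument lists.

    docs_subdir is the directory inside the checkout that holds conf.py;
    its value is an assumption the caller must verify.
    """
    source_dir = Path(checkout_dir) / docs_subdir
    clone = ["git", "clone", repo_url, str(checkout_dir)]
    # "-b epub" selects Sphinx's ePub builder.
    build = ["sphinx-build", "-b", "epub", str(source_dir), str(out_dir)]
    return [clone, build]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = epub_build_commands(
        "git://github.com/doctrine/doctrine2.git",  # URL from the answer above
        "doctrine2-orm",
        "docs/en",          # assumed location of the RST sources and conf.py
        "doctrine2-epub",   # hypothetical output directory
    )
    for cmd in commands:
        print(" ".join(cmd))
    # run(commands)  # uncomment to actually clone and build
```

As written, the script only prints the two commands so they can be reviewed first; uncommenting the run(commands) line executes them. The builder writes the .epub file into the output directory.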