python matplotlib gnucash piecash

Python and GnuCash: Extract data from GnuCash files


I'm looking for information on how to read GnuCash files using Python. I have read about python-gnucash, which provides Python bindings to the GnuCash library, but getting it working takes a lot of effort at the moment (dependencies, headers, etc.). The instructions are tailored to Linux and to a rather old GnuCash version (2.0.x). I am running GnuCash 2.2.9, and although I can find my way around the Linux command line, I run GnuCash on Windows XP.

My main objective is to read (no plans to write yet) my GnuCash files so that I can create my own dynamic visual reports using matplotlib and wxPython. I'm not yet in the mood to learn Scheme.

I hope someone can point me to a good starting point. From what I know of GnuCash and Python, I expect a solution would take one of these forms:

  1. Documentation more recent than the page on the GnuCash wiki.
  2. A workaround, such as exporting to a file format for which a more mature Python library exists.

You guys might have better suggestions in addition to those mentioned.


Solution

  • Are you talking about the data files? From their wiki, it looks like they are just compressed XML files. With Python, you can decompress them with the gzip module and then parse them with any of the available XML parsers.

    ElementTree Example

    >>> import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
    >>> xmlStr = '''<?xml version="1.0" encoding="UTF-8" ?>
    <painting>
    <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
    <caption>This is Raphael's "Foligno" Madonna, painted in
        <date>1511</date>-<date>1512</date>.
    </caption>
    </painting>
    '''
    >>> root = ET.fromstring(xmlStr)  # use parse or iterparse to read directly from a file path
    >>> list(root)  # getchildren() was removed in Python 3.9; iterate the element instead
    [<Element 'img' at 0x115efc0>, <Element 'caption' at 0x1173090>]
    >>> root[1].text
    'This is Raphael\'s "Foligno" Madonna, painted in\n    '
    >>> root[0].get('src')
    'madonna.jpg'