python pdf unicode reportlab geraldo

Unicode characters are boxes in Geraldo/ReportLab generated PDF


I'm running into Unicode-related issues when generating PDF reports with Geraldo and ReportLab.

When Unicode strings containing Asian characters are passed into the report, they appear in the output PDF as black boxes. This example (http://dl.dropbox.com/u/2627296/report.pdf) was generated using the following code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from geraldo import Report, ReportBand, ObjectValue
from geraldo.generators import PDFGenerator

class UnicodeReport(Report):    
    title = 'Report'

    class band_detail(ReportBand):
        elements = [ObjectValue(attribute_name='name')]

if __name__ == '__main__':
    objects = [{'name': u'한국어/조선말'}, {'name': u'汉语/漢語'}, {'name': u'オナカップ'}]    
    rpt = UnicodeReport(queryset=objects)
    rpt.generate_by(PDFGenerator, filename='/tmp/report.pdf')

I'm using Python 2.7.1, Geraldo 0.4.14 and ReportLab 2.5. The system is Ubuntu 11.04 64-bit. The .py file is also UTF-8 encoded. The black boxes are visible when the PDF is viewed in Document Viewer 2.32.0, Okular 0.12.2 and Adobe Reader 9.

Any help is greatly appreciated, thanks.


Solution

  • You need to specify a font that contains the required glyphs, as in the official "Additional Fonts" example. Register the font with additional_fonts and select it with default_style:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    from geraldo import Report, ReportBand, ObjectValue
    from geraldo.generators import PDFGenerator
    
    class UnicodeReport(Report):    
        title = 'Report'
        additional_fonts = {
            'wqy': '/usr/share/fonts/wqy-zenhei/wqy-zenhei.ttc'
        }
        default_style = {'fontName': 'wqy'}
    
        class band_detail(ReportBand):
            elements = [ObjectValue(attribute_name='name')]
    
    if __name__ == '__main__':
        objects = [{'name': u'한국어/조선말'}, {'name': u'汉语/漢語'}, {'name': u'オナカップ'}]    
        rpt = UnicodeReport(queryset=objects)
        rpt.generate_by(PDFGenerator, filename='/tmp/report.pdf')
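
The font path in the example above depends on the distribution, so it may help to look the file up before building the report. The following is a minimal sketch using only the standard library; find_font and the second candidate path are assumptions for illustration, not part of Geraldo or ReportLab.

```python
# -*- coding: utf-8 -*-
import os

# Candidate locations of the WenQuanYi Zen Hei font file.
# These paths are assumptions; adjust them for your system.
CANDIDATE_FONT_PATHS = [
    '/usr/share/fonts/wqy-zenhei/wqy-zenhei.ttc',    # path used in the answer
    '/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc',  # assumed Debian/Ubuntu location
]


def find_font(candidates=CANDIDATE_FONT_PATHS):
    """Return the first candidate path that is an existing file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

If find_font() returns a path, it could be used as the value in additional_fonts, for example additional_fonts = {'wqy': find_font()}. If it returns None, the font is probably not installed and the report would still show boxes.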
    

    ObjectValue() also accepts a style keyword argument, so the font can be set per element:

    elements = [ObjectValue(attribute_name='name', style={'fontName': 'wqy'})]
    

    This font (WenQuanYi Zen Hei) is open source and can be downloaded from http://sourceforge.net/projects/wqy/files/ (I think it ships with Ubuntu 11.04).
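
As background on why a font has to be named at all: the default PDF fonts such as Helvetica only cover a Western character set, so any character outside it is drawn as a box. The sketch below approximates that coverage with Python's cp1252 codec. That approximation is an assumption, and outside_winansi is a hypothetical helper, not a ReportLab or Geraldo function.

```python
# -*- coding: utf-8 -*-


def outside_winansi(text):
    """Return the characters of a Unicode string that cannot be encoded as cp1252.

    cp1252 is used here as a rough stand-in for what the default PDF fonts
    can draw; a non-empty result suggests a TrueType font is needed.
    """
    missing = []
    for ch in text:
        try:
            ch.encode('cp1252')
        except UnicodeEncodeError:
            missing.append(ch)
    return missing
```

Running this over the name values from the question would flag every CJK character, while a plain ASCII title such as u'Report' yields an empty list.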