Tags: python, pdf, utf-8, txt, pyfpdf

Python 3 fpdf raises UnicodeEncodeError: 'latin-1' codec can't encode character


When I run the code below I get the following traceback:

Traceback (most recent call last):
  File "C:\demo\test.py", line 11, in <module>
    pdf.output("splintered.pdf")
  File "C:\demo\lib\site-packages\fpdf\fpdf.py", line 1065, in output
    self.close()
  File "C:\demo\lib\site-packages\fpdf\fpdf.py", line 246, in close
    self._enddoc()
  File "C:\demo\lib\site-packages\fpdf\fpdf.py", line 1636, in _enddoc
    self._putpages()
  File "C:\demo\lib\site-packages\fpdf\fpdf.py", line 1170, in _putpages
    p = self.pages[n].encode("latin1") if PY3K else self.pages[n]
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 74: ordinal not in range(256)

How do I fix this? Is it because I chose Arial as my font? All I am trying to do is convert a .txt file to a PDF, so if there is an easier way to do this in Python, I would be grateful to hear it.

import fpdf
pdf = fpdf.FPDF(format='letter')

txt = 'bees and butterflies. I’m not picky. Once they get chatty, they’re fair'

pdf.add_page()
pdf.set_font('Arial', '', 12)
pdf.multi_cell(0, 5, txt,0,'R')
pdf.ln()
pdf.cell(0, 5, 'End')
pdf.output("splintered.pdf")

Solution

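  • You need to add a Unicode font that supports the code points of your text to the PDF. The code point U+2019 is RIGHT SINGLE QUOTATION MARK (’) and is not supported by the Latin-1 encoding, which is the encoding fpdf applies when one of its built-in fonts (such as the default Arial) is in use. For example: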

    import fpdf
    
    pdf = fpdf.FPDF(format='letter')
    
    txt = 'bees and butterflies. I’m not picky. Once they get chatty, they’re fair'
    
    pdf.add_page()
    pdf.add_font('Arial', '', 'c:/windows/fonts/arial.ttf', uni=True)  # added line
    pdf.set_font('Arial', '', 12)
    pdf.multi_cell(0, 5, txt,0,'R')
    pdf.ln()
    pdf.cell(0, 5, 'End')
    pdf.output("splintered.pdf")
    

    Output: [image: correct PDF output]

    See https://pyfpdf.readthedocs.io/en/latest/Unicode/index.html for more language examples.
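If embedding a TrueType font is not an option, one possible fallback is to make the text Latin-1-safe before handing it to fpdf. The sketch below is not part of the answer above: `to_latin1_safe` and `non_latin1_chars` are hypothetical helper names, the replacement table is an assumption that only covers common typographic punctuation, and anything else outside Latin-1 is turned into `?`.

```python
# Hypothetical helpers (standard library only) for preparing text
# for a Latin-1-only font. The mapping below is an illustrative
# assumption, not an exhaustive list.
_REPLACEMENTS = {
    "\u2018": "'",    # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",    # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',    # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',    # RIGHT DOUBLE QUOTATION MARK
    "\u2013": "-",    # EN DASH
    "\u2014": "-",    # EM DASH
    "\u2026": "...",  # HORIZONTAL ELLIPSIS
}
_TABLE = str.maketrans(_REPLACEMENTS)


def non_latin1_chars(text):
    """Return (index, character) pairs that Latin-1 cannot encode."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFF]


def to_latin1_safe(text):
    """Return a copy of text that is guaranteed to encode as Latin-1."""
    text = text.translate(_TABLE)
    # Any character still outside Latin-1 is replaced with '?'.
    return text.encode("latin-1", errors="replace").decode("latin-1")


txt = 'bees and butterflies. I’m not picky. Once they get chatty, they’re fair'
offending = non_latin1_chars(txt)   # the two U+2019 apostrophes
safe_txt = to_latin1_safe(txt)      # curly apostrophes become straight ones
```

Passing `safe_txt` instead of `txt` to `multi_cell` in the original script should avoid the UnicodeEncodeError, at the cost of losing the curly apostrophes. Adding a Unicode font, as shown above, keeps the text unchanged and is the better choice when the original characters matter.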