For loop to convert txt files to pdf - only working on first file


I have a folder of .txt files that I need to convert to PDF. I have successfully done this for one .txt file, but would like to do all 126 (preferably not manually, one at a time). I put my working code into a for loop to iterate through each file in the folder. The code runs without errors. However, this is how it behaves:

  • First file: converts perfectly, no problems.
  • Second file: the output contains the first file again, followed by the second, in one document.
  • Third file: the output contains the first, second and third files in one document.
  • And so on for every later file.

Things I've tried:

  • I have tried putting an f.close() in different places in an attempt to close the previous file before opening the next one, but that has not worked (no errors, it just doesn't affect how the code behaves).
  • Altering the nested for loop (for x in f), and removing it completely, just having the open(file...) line and pdf.cell(200...) line in the main loop.

Here is my code:

pdf = FPDF()

pdf.add_page()
pdf.set_font("Arial", size = 8)

files = os.listdir('Convert') #this is the folder where all the .txt files are

for file in tqdm(files):
    dir_full_path = os.path.abspath('Convert')
    file_full_path = os.path.join(dir_full_path, file)
    output_filename = os.path.basename(file)+'.pdf'
    
    f = open(file_full_path, 'r+')
    
    for x in f: 
        pdf.cell(200,10, txt = x, ln = 1, align = 'L')
    
    pdf.output(output_filename)

Any ideas on how to resolve? TIA


Solution

  • You say "I put my working code into a for loop to iterate through each file in the folder", but one part was left outside the loop: the creation of the FPDF object.

    You need to create a new FPDF object at each iteration of your loop; otherwise you keep appending lines to the same document. That is why, when you call pdf.output(output_filename), the result contains all the lines from every file you have processed so far.

    Here is an example that fixes the issue, and also shows how to correctly close the input file:

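    import os

    from fpdf import FPDF
    from tqdm import tqdm

    files = os.listdir('Convert')  # folder containing the .txt files
    dir_full_path = os.path.abspath('Convert')

    for file in tqdm(files):
        # Create a fresh document for every input file so nothing carries over
        pdf = FPDF()
        pdf.add_page()
        pdf.set_font("Arial", size=8)

        file_full_path = os.path.join(dir_full_path, file)
        # The with block closes the input file automatically
        with open(file_full_path, 'r') as f:
            for line in f:
                pdf.cell(200, 10, txt=line, ln=1, align='L')

        output_filename = os.path.basename(file) + '.pdf'
        pdf.output(output_filename)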
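
The accumulation described above can be reproduced without fpdf installed. The sketch below uses a stand-in class, FakeDoc, which is hypothetical and not part of fpdf; it only assumes that a document object keeps everything added to it until a new object is created. The file names and contents are made up for illustration.

```python
class FakeDoc:
    """Hypothetical stand-in for a document object (not part of fpdf)."""

    def __init__(self):
        self.lines = []

    def cell(self, text):
        # Content added to the document stays in it.
        self.lines.append(text)

    def output(self):
        # Return a copy of everything added so far.
        return list(self.lines)


# Made-up inputs: file name -> lines of text.
inputs = {"a.txt": ["a1", "a2"], "b.txt": ["b1"]}


def convert_shared(inputs):
    """One document created before the loop, as in the question."""
    doc = FakeDoc()
    results = {}
    for name, lines in inputs.items():
        for line in lines:
            doc.cell(line)
        results[name] = doc.output()
    return results


def convert_fresh(inputs):
    """A new document created inside the loop, as in the fix."""
    results = {}
    for name, lines in inputs.items():
        doc = FakeDoc()
        for line in lines:
            doc.cell(line)
        results[name] = doc.output()
    return results


shared = convert_shared(inputs)
fresh = convert_fresh(inputs)
print(shared["b.txt"])  # ['a1', 'a2', 'b1']
print(fresh["b.txt"])   # ['b1']
```

With the shared object, the second output still carries the first file's lines, which matches the behaviour in the question. Creating the object inside the loop gives each output only its own lines.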