Tags: python, pdf, rename, pypdf

Renaming a list of PDF files with a for loop


I am trying to rename a list of PDF files by extracting the name from each file using pyPdf. I tried to use a for loop to rename the files, but I always get an error with code 32 saying that the file is being used by another process. I am using Python 2.7. Here's my code:

import os, glob
from pyPdf import PdfFileWriter, PdfFileReader

# this function extracts the name of the file
def getName(filepath):
    output = PdfFileWriter()
    input = PdfFileReader(file(filepath, "rb"))
    output.addPage(input.getPage(0))
    outputStream = file(filepath + '.txt', 'w')
    output.write(outputStream)
    outputStream.close()

    outText = open(filepath + '.txt', 'rb')
    textString = outText.read()
    outText.close()

    nameStart = textString.find('default">')
    nameEnd = textString.find('_SATB', nameStart)
    nameEnd2 = textString.find('</rdf:li>', nameStart)

    if nameStart:
        testName = textString[nameStart+9:nameEnd]
        if len(testName) <= 100:
            name = testName + '.pdf'
        else:
            name = textString[nameStart+9:nameEnd2] + '.pdf'
    return name


pdfFiles = glob.glob('*.pdf')
m = len(pdfFiles)
for each in pdfFiles:
    newName = getName(each)
    os.rename(each, newName)

Solution

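  • You're not closing the input stream (the file) used by the PDF reader, so the file is still open when you try to rename it. On Windows, renaming a file that still has an open handle fails with error 32 (sharing violation), which is the "being used by another process" message you see.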

    So, instead of this:

    input = PdfFileReader(file(filepath, "rb"))
    

    Try this:

    inputStream = file(filepath, "rb")
    input = PdfFileReader(inputStream)
    # ... when done with this file ...
    inputStream.close()
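
A `with` block is another way to guarantee the close, since it releases the handle even if an exception is raised partway through. The sketch below is only an illustration of that pattern and not the asker's code: it uses the standard library alone (no pyPdf), and `read_marker_text` and `rename_by_marker` are hypothetical helpers that search the raw file bytes for the same `default">` and `</rdf:li>` markers used in the question. It assumes the extracted text is a valid file name.

```python
import os


def read_marker_text(filepath, start_marker=b'default">', end_marker=b'</rdf:li>'):
    """Hypothetical stand-in for getName: return the text between two markers, or None."""
    # The with block closes the file as soon as it exits, even on an exception.
    with open(filepath, "rb") as stream:
        data = stream.read()

    start = data.find(start_marker)
    if start == -1:  # find() returns -1, not a false value, when the marker is absent
        return None
    start += len(start_marker)

    end = data.find(end_marker, start)
    if end == -1:
        return None
    return data[start:end].decode("utf-8", "replace")


def rename_by_marker(filepath):
    """Rename filepath after its embedded title; return the new path, or None if skipped."""
    title = read_marker_text(filepath)
    if not title:
        return None
    new_path = os.path.join(os.path.dirname(filepath), title + ".pdf")
    # No handle on filepath is open at this point, so the rename is not blocked.
    os.rename(filepath, new_path)
    return new_path
```

With helpers like these, the loop from the question would become `for each in glob.glob('*.pdf'): rename_by_marker(each)`. The key point is only the ordering: every handle on the file is closed before `os.rename` runs.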