
Windows: automatically renaming a PDF file using information from the file itself

I am trying to find a way to rename scanned PDFs, which are automatically given names like "397009900", to a certain string found inside the PDF itself. In my case, it is a drawing name that I want to extract from the PDF and use as the file name, e.g. "ISO-4024-4301".

Is there a way to automatically rename a PDF file with information from inside of it?

Thanks very much.


  • This can be done with Python.

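    import PyPDF2

    # Raw string, so the backslash in the Windows path is not read as an escape.
    with open(r'path_to_file\Test doc.pdf', 'rb') as p:
        pdfReader = PyPDF2.PdfReader(p)   # PdfFileReader(p) in PyPDF2 before 3.0
        pageObj = pdfReader.pages[0]      # getPage(0) in PyPDF2 before 3.0
        info = pageObj.extract_text()     # extractText() in PyPDF2 before 3.0

    Page numbers start at 0. To extract text from a different page, change the index in this line:

    pageObj = pdfReader.pages[0]

    The extracted text is stored in the variable info; you can then use any string operation to pick out the text you want to rename the file to. Note that extraction only returns text if the scanned PDF contains a text layer (for example, from OCR); a pure image scan yields nothing.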

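    import os

    With the os module you can then rename the file, for example with os.rename(). Do this after the with block has closed the file, because Windows generally refuses to rename a file that is still open.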
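
The last two steps of the answer, choosing the required text and renaming the file, could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the drawing name is assumed to always look like the question's example "ISO-4024-4301" (the letters ISO followed by two four-digit groups), and the names DRAWING_NAME, find_drawing_name and rename_pdf are hypothetical helpers, not part of PyPDF2 or the os module. The pattern would need adjusting to match the real drawings.

```python
import os
import re

# Assumed pattern, based only on the example "ISO-4024-4301":
# "ISO", a hyphen, four digits, a hyphen, four digits.
DRAWING_NAME = re.compile(r"ISO-\d{4}-\d{4}")


def find_drawing_name(info):
    """Return the first drawing name found in the extracted text, or None."""
    match = DRAWING_NAME.search(info)
    return match.group(0) if match else None


def rename_pdf(old_path, new_name):
    """Rename old_path to new_name + '.pdf' in the same folder.

    Returns the new path. Raises FileExistsError instead of overwriting
    another file that already carries the same drawing name.
    """
    folder = os.path.dirname(old_path)
    new_path = os.path.join(folder, new_name + ".pdf")
    if os.path.exists(new_path):
        raise FileExistsError(new_path)
    os.rename(old_path, new_path)
    return new_path
```

Here, find_drawing_name(info) would be called on the text extracted from the page, and rename_pdf would only be called when it returns a name rather than None, so that files without a recognizable drawing name keep their original names.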