I am trying to rename files using information obtained through PdfReader, from the PyPDF2 library. Sometimes the information (in this case the title obtained with reader.metadata.title) contains slashes ("/"), which disrupt the renaming process because they are treated as directory levels in the destination path I pass to os.rename(). I have tried to replace the slashes with "-" by calling the replace() method on the strings obtained, but for some reason this doesn't work, and I get a FileNotFoundError when I try to rename. I have double-checked that the type of the variable containing reader.metadata.title is str, so in theory the replace() method should apply successfully. Is the "TOC/TOC" shown in my output example below some sort of encoding that needs to be dealt with differently? Thanks.
My code:
import os

from PyPDF2 import PdfReader

for pdf_file in os.listdir(downloads_path):
    if pdf_file.endswith(".pdf"):
        current_file_path = os.path.join(downloads_path, pdf_file)
        reader = PdfReader(open(current_file_path, "rb"))
        new_name_pdf_file = reader.metadata.title
        new_name_pdf_file.replace("/", "-")
        # output example: 'Outside Back Cover - Graphical abstract TOC/TOC in double column/Cover image legend if applicable, Bar code, Abstracting and Indexing information'
        print(new_name_pdf_file)
        new_pdf_destination = os.path.join(destination_path, new_name_pdf_file)
        os.rename(current_file_path, new_pdf_destination)
Output error example:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/me/Documents/temporary_downloads_folder/Outside-Back-Cover---Graphical-abstract-TOC-TOC-in-double-column-C_2022_Nano.pdf' -> '/Users/me/Documents/destination_folder/Outside Back Cover - Graphical abstract TOC/TOC in double column/Cover image legend if applicable, Bar code, Abstracting and Indexing information.pdf'
The line
new_name_pdf_file.replace("/", "-")
doesn't do what you think it does. It does not change the string that new_name_pdf_file points to. In fact, it can't: strings are immutable in Python and cannot be changed in place. Instead, replace() returns a new string with the replacement done, and here that return value is simply discarded.
Change the line to
new_name_pdf_file = new_name_pdf_file.replace("/", "-")
and it should work.
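To see the difference in isolation, here is a minimal sketch that runs without any PDF files. The helper name safe_filename and the sample title are made up for illustration; they are not part of PyPDF2 or of the code in the question.

```python
def safe_filename(title: str, replacement: str = "-") -> str:
    """Hypothetical helper: return a copy of title with "/" replaced."""
    # str.replace never modifies title; it returns a new string.
    return title.replace("/", replacement)


title = "Graphical abstract TOC/TOC in double column"

# Calling replace() without using its result has no visible effect.
title.replace("/", "-")
unchanged = title

# Keeping the returned string is what makes the replacement stick.
fixed = safe_filename(title)

print(unchanged)
print(fixed)
```

The first print still shows the "/" because the result of the bare replace() call was thrown away; the second shows the title with "-" in its place, which is safe to pass to os.path.join() as a single file name.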