
Address path issues while opening a PDF file in Python


I am trying to open a PDF file using Python in a Jupyter notebook. The file is on my desktop, and the path is something like C:\Users\laxmi prasad\Desktop\, but it shows an error.

   import PyPDF2
   red_ball = open('C:\Users\laxmi prasad\Desktop\Neeraj Kasturi_mystery','rb')

and the error it shows is

    File "<ipython-input-5-565b4f1ccaec>", line 1
    red_ball = open('C:\Users\laxmiprasad\Desktop\Neeraj Kasturi_mystery','rb')
                 ^
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in 
    position 2-3: truncated \UXXXXXXXX escape

I think it's the space between the two words inside the path, but that's the folder name. Can anyone help me understand this issue?


Solution

  • The problem with your string is not that it contains spaces. The problem is the \U in it.

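    The backslash is the escape character in a string literal. It introduces special characters that cannot be entered directly, such as line breaks. For example, \U begins a Unicode character escape and must be followed by eight hexadecimal digits, as in \U00001234. In your path, \Users does not supply those digits, hence the "truncated \UXXXXXXXX escape" error.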
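As a small illustration of the escape behaviour described above, the sketch below compares a few literals and uses the built-in compile() to show that a literal like the one in the question is rejected when it is parsed, before any file is opened. The sample strings and variable names are made up for the demonstration.

```python
# Escape sequences are resolved when the string literal is parsed.
tab = 'a\tb'        # \t becomes a single tab character
raw_tab = r'a\tb'   # raw string: the backslash and the 't' stay separate

print(len(tab))      # 3
print(len(raw_tab))  # 4

# \U followed by exactly eight hex digits is one Unicode character.
print(len('\U00001234'))  # 1

# Source text containing a single backslash before "Users", as in the question.
# It is written here with a doubled backslash so that this demo itself parses.
bad_source = "p = 'C:\\Users'"
try:
    compile(bad_source, '<demo>', 'exec')
    parsed = True
except SyntaxError as exc:
    parsed = False
    print('SyntaxError:', exc.msg)
```

The SyntaxError is raised by compile() itself, which is why the notebook cell in the question fails without ever touching the file system.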

    To use a literal backslash in a string, write two backslashes, like so:

    red_ball = open('C:\\Users\\laxmi prasad\\Desktop\\Neeraj Kasturi_mystery','rb')
    

    Or you can use a so-called raw string, by adding an r before the string literal, like so:

    red_ball = open(r'C:\Users\laxmi prasad\Desktop\Neeraj Kasturi_mystery','rb')
    

    You can also use forward slashes instead of backslashes, which Python accepts on Windows, but then you cannot simply copy and paste paths between Explorer and Python code.
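To check that the three spellings above refer to the same location, here is a minimal sketch using pathlib.PureWindowsPath from the standard library. It only compares path strings and never touches the disk, so it should run on any operating system. The variable names are chosen just for this example.

```python
from pathlib import PureWindowsPath

# The same path written three ways.
doubled = 'C:\\Users\\laxmi prasad\\Desktop\\Neeraj Kasturi_mystery'
raw = r'C:\Users\laxmi prasad\Desktop\Neeraj Kasturi_mystery'
forward = 'C:/Users/laxmi prasad/Desktop/Neeraj Kasturi_mystery'

# Doubled backslashes and a raw string produce identical strings.
print(doubled == raw)  # True

# PureWindowsPath treats / and \ as the same separator.
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)
print(same_path)  # True

# The space in the folder name is kept as-is and causes no trouble.
print(PureWindowsPath(forward).parts)
```

Any of the three strings can then be passed to open(..., 'rb') as in the question.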

    Backslash escaping in string literals applies to many programming languages, not only Python.