Troubleshooting a line continuation error for a long file path in read_csv


I am trying to break up a long file path so that I can read it without having to scroll horizontally.

edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' /
                   r'/e570c38bcc72a8d102422f2af836513b/raw' /
                   r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' / 
                   r'/edgelist_sleeping_giant.csv')

However, I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-4-a0ff45f0f7db> in <module>
      2 edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' /
      3                        r'/e570c38bcc72a8d102422f2af836513b/raw' /
----> 4                        r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' /
      5                        r'/edgelist_sleeping_giant.csv')
      6 edgelist.head(10)

I've looked at some other Stack Overflow posts, but I don't understand them. I've tried various combinations of removing the forward slashes and repositioning the quotes, but I think I'm just grasping at straws. I would love a technical explanation of why I'm getting this error.

BTW, writing the load statement on one line with no ending [isolated] forward slashes (on lines 2, 3, and 4) works, but I can't see the entire statement without sliding the screen view. I'm looking for something readable in one view.


Solution

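  • Line continuations in Python are signaled with backslashes; you have been using forward slashes. Outside a string literal, a forward slash is the division operator, so Python tried to divide one string by the next, which is what raises the TypeError.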

    This should work as intended:

    edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' \
                           r'/e570c38bcc72a8d102422f2af836513b/raw' \
                           r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' \
                           r'/edgelist_sleeping_giant.csv')
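
The failure and the fix can be reproduced without pandas. The sketch below uses a short placeholder URL (`example.invalid`, not the real gist) and made-up variable names; it only illustrates how Python treats the two kinds of slash between string literals.

```python
# A forward slash between two string literals is the division operator.
try:
    'https://example.invalid/a' / '/b.csv'
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# A trailing backslash joins the physical lines, and adjacent string
# literals are then concatenated into a single string.
joined = 'https://example.invalid/a' \
         '/b.csv'

print(error_name)  # TypeError
print(joined)      # https://example.invalid/a/b.csv
```

Because the call to read_csv is already wrapped in parentheses, Python would also join these lines without the backslashes; the backslashes are harmless there, just not required.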
    

    As there are no backslashes in the URL itself, you don't need to use raw string literals, and can just use standard string literals:

    edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew' \
                           '/e570c38bcc72a8d102422f2af836513b/raw' \
                           '/89c76b2563dbc0e88384719a35cba0dfc04cd522' \
                           '/edgelist_sleeping_giant.csv')
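
To see why the r prefix is optional here, compare a raw and a standard literal directly. The fragments below are hypothetical stand-ins for pieces of the URL, chosen only for illustration.

```python
# Without backslashes, the r prefix changes nothing.
plain = '/e570/raw'
raw = r'/e570/raw'
same_without_backslash = (plain == raw)

# With a backslash the two forms differ: '\n' is a single newline
# character, while r'\n' is a backslash followed by the letter n.
differs_with_backslash = (len('\n') == 1 and len(r'\n') == 2)

print(same_without_backslash)   # True
print(differs_with_backslash)   # True
```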
    

    You can even drop the intermediate quotes and write a single string literal, but then the leading spaces on the continuation lines need to go as well, since they would become part of the resulting string (which would no longer be a correct URL):

    edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew\
    /e570c38bcc72a8d102422f2af836513b/raw\
    /89c76b2563dbc0e88384719a35cba0dfc04cd522\
    /edgelist_sleeping_giant.csv')
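
A small sketch of that pitfall, again with a placeholder URL rather than the real one: when the backslash sits inside the quotes, any indentation on the next line ends up in the string, whereas separate adjacent literals keep the indentation outside the quotes.

```python
# Continuation inside a single literal: the next line's indentation is kept.
indented = 'https://example.invalid/a\
    /b.csv'

# Separate adjacent literals in parentheses: indentation is outside the quotes.
clean = ('https://example.invalid/a'
         '/b.csv')

print(repr(indented))  # 'https://example.invalid/a    /b.csv'
print(repr(clean))     # 'https://example.invalid/a/b.csv'
```

This is why the single-literal form only works when the continuation lines start at the left margin, which costs some readability; the adjacent-literal form lets you indent freely.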