Tags: python, data-science, python-dataset

Loading data from Windows path


I downloaded the Lock5Data package from https://cran.r-project.org/web/packages/Lock5Data/index.html (the Windows r-release build, Lock5Data_3.0.0.zip) and tried to load one of its datasets in a Python script (with ChatGPT's help), as shown below:

import pyreadr

StudentSurvey = pyreadr.read_r('Lock5Data\data\StudentSurvey.rdata')['StudentSurvey']

print(StudentSurvey.head())

and I am getting this error:

 SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

How can I solve this issue?

I have also tried specifying the path to the datasets on my local computer, but I still get the same error.


Solution

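  • In a normal Python string literal, a backslash starts an escape sequence, so the backslashes in Windows paths need special handling. There are three ways to deal with that:

    1. Use double backslashes:
    'Lock5Data\\data\\StudentSurvey.rdata'
    
    2. Use a raw string by writing an r in front of the string:
    r'Lock5Data\data\StudentSurvey.rdata'
    
    3. Use forward slashes instead of backslashes:
    'Lock5Data/data/StudentSurvey.rdata'
    

    Otherwise Python tries to interpret each single backslash together with the character that follows it. For example, \U begins a \UXXXXXXXX Unicode escape, which is what the "truncated \UXXXXXXXX escape" error complains about; a path that starts like C:\Users triggers it.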
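To see how a single backslash gets misread, the error can be reproduced without touching any files. This is only a sketch: the path `C:\Users\me\StudentSurvey.rdata` is a hypothetical stand-in, not the asker's real path, and `compile` is used here just to parse a line of source text the way Python would parse it in a script.

```python
# Hypothetical source line as it would appear in a .py file. Inside this
# outer string the backslashes are doubled, so the text being compiled
# contains single backslashes: path = 'C:\Users\me\StudentSurvey.rdata'
source = "path = 'C:\\Users\\me\\StudentSurvey.rdata'"

try:
    compile(source, "<example>", "exec")
    message = "compiled without error"
except SyntaxError as exc:
    # "\U" is parsed as the start of a \UXXXXXXXX escape and fails.
    message = str(exc)

print(message)

# The same line with a raw-string prefix (r'...') parses without error.
fixed_source = "path = r'C:\\Users\\me\\StudentSurvey.rdata'"
compile(fixed_source, "<example>", "exec")
```

The failure happens while the script is being parsed, before pyreadr is ever called, which is why the file's contents play no role in this error.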

    I downloaded the data, and pyreadr reads it correctly on my machine. The only difference is that for me the file is called StudentSurvey.rda rather than StudentSurvey.rdata.
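As a quick check that the three spellings from the list name the same file, they can be compared with the standard library's `pathlib`. This is a minimal sketch that only compares path strings and does not open anything; `PureWindowsPath` is used so that it behaves the same on any operating system, and the variable names are illustrative.

```python
from pathlib import PureWindowsPath

# The three spellings from the answer.
escaped = 'Lock5Data\\data\\StudentSurvey.rdata'   # doubled backslashes
raw = r'Lock5Data\data\StudentSurvey.rdata'        # raw string
forward = 'Lock5Data/data/StudentSurvey.rdata'     # forward slashes

# The first two are the identical string once Python has parsed them.
same_string = escaped == raw

# Interpreted as Windows paths, all three refer to the same location.
same_path = (
    PureWindowsPath(escaped) == PureWindowsPath(raw) == PureWindowsPath(forward)
)

# Building the path from its parts avoids typing separators at all.
built = PureWindowsPath('Lock5Data', 'data', 'StudentSurvey.rdata')

print(same_string, same_path, built)
```

Any of these strings, or `str(built)`, could then be passed to `pyreadr.read_r` in place of the original path.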