Tags: python, r, pandas, dataframe, feather

Can I convert a .rda file to a pandas dataframe in python without using R?


I'm practicing Python, specifically numpy and pandas. I have some data (not mine) in .rda format that I want to import into Python as a pandas dataframe. However, I don't use R, so I'm wondering whether I can do this without converting the file first. From what I've seen on the site, feather is often recommended, so I tried the following:

import feather
path = 'pathtomydata.rda'
df = feather.read_dataframe(path)

But this raises an "ArrowInvalid: Not a feather file" error, which suggests I would first have to convert the .rda file into a feather file. I'd rather not do that, as I imagine I'd have to install R. Many thanks in advance.


Solution

  • You can use pyreadr, which does not require R to be installed:

    import pyreadr
    
    result = pyreadr.read_r('pathtomydata.rda')
    
    # result is a dict-like object that maps R object names to pandas data frames
    print(result.keys())  # list the names of the objects stored in the file
    df1 = result["df1"]  # extract the data frame for the object named "df1" (use a name printed above)
    

    More information here:

    https://github.com/ofajardo/pyreadr
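
Since you may not know in advance what the objects inside the file are called, here is a minimal sketch of how the lookup could be wrapped. The helper names `load_r_objects` and `pick_frame` are hypothetical (they are not part of pyreadr), and the sketch assumes pyreadr has been installed, for example with `pip install pyreadr`.

```python
from collections.abc import Mapping


def load_r_objects(path):
    """Read an R data file and return a plain dict of object name -> data frame.

    Assumption: pyreadr is installed. It is imported inside the function so
    that pick_frame below can be used and tested without it.
    """
    import pyreadr  # third-party package; no R installation needed

    return dict(pyreadr.read_r(path))


def pick_frame(objects, name=None):
    """Return (name, frame) for the requested object.

    If no name is given and the file holds exactly one object, that object
    is returned; if it holds several, the available names are reported.
    """
    if not isinstance(objects, Mapping) or not objects:
        raise ValueError("no objects were read from the file")
    if name is None:
        if len(objects) > 1:
            raise KeyError(f"several objects found, choose one of: {list(objects)}")
        name = next(iter(objects))
    if name not in objects:
        raise KeyError(f"{name!r} not found; available: {list(objects)}")
    return name, objects[name]
```

With the path from the question, `name, df = pick_frame(load_r_objects('pathtomydata.rda'))` would give you the data frame directly when the file holds a single object, and an error listing the available names otherwise.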