Search code examples
pythonpython-3.xpandasdataframeread-write

select specific rows in text file to read into two separate columns in a pandas dataframe


I have a text file like the following

2022-mary
['apple', 'oranges', 'pineapple']
2022-cindy
['mango', 'banana', 'berry']
2022-andrew
['coconut', 'cabbage']

I would like to produce a pandas dataframe with 2 columns from the text file above that looks like this

    user            fruits
0   2022-mary      ['apple', 'oranges', 'pineapple']
1   2022-cindy     ['mango', 'banana', 'berry']
2   2022-andrew    ['coconut', 'cabbage']

I wonder if there is an efficient way to accomplish this with pd.read_csv() functions.


Solution

  • Create one column DataFrame and then selecting even and odd values:

    df = pd.read_csv(file, sep="|", names=['data'])
    
    df = pd.DataFrame({'user': df.data.iloc[::2].to_numpy(),
                       'fruits': df.data.iloc[1::2].to_numpy()})
    

    Or create list:

    with open(file) as f:
        lines = f.read().splitlines()
    
    df = pd.DataFrame({'user': lines[::2], 'fruits': lines[1::2]})