python pandas transpose

Create a dataframe from a csv containing only rows


I have a CSV file with one value per line, like this:

feature1
feature2
feature3
f1_v1
f2_v1
f3_v1
f1_v2
f2_v2
f3_v2
...

I want to get a dataframe like this:

    feature1 feature2 feature3
0   f1_v1    f2_v1    f3_v1
1   f1_v2    f2_v2    f3_v2
...

How can I do that?


Solution

  • Read the file into a list of lists and convert it to a DataFrame afterward. I'm not sure why so many people are afraid of preprocessing their data in plain Python so that it fits into pandas, but it should be a common strategy.

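    import pandas as pd
    
    N_COLS = 3  # number of features, i.e. header lines at the top of the file
    headers = []
    rows = []
    
    with open('x.csv') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if len(headers) < N_COLS:
                headers.append(line)  # the first N_COLS lines are column names
                continue
            if not rows or len(rows[-1]) == N_COLS:
                rows.append([])  # start a new row once the previous one is full
            rows[-1].append(line)
    
    df = pd.DataFrame(rows, columns=headers)
    print(df)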
    

    Output:

      feature1 feature2 feature3
    0    f1_v1    f2_v1    f3_v1
    1    f1_v2    f2_v2    f3_v2
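
The same idea can also be written with list slicing instead of a running "current row". This is only a sketch under a few assumptions: the number of columns is known up front, blank lines can be ignored, and the values after the header divide evenly into rows. The helper name `single_column_to_frame` and the in-memory sample (standing in for `open('x.csv')`) are made up for illustration.

```python
import io

import pandas as pd


def single_column_to_frame(lines, n_cols):
    """Build a DataFrame from an iterable of lines holding one value each.

    The first n_cols non-empty lines are treated as column names and the
    remaining lines are grouped into rows of n_cols values.
    """
    values = [line.strip() for line in lines if line.strip()]
    headers, data = values[:n_cols], values[n_cols:]
    if len(data) % n_cols:
        raise ValueError("number of values is not a multiple of n_cols")
    # Slice the flat list into consecutive chunks of n_cols values.
    rows = [data[i:i + n_cols] for i in range(0, len(data), n_cols)]
    return pd.DataFrame(rows, columns=headers)


# Stand-in for open('x.csv'); the contents mirror the sample in the question.
sample = io.StringIO(
    "feature1\nfeature2\nfeature3\n"
    "f1_v1\nf2_v1\nf3_v1\n"
    "f1_v2\nf2_v2\nf3_v2\n"
)
df = single_column_to_frame(sample, n_cols=3)
print(df)
```

Raising on a leftover partial row is a design choice: it surfaces a truncated file early instead of letting pandas pad the last row with missing values.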