I'm not really sure how to explain this one but I have a spreadsheet that i pulled into python with no headers, and the data pertaining to each unique id is grouped into rows of three so the columns at this point don't really mean anything when read until it's 'flattened' out. The formatting is consistent but I'm having a hard time figuring out what I need to search in order to get the output I'm looking for.
If this was in Excel I would basically be copying rows two and three to the right of row 1, then delete rows 2 and 3 and repeat until i get to the bottom of the sheet.
This is how i would receive the data
id1 criteria1 criteria2 criteria3
criteria4 criteria5 criteria6 criteria7
criteria8 criteria9 criteria10 criteria11
id2 criteria1 criteria2 criteria3
criteria4 criteria5 criteria6 criteria7
criteria8 criteria9 criteria10 criteria11
id3 criteria1 criteria2 criteria3
criteria4 criteria5 criteria6 criteria7
criteria8 criteria9 criteria10 criteria11
this is what i'm trying to achieve
id1 criteria1 criteria2 criteria3 criteria4 criteria5 criteria6 criteria7 criteria8 criteria9 criteria10 criteria11
id2 criteria1 criteria2 criteria3 criteria4 criteria5 criteria6 criteria7 criteria8 criteria9 criteria10 criteria11
id3 criteria1 criteria2 criteria3 criteria4 criteria5 criteria6 criteria7 criteria8 criteria9 criteria10 criteria11
See if this gets you going on the right track. Caveat: It's difficult to provide a full solution given the ambiguity of the source data (as mentioned in the comments) ... but give this a go.
Using your sample input, I converted it to a simple CSV file.
id1,criteria1,criteria2,criteria3
criteria4,criteria5,criteria6,criteria7
criteria8,criteria9,criteria10,criteria11
id2,criteria1,criteria2,criteria3
criteria4,criteria5,criteria6,criteria7
criteria8,criteria9,criteria10,criteria11
id3,criteria1,criteria2,criteria3
criteria4,criteria5,criteria6,criteria7
criteria8,criteria9,criteria10,criteria11
Note: The reshape values of [3, 12]
(or [rows, columns]
) will need to be modified based on the shape of your input data.
import pandas as pd
array = pd.read_csv('folded.csv', header=None).to_numpy().reshape([3, 12])
pd.DataFrame(array).to_csv('unfolded.csv', index=False, header=False)
id1,criteria1,criteria2,criteria3,criteria4,criteria5,criteria6,criteria7,criteria8,criteria9,criteria10,criteria11
id2,criteria1,criteria2,criteria3,criteria4,criteria5,criteria6,criteria7,criteria8,criteria9,criteria10,criteria11
id3,criteria1,criteria2,criteria3,criteria4,criteria5,criteria6,criteria7,criteria8,criteria9,criteria10,criteria11