I'm trying to follow the answer for this StackOverflow question: Pandas: combining header rows of a multiIndex DataFrame because I have the same need.
I've put this data into foobar.txt:
first,bar,baz,foo
second,one,two,three
A,1,2,3
B,8,9,10
I want to create a dataframe that looks like this:
first-second bar-one baz-two foo-three
A 1 2 3
B 8 9 10
I'm following the first answer to the linked question, which uses list comprehension, so my entire code looks like this:
import pandas as pd
df = pd.read_csv(r'C:\Temp\foobar.txt')
df.columns = [f'{i}{j}' for i, j in df.columns]
However, I get a "too many values to unpack error":
Exception has occurred: ValueError
too many values to unpack (expected 2)
File ".\test.py", line 32, in <listcomp>
df.columns = [f'{i}{j}' for i, j in df.columns]
File ".\test.py", line 32, in <module>
df.columns = [f'{i}{j}' for i, j in df.columns]
I've looked at other examples where folks hit the same error and I'm certain it's due to the fact that I have more than 2 values from df.columns, but I'm not sure how to fix that, nor do I understand why the answer I linked to above doesn't hit this problem.
You have to read the CSV by specifying the header rows to get MultiIndex
df = pd.read_csv(r'C:\Temp\foobar.txt', header=[0,1])
df.columns
MultiIndex([('first', 'second'),
( 'bar', 'one'),
( 'baz', 'two'),
( 'foo', 'three')],
)
df.columns = [f'{i}{j}' for i, j in df.columns]
df.columns
Index(['firstsecond', 'barone', 'baztwo', 'foothree'], dtype='object')