I am initialising a dataframe with lists, having followed the advice here. I then need to transpose the dataframe.
In the first example I take the column names from the lists used to initialise the dataframe.
In the second example I add the column names last.
-> Is there any difference between these examples?
-> Is there a standard or better way of naming columns of dataframes initialised like this?
import pandas as pd

# First example: the column names are inserted into the lists themselves
p_id = ['a_1','a_2']
p = ['a','b']
p_id.insert(0,'p_id')
p.insert(0,'p')
df = pd.DataFrame([p_id, p])
df = df.transpose()
df.columns = df.iloc[0]
df = df[1:]
df
>>>
p_id p
0 a_1 a
1 a_2 b
# Second example: the column names are assigned last
p_id = ['a_1','a_2']
p = ['a','b']
df = pd.DataFrame([p_id, p])
df = df.transpose()
df.columns = ['p_id', 'p']
df
>>>
p_id p
0 a_1 a
1 a_2 b
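Yes, there is a difference in the indices. With df taken from the first example and df1 from the second, df = df[1:] drops the header row but keeps the original row labels, so the index of df starts at 1: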
print(df.equals(df1))
False
print (df.index)
RangeIndex(start=1, stop=3, step=1)
print (df1.index)
RangeIndex(start=0, stop=2, step=1)
print (df.index == df1.index)
[False False]
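To reproduce this comparison in one place, here is a minimal sketch. It assumes df is built as in the first example and df1 as in the second; the lists are written out with their header values already in place rather than using insert, which should give the same frames.

```python
import pandas as pd

# First example: the header values travel inside the lists.
p_id = ['p_id', 'a_1', 'a_2']
p = ['p', 'a', 'b']
df = pd.DataFrame([p_id, p]).transpose()
df.columns = df.iloc[0]   # promote the first row to column labels
df = df[1:]               # drop that row; the remaining labels are 1 and 2

# Second example: column names are assigned after transposing.
df1 = pd.DataFrame([['a_1', 'a_2'], ['a', 'b']]).transpose()
df1.columns = ['p_id', 'p']

print(list(df.index))    # [1, 2]
print(list(df1.index))   # [0, 1]
```

Slicing with df[1:] selects rows by position but does not renumber them, which is why the two indices differ.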
The solution is to create a default index in df with DataFrame.reset_index and the drop=True parameter:
df = df.reset_index(drop=True)
print(df.equals(df1))
True
print (df.index)
RangeIndex(start=0, stop=2, step=1)
print (df1.index)
RangeIndex(start=0, stop=2, step=1)
print (df.index == df1.index)
[ True True]
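As for naming the columns, one common alternative (not taken from the examples above, so treat it as a sketch) is to pass a dict to the constructor. Each key becomes a column label and each list becomes that column, so neither the transpose nor the header row is needed. The name df2 is introduced here only for illustration.

```python
import pandas as pd

p_id = ['a_1', 'a_2']
p = ['a', 'b']

# Keys become column labels; the lists become the column values.
# This assumes both lists have the same length.
df2 = pd.DataFrame({'p_id': p_id, 'p': p})

print(df2.columns.tolist())  # ['p_id', 'p']
print(df2.index.tolist())    # [0, 1]
```

Built this way, the frame already has the default index starting at 0, so no reset_index call should be required.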