I have a DataFrame with MultiIndex columns, and I want a single flat array of all the values in the 'a' sub-columns (one under each top-level key). The code below works, but it is hard to read, write, and reason about. Is there a more idiomatic way to express the same idea?
import numpy as np
import pandas as pd
x = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
y = pd.DataFrame({'a': [11, 12, 13], 'b': [21, 22, 23]})
df = pd.concat({'x': x, 'y': y}, axis=1)
x = np.concatenate(df.loc[:, (slice(None), 'a')].values)
df:
   x      y
   a  b   a   b
0  1  1  11  21
1  2  2  12  22
2  3  3  13  23
x:
[ 1 11  2 12  3 13]
Approach 1: xs
np.ravel(df.xs('a', level=1, axis=1))
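To make the one-liner above easier to follow, here is a minimal sketch that splits it into two steps, reusing the frame from the question. The names `a_cols` and `flat` are only illustrative and do not appear in the original.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
x = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
y = pd.DataFrame({'a': [11, 12, 13], 'b': [21, 22, 23]})
df = pd.concat({'x': x, 'y': y}, axis=1)

# Cross-section on the columns: keep every column whose level-1 label is 'a'.
# xs drops the selected level by default, so the remaining columns are 'x' and 'y'.
a_cols = df.xs('a', level=1, axis=1)

# np.ravel flattens row by row (C order), so the x and y values interleave.
flat = np.ravel(a_cols)
print(flat)
```

Because the flattening is row-major, the values come out in the same interleaved order as the `np.concatenate` expression in the question.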
Approach 2: stack
list(df.stack(0)['a'])
Result (shown as a list; Approach 1 returns the same values as a NumPy array)
[1, 11, 2, 12, 3, 13]
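As a quick check, the stack approach can be compared against the expression from the question. This is a sketch, assuming a NumPy array is wanted rather than a list (hence `to_numpy()` in place of `list(...)`); `stacked`, `via_stack`, and `original` are illustrative names. Depending on the pandas version, `stack` may emit a FutureWarning about its newer implementation, which should not affect the result here.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
x = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
y = pd.DataFrame({'a': [11, 12, 13], 'b': [21, 22, 23]})
df = pd.concat({'x': x, 'y': y}, axis=1)

# Move the top column level ('x', 'y') into the row index.
# The result has ordinary columns 'a' and 'b' and a (row, key) MultiIndex.
stacked = df.stack(level=0)

# Select 'a' and convert to a flat NumPy array.
via_stack = stacked['a'].to_numpy()

# The expression from the question, for comparison.
original = np.concatenate(df.loc[:, (slice(None), 'a')].values)

print(np.array_equal(via_stack, original))
```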