I have a multi-column-level dataframe. I need to drop all the column levels that don't have a name (i.e. None
). I don't know in advance which levels are affected (if any at all).
What's a pythonic way to do this?
import pandas as pd
import numpy as np
# Create a MultiIndex for the columns with None values in names
index = pd.MultiIndex.from_tuples([
('A', 'X', 'I'),
('A', 'X', 'II'),
('A', 'Y', 'I'),
('A', 'Y', 'II'),
('B', None, 'I'),
('B', None, 'II'),
('B', 'Z', 'I'),
('B', 'Z', 'II'),
], names=[None, 'level_2', None])
# Create a DataFrame with random values
data = np.random.randint(0, 10, (8, 8)) # Corrected shape to match the MultiIndex
df = pd.DataFrame(data, columns=index)
print(df)
A B
level_2 X Y NaN Z
I II I II I II I II
0 5 0 3 5 0 9 6 6
1 6 2 6 3 7 1 8 0
2 7 5 1 0 3 2 9 5
3 8 6 1 5 9 0 8 2
4 4 0 0 2 4 7 0 6
5 3 8 0 5 4 8 6 8
6 2 4 2 7 4 6 5 3
7 6 5 0 3 1 6 0 4
I am looking for this result:
level_2 X Y NaN Z
0 5 0 3 5 0 9 6 6
1 6 2 6 3 7 1 8 0
2 7 5 1 0 3 2 9 5
3 8 6 1 5 9 0 8 2
4 4 0 0 2 4 7 0 6
5 3 8 0 5 4 8 6 8
6 2 4 2 7 4 6 5 3
7 6 5 0 3 1 6 0 4
You could do:
df = df.droplevel([idx for idx, name in enumerate(df.columns.names) if name is None], axis=1)