python pandas conditional-statements extract multiple-columns

pandas select multiple columns conditionally

Suppose I have a dataframe:

C1 V1 C2 V2 Cond
1  2  3  4  X  
5  6  7  8  Y  
9  10 11 12 X

The statements should return: if Cond == X, pick C1 and v1, else pick C2 and V2.

The output dataframe is something like:

** EDIT: To add one more requirement: the number of columns can change but follow some naming pattern. In this case select all columns with "1" in it, else with "2". I think the hard-coded solution might not work.

Solution

I try create more general solution with filter and numpy.where, for new column names use extract:

#if necessary sort columns
df = df.sort_index(axis=1)

#filter df by 1 and 2
df1 = df.filter(like='1')
df2 = df.filter(like='2')
print (df1)
   C1  V1
0   1   2
1   5   6
2   9  10

print (df2)
   C2  V2
0   3   4
1   7   8
2  11  12

#np.where need same shape of mask as df1 and df2
mask = pd.concat([df.Cond == 'X']*len(df1.columns), axis=1)
print (mask)
    Cond   Cond
0   True   True
1  False  False
2   True   True

cols = df1.columns.str.extract('([A-Za-z])', expand=False)
print (cols)
Index(['C', 'V'], dtype='object')

print (np.where(mask, df1,df2))
Index(['C', 'V'], dtype='object')
[[ 1  2]
 [ 7  8]
 [ 9 10]]

print (pd.DataFrame(np.where(mask, df1, df2), index=df.index, columns=cols))
   C   V
0  1   2
1  7   8
2  9  10