I have 200 plus columns by 10000 plus rows. I am trying to find the name of the column that contains a specific value in the row. In the example below, the specific value is in the value column. How I identified the value in the 'Value' column is not important.
Example: The Value_Col is the value I am trying to return.
Date Time A B C D E F Value Value_Col
Jan1 1245 3.0 3.2 4.6 5.7 2.1 8.0 5.7 D
Jan2 1045 4.5 8.4 3.9 2.2 9.4 8.3 3.9 C
Jan3 1350 1.4 3.3 4.5 8.9 1.4 0.4 1.4 A
I want to search only columns A through F and find the column name for the first instance (leftmost) the value exists. In the example, my value of interest appears twice in the row beginning with the Date of Jan3. I want to basically index the column names the value appears and select the first one. I understand this would be index 0 (or [0]) based on the list that is returned for the value search.
The example above is only a small subset of data. I currently have a list of all column names I want the value search to occur in. The value of interest can occur through many of the columns in the same row.
I want to search only columns A through F and find the column name for the first instance (leftmost) the value exists
You can use idxmax
on axis=1
after comparing Value
column with the slice of the datframe (using .loc[]
)
df['Value_Col'] = df.loc[:,'A':'F'].isin(df['Value']).idxmax(1)
print(df)
Date Time A B C D E F Value Value_Col
0 Jan1 1245 3.0 3.2 4.6 5.7 2.1 8.0 5.7 D
1 Jan2 1045 4.5 8.4 3.9 2.2 9.4 8.3 3.9 C
2 Jan3 1350 1.4 3.3 4.5 8.9 1.4 0.4 1.4 A
If there are chances that none of the column may contain the df['Value]
value , you can use:
m = df.loc[:,'A':'F']
df['Value_Col'] = m.isin(df['Value']).dot(m.columns).str[0]