I have a dataframe with a numeric column (say "Total"). The values in this column can be positive, negative, or zero, with no range limit on either side of zero.
I want to create another column holding an indicator or categorical value based on the value in the 'Total' column.
For example (Objectives):
As of now, I do this by iterating over each value in the 'Total' column, passing it through an if-else function, collecting the results in a separate list, and then adding that list to the dataframe as a new column.
values = []
for each in df['Total']:
    values.append(cat1(each))
df['newcol'] = values
Here cat1 is the function that returns P/N/Z based on positive/negative/zero value in each. values is the list of values that I will create using this for-loop. Similarly, I have functions for 2 and 3 from the objectives above.
def cat1(value):
    if value > 0:
        return "P"
    elif value < 0:
        return "N"
    else:
        return "Z"
Is there a simpler and faster alternative?
Thank you for the help.
I don't know whether this approach is any quicker (apply with axis=1 still calls the function once per row), but it makes better use of pandas functionality:
def cat1(value):
    if value > 0:
        return "P"
    elif value < 0:
        return "N"
    else:
        return "Z"
# The column name must be a string; a bare newcol would raise a NameError.
df['newcol'] = df.apply(lambda row: cat1(row['Total']), axis=1)
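If speed matters, a vectorized version may be worth trying. The following is only a sketch and not part of the approach above: it swaps in `numpy.select`, which evaluates the conditions on the whole column at once instead of calling a Python function per row. The sample data is made up for illustration; only the 'Total' and 'newcol' column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the 'Total' column name comes from the question.
df = pd.DataFrame({"Total": [12.5, -3.0, 0.0, 7.0, -0.1]})

# Each condition is a boolean Series computed over the whole column.
# np.select picks the choice for the first condition that is True.
conditions = [df["Total"] > 0, df["Total"] < 0]
choices = ["P", "N"]

# Rows matching neither condition (zero, and also NaN) fall through to "Z",
# which mirrors the else branch of cat1.
df["newcol"] = np.select(conditions, choices, default="Z")
print(df)
```

The same pattern should extend to other indicator columns by changing the conditions and choices lists, as long as each rule can be written as a comparison on the column.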