I have the following dataframe:
         Person Number  Error                                              Department Name                                    Email
Country
CZ       10054609       The identifier 11380151 is used by Veronika Fi...  CZ:Supply Chain Pohořelice 1 Henkel Cosmeticos...  verca.fialova.2001@gmail.com
CZ       10054620       The identifier 11380126 is used by Radmila Val...  CZ:Supply Chain Pohořelice 1 Henkel VAS (CZM63...  rvalova1@seznam.cz
CZ       10054728       The identifier 11805326 is used by Pavel Pecka...  CZ:Supply Chain Pohořelice 3 Levis (CZM630.415...  pavlias000@seznam.cz
CZ       10054699       The identifier 11380232 is used by Sabina Love...  CZ:Supply Chain Pohořelice 3 Marks and Spencer...  s.loveckova@seznam.cz
CZ       10054727       The identifier 11805358 is used by Tereza Holč...  CZ:Supply Chain Pohořelice 3 Levis (CZM630.415...  tholcapko@seznam.cz
I need to create a column named "Error Type" that follows the condition:
What would be the best way to solve it?
EDIT:
If there are many different values, create a dictionary for the mapping and set the values in a loop:
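import pandas as pd

df = pd.DataFrame({'Error': ['The Identifier 1', 'The Identifier 3', 'The data dd', 'another data']})
# add all possible prefixes and the label each one maps to
mapping = {'The Identifier': 'Duplicated', 'The data': 'Transaction'}
# remove surrounding whitespace so startswith matches reliably
df['Error'] = df['Error'].str.strip()
for k, v in mapping.items():
    # rows that match no prefix keep NaN
    df.loc[df['Error'].str.startswith(k), 'new'] = v
print(df)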
              Error          new
0  The Identifier 1   Duplicated
1  The Identifier 3   Duplicated
2       The data dd  Transaction
3      another data          NaN
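Note that `str.startswith` is case-sensitive: the sample data in the question spells it "The identifier", while the mapping above uses "The Identifier". A loop-free variant that ignores case and writes to the requested "Error Type" column could look roughly like the sketch below. The prefixes, the labels, and the 'Other' fallback are assumptions for illustration, not values taken from the real data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Error': ['The identifier 11380151 is used by ...',
                             ' The data dd',
                             'another data']})

# Hypothetical prefix -> label mapping; replace with the real error prefixes.
mapping = {'the identifier': 'Duplicated', 'the data': 'Transaction'}

# Normalise once so matching ignores case and surrounding whitespace.
errors = df['Error'].str.strip().str.lower()

# One boolean mask per prefix; na=False treats missing values as "no match".
conditions = [errors.str.startswith(prefix, na=False) for prefix in mapping]
choices = list(mapping.values())

# The first matching condition wins; unmatched rows get the fallback label.
df['Error Type'] = np.select(conditions, choices, default='Other')
print(df)
```

Because `np.select` takes the first condition that matches, put longer or more specific prefixes earlier in the mapping if any of them overlap.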