Search code examples
pythonpandasdataframevalueerrorambiguous

How to prepend '-' character to elements in a pandas column containing particular characters? Getting `ValueError`


I have a Dataframe from a csv containing 'TransactionAmounts' from customers. The system that exported the reported appends a 'CR' to certain rows, as such: TransacAmt Column

I would like to add a negative sign infront of the values that do have a 'CR' in them and leave the rest as is. I would then later remove the CR and ideally would be left with only negative and positive values.

I am getting ValueError and str is not type str object errors.

I wrote:

ctr['TransactAmtNew'] = ctr['TransactAmt'].apply(lambda x: '-' + ctr.TransactAmt.astype(str) 
                                      if ctr['TransactAmt'].str.contains('CR') else x)
ctr.head(15)

which gives me the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I understand what the error means however do not know how to solve it. I did try with any() and all() to no avail.

I also tried using:

ctr.loc[ctr['TransactAmt'].str.contains('CR'), 'TransactionAmt'] = '-' + ctr['TransactAmt'].astype(str)

which fills in NaN for the rows the query doesn't evaluate to true.


Solution

  • This takes a bit of typecasting. You'll need to convert the TransactAmt column to a string, if it isn't already a string from your csv file. Then, you convert to a float - making sure to multiply by -1 if there is a "CR" in the entry. List comprehension here will do the trick.

    df["TransactAmt_clean"] = [-float(x.replace("CR", "")) if "CR" in x else float(x) for x in df["TransactAmt"].astype('str')]