My data is below.
In my Activity description column I have many charges.
Some strings contain the pattern "charge", some contain "charges", and some contain neither.
1. First, I need to find the pattern "charge" and replace it with "charges".
2. For two of the charges, container charges and store charges, I need the singular "charge" instead of "charges". Ex. "Container charge", not "Container charges".
3. If the pattern "charge" is not present at all, I need to append "charges" to the end of the string.
For question 1, I tried the code below in R:
df$Activity description = gsub("*charge","charges",df$Activity description)
But it adds an extra "s" in the output, e.g. "Chargess". I don't know why.
For questions 2 and 3, I don't know how to start.
Can anyone help me with this?
First, I highly recommend you use column names without spaces (e.g. Activity_description).
Next, you probably want to use a series of if-else statements:
new_column <- c()
for (line in df$Activity_description) {
  # check for the two specific cases
  if (line == "Container Tracking Charges") {
    new_column <- c(new_column, "Container Tracking Charge")
  } else if (line == "Store Tracking Charges") {
    new_column <- c(new_column, "Store Tracking Charge")
  } else if (grepl("Charge$", line)) {
    new_column <- c(new_column, paste(line, "s", sep = ""))
  } else if (!grepl("Charge", line)) {
    new_column <- c(new_column, paste(line, "Charges"))
  } else {
    new_column <- c(new_column, line)
  }
}
You may then set the original column using the new character vector:
df$Activity_description <- new_column
This may be a bit simple since it's done in base R, but it should at least get you started.
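As for the doubled "s" in your gsub attempt: a pattern like "charge" also matches the first six letters of "charges", so the replacement inserts "charges" and the original trailing "s" is left behind, giving "chargess". Making the "s" optional in the pattern ("charges?") lets the existing plural be replaced as a whole. The leading * has nothing in front of it to repeat, so it is best dropped. Below is a rough vectorized sketch of all three steps; the sample vector x is made up for illustration, and I am assuming the capitalized "Charge" spelling used in the loop above.

```r
# Hypothetical sample values; in practice use x <- df$Activity_description
x <- c("Handling Charge", "Freight Charges", "Container Tracking Charges",
       "Store Tracking Charges", "Documentation")

# Step 1: "Charges?" matches "Charge" with an optional trailing "s",
# so an existing plural is not given a second "s".
out <- sub("Charges?$", "Charges", x)

# Step 3: append " Charges" where the word does not appear at all.
no_charge <- !grepl("Charge", out)
out[no_charge] <- paste(out[no_charge], "Charges")

# Step 2: the two exceptions stay singular.
singular <- c("Container Tracking Charges", "Store Tracking Charges")
is_exception <- out %in% singular
out[is_exception] <- sub("s$", "", out[is_exception])

print(out)
```

If your data mixes "charge" and "Charge", sub() and grepl() both accept ignore.case = TRUE, but you would then need to decide which capitalization the replacement should use.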