I'm fairly new to R, and after taking a 2-hour free course on YouTube, I feel no better. I'm trying to learn, so I hope someone can help me out! I feel close to the answer, but here I am :D I have a dataset, and I've modified two of its columns by editing them as strings (characters). They consist of people's first names (1st column) and last names (2nd column). I was told to remove punctuation, so I had to edit them as strings. Now I'm unsure how to add them back into the data frame. Here is where I'm at.
# FILE: Vaccine_CSV
# INSTALL AND LOAD PACKAGES
library(datasets) # Load base packages manually
# Use pacman to load add-on packages as desired
pacman::p_load(pacman, rio)
# Importing CSV from desktop
Vaccine_CSV <- import("~/Desktop/Vaccine CSV.csv")
# Summary
summary(Vaccine_CSV)
# Transform lowercases in data into upper case
Vaccine_CSV = as.data.frame(sapply(Vaccine_CSV, toupper))
Vaccine_CSV$FirstName
Vaccine_CSV$LastName
# Trim the spaces between the names
trimws(Vaccine_CSV$FirstName)
trimws(Vaccine_CSV$LastName)
# Extract the first and last name columns (columns 3 and 4)
FirstNameFixed <- Vaccine_CSV[, 3]
LastNFixed <- Vaccine_CSV[, 4]
# Trimming inside the first name column
FirstNameFixed <- gsub("\\-", "", FirstNameFixed)
FirstNameFixed <- gsub("\\s", "", FirstNameFixed)
FirstNameFixed <- gsub("\\'", "", FirstNameFixed)
# Trimming inside last name column
LastNFixed <- gsub("\\-", "", LastNFixed)
LastNFixed <- gsub("\\s", "", LastNFixed)
LastNFixed <- gsub("\\'", "", LastNFixed)
I think the dplyr package will be a friend here.
Once you have applied toupper, your code can be written as shown:
library(dplyr)
Vaccine_CSV$FirstName <- Vaccine_CSV$FirstName %>% trimws() %>% gsub("\\-", "", .) %>% gsub("\\s", "", .) %>% gsub("\\'", "", .)
and the data frame column will be updated. The same pipeline works for LastName.
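If you would rather not repeat the chain for each column, here is a minimal base R sketch of the same cleanup applied to both name columns at once. The toy data frame and the helper name clean_name are made up for illustration, and the single character class is my assumption about what needs removing (hyphens, apostrophes and whitespace, as in the question).

```r
# Toy data standing in for the real file (hypothetical values)
toy <- data.frame(
  FirstName = c(" Mary-Ann ", "D'Arcy"),
  LastName  = c("O'Neil", "van der Berg "),
  stringsAsFactors = FALSE
)

# Hypothetical helper: upper-case, then drop hyphens, apostrophes and all whitespace
clean_name <- function(x) {
  gsub("[-'[:space:]]", "", toupper(x))
}

# Apply the helper to both columns and assign the results back into the data frame
name_cols <- c("FirstName", "LastName")
toy[name_cols] <- lapply(toy[name_cols], clean_name)
```

Because the result of lapply is assigned to toy[name_cols], the cleaned values replace the originals, and any other columns are left alone.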
On the other hand, if you want to work with vectors rather than data frames, once you have FirstNameFixed and LastNFixed with all operations done, you can combine them:
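# data.frame() keeps the result a data frame; cbind() on two character vectors returns a matrix
new_df <- data.frame(FirstNameFixed, LastNFixed)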
And if you want to put them back into the data frame:
Vaccine_CSV$FirstName <- FirstNameFixed
Vaccine_CSV$LastName <- LastNFixed
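The reason the assignment matters: functions such as trimws and gsub return a new vector and leave their input untouched, so a call whose result is not stored changes nothing. A small sketch with made-up data (the toy data frame and values are hypothetical):

```r
# Toy column with stray spaces and a hyphen (hypothetical values)
toy <- data.frame(FirstName = c("  ANNA ", "MARY-ANN"), stringsAsFactors = FALSE)

# Calling trimws() produces a cleaned copy; the data frame itself is not modified
trimmed <- trimws(toy$FirstName)
unchanged <- identical(toy$FirstName, c("  ANNA ", "MARY-ANN"))

# Only assigning the result back updates the column
toy$FirstName <- gsub("-", "", trimmed)
```

Here unchanged is TRUE after the trimws call, and the column only holds the cleaned names once the last line has run.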