I have data that I had to export from Airtable into a CSV file. There are some columns where multiple pieces of information are included, separated by commas. However, in some instances the separate pieces of information also include a comma, so that information is contained within quotes, like this:
df <- data.frame(str=c("A,B,C","D,E",'H,"I,J"'))
I'm trying to separate the pieces, by comma, into separate columns...but for row 3 I need "I, J" to stay together in the 2nd separated column (ideally, I'd like to remove the quotes after the split). However, when I use the following these two pieces are in separate columns:
df_splt <- df %>% separate_wider_delim(str, delim = ",", names = c(paste0("str_", seq_len(max(str_count(df$str, ",") + 1)))), too_few="align_start")
I see there is a separate_wider_regex, but I'm unclear how that would work given I have varying numbers of commas within str.
You can use read.csv()
which will only parse the commas that are not inside quotes, i.e.
setNames(read.csv(text = df$str, header = FALSE, stringsAsFactors = FALSE),
paste0("str_", seq_len(max(stringr::str_count(df$str, ",") + 1))))
str_1 str_2 str_3
1 A B C
2 D E
3 H I,J