I have a dataset with key-value pairs that I want to import into R. The keys and values are separated by colons, while the key-value pairs are separated by commas. However, some of the values contain commas or colons, which can cause confusion when importing the data into R. To avoid this issue, I need to replace the commas and colons in the values with a different character before importing the data. For example:
{'AI': 'C3.ai, Inc.', 'BA': 'Boeing Company (The)', 'AAL': 'American Airlines Group, Inc.', 'MA': 'Mastercard :Incorporated'}
to
{'AI': 'C3.ai| Inc.', 'BA': 'Boeing Company (The)', 'AAL': 'American Airlines Group| Inc.', 'MA': 'Mastercard |Incorporated'}
I have tried this:
library(stringr)

replacer <- function(x) {
  str_replace_all(x, "[,:]", "|")
}
clean_lines <- str_replace_all(lines, "(?<=')[^']*[:.][[:space:]]*[^']*[[:space:]]*[^']*(?=')", replacer)
cat(clean_lines)
This works fine for commas but mangles the colons. Here is the result:
{'AI': 'C3.ai| Inc.', 'BA': 'Boeing Company (The)', 'AAL': 'American Airlines Group| Inc.','MA': 'Mastercard :Incor| porated'}
How can I edit this code so that it replaces commas and colons only inside the quoted values (within ' ')?
This is almost JSON, so read it as such. To make it valid JSON, first replace the single quotes (') with double quotes ("), then parse it with the jsonlite package:
library(jsonlite)
# example file
writeLines("{'AI': 'C3.ai, Inc.', 'BA': 'Boeing Company (The)', 'AAL': 'American Airlines Group, Inc.', 'MA': 'Mastercard :Incorporated'}",
"tmp.txt")
# read from file
x <- readLines("tmp.txt")
# swap single quotes for double quotes so the text is valid JSON
x <- gsub("'", "\"", x, fixed = TRUE)
fromJSON(x)
# $AI
# [1] "C3.ai, Inc."
#
# $BA
# [1] "Boeing Company (The)"
#
# $AAL
# [1] "American Airlines Group, Inc."
#
# $MA
# [1] "Mastercard :Incorporated"
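If you still want the commas and colons in the values replaced with |, it is probably easier to do that after parsing, when each value is already a separate string. The following is a minimal sketch rather than a definitive solution: the names line, parsed, cleaned, m and fixed_line are made up for illustration, and the quote swap assumes that no key or value contains an apostrophe. The second half shows a string-level variant in base R that matches each quoted chunk with '[^']*' and rewrites only inside those matches, which is closer to what the original regex was attempting.

```r
library(jsonlite)

# Same sample line as in the question
line <- "{'AI': 'C3.ai, Inc.', 'BA': 'Boeing Company (The)', 'AAL': 'American Airlines Group, Inc.', 'MA': 'Mastercard :Incorporated'}"

# Variant 1: parse first, then clean the values.
# Assumes no key or value contains an apostrophe.
parsed <- fromJSON(gsub("'", "\"", line, fixed = TRUE))

# Replace every comma or colon inside each value with a pipe
cleaned <- lapply(parsed, function(v) gsub("[,:]", "|", v))

# Variant 2: stay at the string level (base R only).
# Find every single-quoted chunk, then rewrite only inside those chunks,
# so the separators between keys and values are left alone.
m <- gregexpr("'[^']*'", line)
fixed_line <- line
regmatches(fixed_line, m) <- lapply(
  regmatches(fixed_line, m),
  function(s) gsub("[,:]", "|", s)
)

cat(fixed_line, "\n")
```

The first variant gives you a named list you can turn into a data frame directly. The second keeps the original text layout, which may be useful if another tool has to read the file afterwards.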