I have a table in MYSQL that contains the user interactions with a Web Page, I needed to extract the rows for the users where the date of that interaction is lower than a certain benchmark date and that benchmark date is different for each customer (I extract that date from a different database).
My approach was to set a json variable in which the key is a user and the value is the benchmark date, and used it in the query to extract the intended fields.
Example in R:
#MainDF contains the user and the benchmark date from a different database
json_str <- mapply(function(uid, bench_date){
paste0(
'{','"',cust,'"', ':', '"', bench_date, '"','}'
)
}, MainDF[, 'uid'],
MainDF[, 'date']
)
json_str <- paste0("'", '[', paste0(json_str , collapse = ','), ']', "'")
temp_var <- paste('set @test=', json_str)
The intention was to make temp_var to be like:
set @test= '{"0001":"2010-05-05",
"0012":"2015-05-05",
"0101":"2018-07-20"}'
but it actually looks like :
set @test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
then create the main query:
main_Q <- "select user_id, date
from interaction
where 1=1
and json_contains(json_keys(@test), concat('\"',user_id,'\"')) = 1
and date <= json_unquote(json_extract(@test,
concat('$.','\"',user_id, '\"')
)
)
"
For the execution, first, set the temporal variable and then execute the main query
dbSendQuery(connection, temp_var)
resp <- dbSendQuery(connection, main_Q )
target_df <- fetch(resp, n=-1)
dbClearResult(resp )
When I test a fraction of it in a SQL IDE it does works. However, in R it doesn't return anything. I think that the issue is that R escape the double quotes in temp_var and SQL end up reading
set @test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
which is not won't work. For example if I execute:
set @test= '{"0001":"2010-05-05",
"0012":"2015-05-05",
"0101":"2018-07-20"}'
select json_keys(@test)
it will return an array with the keys, but that is not the case with
set @test= '{\"0001\":\"2010-05-05\",
\"0012\":\"2015-05-05\",
\"0101\":\"2018-07-20\"}'
select json_keys(@test)
I am not sure how to solve the issue, but I need double quotes to specify the JSON. Is there any other approach that I should try or a way to make this work?
First, I think it is generally better to use a well-known library/package for converting to/from JSON, for several reasons.
This gives you a string that you should be able to place just about anywhere.
json_str <- jsonlite::toJSON(setNames(as.list(MainDF$date), MainDF$uid), auto_unbox=TRUE)
json_str
# {"0001":"2010-05-05","0012":"2015-05-05","0101":"2018-07-20"}
And while looking at the object on the R console will give the escaped-doublequotes,
as.character(json_str)
# [1] "{\"0001\":\"2010-05-05\",\"0012\":\"2015-05-05\",\"0101\":\"2018-07-20\"}"
that is merely R's representation (shows all strings within double-quotes, and therefore needs to escape any double-quotes within the string).
Adding it into some script should be straight-forward:
cat(paste('set @test=', sQuote(json_str)), '\n')
# set @test= '{"0001":"2010-05-05","0012":"2015-05-05","0101":"2018-07-20"}'
I'm assuming that having each on its own row is not critical. If it is, and indentation is important, perhaps this is more your style:
spaces <- strrep(' ', 2+nchar('set @test = '))
cat(paste0('set @test = ', sQuote(gsub(",", paste0(",\n", spaces), json_str))), '\n')
# set @test = '{"0001":"2010-05-05",
# "0012":"2015-05-05",
# "0101":"2018-07-20"}'
Data:
MainDF <- read.csv(stringsAsFactors=FALSE, colClasses='character', text='
uid,date
0001,2010-05-05
0012,2015-05-05
0101,2018-07-20')