I have a dataframe in R that has a list of records. I would like to delete all of the matching records from my ODBC datasource.
data_to_delete ##This is my dataframe
delete from RODBC datasource where record_id IN (select record_id
from data_to_delete)
Any ideas on how to achieve this? I am currently using sqlQuery to pull data down and sqlSave to create new records, but I'm not sure how to delete records.
As the answers to this related question explain, you will need to build up the string that holds theDELETE
statement programmatically, and then you can execute a DELETE
statement just as any other statement within sqlQuery
.
So at the end of your DELETE
, don't try to "select" from the R variable, rather just dump its contents into a comma-separated string. Below, I assume that whatever data_to_delete
is, we can use paste
with as.character
to spit it into a comma-separated string. But you can modify that as needed.
I also assume that you mean for datasource
to refer to the database, and I use some_table_name
as a placeholder for the table's name within that database. You should adjust the values for your specific case.
del_query <- sprintf("DELETE from some_table_name
where record_id IN (%s)",
paste(as.character(data_to_delete),
collapse=", ")
)
sqlQuery(datasource, del_query)
Using this method, you need to be careful if the content of your data is large. There is often a symbol limit for the number of distinct symbols that can appear in the IN
clause -- something like 2000 symbols as default in SQL Server.
If you need more values than that, you could first issue a query that writes the values into a temp table wholly on the database, and then issue a query with sqlQuery
that contains the extra IN ( SELECT * FROM ... )
but you'll be querying from that temp table, not "from" the R variable.