I've got a simple dataframe with three columns. One of the columns contains a database name. I first need to check if data exists, and if not, insert it. Otherwise do nothing.
Sample data frame:
clientid;region;database
135;Europe;europedb
2567;Asia;asiadb
23;America;americadb
So I created a function to apply to dataframe this way:
library(RMySQL)
check_if_exist <- function(df){
con <- dbConnect(MySQL(),
user="myuser", password="mypass",
dbname=df$database, host="myhost")
query <- paste0("select count(*) from table where client_id='", df$clientid,"' and region='", df$region ,"'")
rs <- dbSendQuery(con, query)
rs
}
Function call:
df$new_column <- lapply(df, check_if_exist)
But this doesn't work.
This is a working example of what you are asking, if I understood correctly. But I don't have your database, so we just print the query for verification, and fetch a random number as the result.
Note that by doing lapply(df,...)
, you are looping over the columns of the database, and not the rows as you want.
df = read.table(text="clientid;region;database
135;Europe;europedb
2567;Asia;asiadb
23;America;americadb",header=T,sep=";")
check_if_exist <- function(df){
query = paste0("select count(*) from table where client_id='", df$clientid,"' and region='", df$region ,"'")
print(query)
rs <- runif(1,0,1)
return(rs)
}
df$new_column <- sapply(split(df,seq(1,nrow(df))),check_if_exist)
Hope this helps.