We have an external PostgreSQL server that we pull data from using ODBC. We typically pull data in chunks as needed, e.g. "where date > (some specific date)", "where id > (some specified id)", or between two dates (start_date and end_date). Example code below.
Those conditions are pretty straightforward and can be implemented without burdening the server too much.
But it's a different story when we need to pull the entire dataset (there is too much data on the server). Is there a way to pull data from the server in batches so as not to overburden it?
Below is an example of what we use:
getdata:{[]
  query:"select data.id, data.first_name, data.last_name, data.email, data.created_at from data where data.created_at > '2020-02-04' order by id asc";
  us::.odbc.open `dbs;          / open connection to the DSN
  leads::.odbc.eval[us; query]; / run the query, store result globally
  .odbc.close us;
 };
dbs = server name
Could you create the date batches on the kdb+ side and generate a startDate-endDate query for each? E.g. create 5 years of start/end date pairs beginning at 2002.01.01:
q){(`date$s),'-1+`date$12+s:(12*til y)+`month$x}[2002.01.01;5]
2002.01.01 2002.12.31
2003.01.01 2003.12.31
2004.01.01 2004.12.31
2005.01.01 2005.12.31
2006.01.01 2006.12.31
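Those pairs could then drive one query per batch. A sketch along those lines, reusing the `dbs` DSN and `data` table from the question (the BETWEEN clause and the raze of the per-batch results are illustrative, and `ssr` converts q's dotted date format to the dashed format PostgreSQL expects):

```q
/ sketch: one BETWEEN query per (start;end) date pair, results razed together
/ assumes the dbs DSN and data table from the question
getbatches:{[]
 pairs:{(`date$s),'-1+`date$12+s:(12*til y)+`month$x}[2002.01.01;5];
 us:.odbc.open `dbs;
 / build the SQL for one pair; ssr turns 2002.01.01 into 2002-01-01
 qry:{"select id, first_name, last_name, email, created_at from data where created_at between '",
   ssr[string x;".";"-"],"' and '",ssr[string y;".";"-"],"'"};
 res:raze .odbc.eval[us]each qry ./: pairs; / run each batch, join the results
 .odbc.close us;
 res}
```

Each round trip then only carries a year's worth of rows, which keeps any single query from hammering the server.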
I guess you would also have to loop over each table on the server.