Given the following:
CREATE TABLE filters (
id BIGINT NOT NULL
, name TEXT NOT NULL
, range NUMRANGE NOT NULL
, UNIQUE (id, name)
);
INSERT INTO filters (id,name,range) VALUES (99,'value1','[45,90]'::NUMRANGE);
INSERT INTO filters (id,name,range) VALUES (99,'value2','[15,50]'::NUMRANGE);
INSERT INTO filters (id,name,range) VALUES (23,'value1','[45,90]'::NUMRANGE);
CREATE TABLE data (
value1 NUMERIC
, value2 NUMERIC
);
EDIT - Added sample data rows
// filter.id = 99, neither value1 or value2 match this row
INSERT INTO data (value1,value2) VALUES (40.01, 11.12);
// filter.id = 99, only value2 matches this row
INSERT INTO data (value1,value2) VALUES (15.48, 20.14);
// filter.id = 99, only value1 matches this row
INSERT INTO data (value1,value2) VALUES (53.48, 70.14);
// filter.id = 99, both value1 and value2 match this row
INSERT INTO data (value1,value2) VALUES (64.12, 33.48);
The filters table consists of a series of filters, grouped by id.
I want to apply a specific group of filters (i.e. filters.id = 99) to the data table where each row of filters would apply filters.range to the column of data that matched filters.name.
Originally, the schema included a 1:1 column duplication in filters and data, such that a join could be accomplished with filters.col3 = data.col3 or by a ranged comparison. The problem is I know have over 1000 filters I'd like to track in the filters table and apply against the data table.
Tried some googling but not sure how to phrase this type of question. I know I could just form the queries in an external language, but I'd like to execute this in pure SQL/PostgreSQL if possible.
You need dynamic SQL for that.
Add a surrogate PK to your table for several reasons, one of which being a stable sort order for filters within the same group. Another: always add a PK.
CREATE TABLE filter (
filter_id serial PRIMARY KEY -- add surrogate PK
, filter_grp int NOT NULL
, name text NOT NULL
, range numrange NOT NULL
);
This function does it all:
CREATE OR REPLACE FUNCTION f_get_data(_filter_grp int)
RETURNS SETOF data AS
$func$
DECLARE
_where text;
_arr numrange[];
BEGIN
SELECT string_agg(name || ' <@ $1[' || rn || ']', ' AND ') -- AND?
,array_agg(range)
INTO _where, _arr
FROM (
SELECT name, range
,row_number() OVER (ORDER BY filter_id) AS rn
FROM filter
WHERE filter_grp = _filter_grp
ORDER BY filter_id
) sub;
-- RAISE NOTICE '%', 'SELECT * FROM data WHERE ' || _where;
RETURN QUERY EXECUTE 'SELECT * FROM data WHERE ' || _where
USING _arr;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_get_data(99);
This builds and executes a query of the form:
SELECT * FROM data
WHERE value1 <@'[45,90]'::numrange
AND value2 <@'[15,50]'::numrange;
SELECT * FROM data
can conveniently be coupled with SETOF data
as return type of the function.
Avoid converting values to text and back when building the query string. Instead provide values with the USING
clause, and use $n
placeholders in the query sting for EXECUTE
.
In this particular case, the number of values is also dynamic. We can dodge the bullet by using array indices which automatically correspond to the position of the value in the aggregated array.