Search code examples
sqldatabaseoracle-databasesearchoracle12c

SQL based full text search for given args within group of rows


I'm trying to search for specific data in the database table (Oracle 12c). I want to search for specific texts in row groups. Each group have specific ID, so I would like to get ID of the group if all of the searching arguments can be found.

I prepared sample table but with some simplifications: - In real table there is more than 20 columns and millions of rows. - I converted real values to some shorter version like a or b, in real table there are VARCHAR(500) columns - There can be thousands of rows in the same group (same ID) - The search have to be fast, so manipulating too much of this data or many nested queries might not be an option

Sample Table:

+----+----+---+---+----+
| ID | A  | B | C | D  |
+----+----+---+---+----+
|  1 | aq | a | a | a  |
|  1 | a  | a | c | ad |
|  1 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  3 | a  | a | a | a  |
|  3 | a  | a | a | a  |
|  3 | a  | d | a | a  |
+----+----+---+---+----+

Sample Cases:

+------+-------------+-----------+
| Case |  Searching  |  Expected |
+------+-------------+-----------+
|    1 | `q` and `c` | [1]       |
|    2 | `a` and `d` | [1, 3]    |
|    3 | `a` and `q` | [1]       |
|    4 | `a`         | [1, 2, 3] |
+------+-------------+-----------+

Case 1: ID = 1 - matching q and c in two rows Result = Row [1]

+----+----+---+---+----+
| ID | A  | B | C | D  |
+----+----+---+---+----+
|  1 | aq | a | a | a  | <-- q
|  1 | a  | a | c | ad | <-- c
|  1 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  2 | a  | a | a | a  |
|  3 | a  | a | a | a  |
|  3 | a  | a | a | a  |
|  3 | a  | d | a | a  |
+----+----+---+---+----+

Case 2: ID = 2 - doesn't have d anywhere Result: Rows [1, 3]

+----+----+---+---+----+
| ID | A  | B | C | D  |
+----+----+---+---+----+
|  1 | aq | a | a | a  | <-- a
|  1 | a  | a | c | ad | <-- a, d
|  1 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | d | a | a  | <-- a, d
+----+----+---+---+----+

Case 3: ID = 1, matching q and c in single row Result: Row [1]

+----+----+---+---+----+
| ID | A  | B | C | D  |
+----+----+---+---+----+
|  1 | aq | a | a | a  | <-- a, q
|  1 | a  | a | c | ad | <-- a
|  1 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | d | a | a  | <-- a
+----+----+---+---+----+

Case 4: We have a everywhere Result: Rows [1, 2, 3]

+----+----+---+---+----+
| ID | A  | B | C | D  |
+----+----+---+---+----+
|  1 | aq | a | a | a  | <-- a
|  1 | a  | a | c | ad | <-- a
|  1 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  2 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | a | a | a  | <-- a
|  3 | a  | d | a | a  | <-- a
+----+----+---+---+----+

Any help appreciated :), thanks


Solution

  • You can try the following code:

    SELECT
        ID
    FROM
        (
            SELECT
                ID,
                RTRIM(XMLAGG(XMLELEMENT(E, A || B || C || D, ',').EXTRACT('//text()')).GETCLOBVAL(), ',') 
                AS CONSOLIDATED_VALUE
            FROM
                T
            GROUP BY
                ID
        )
    WHERE
        CONSOLIDATED_VALUE LIKE '%q%'
        AND CONSOLIDATED_VALUE LIKE '%c%'
    

    Demo

    Cheers!!