Search code examples
sqlsql-servermongodbmonetdbdatabase

Database recommendation


I have a collection of data that looks as follows:

id   name     c1    c2    c3    c4   ...  c50
-----------------------------------------------
1    string1  0.1   0.32  0.54 -1.2  ...  2.3
2    string2  0.12  0.12 -0.34  2.45 ...  1.3
...
(millions of records)

So I have an id column, a string column, then 50 floating point columns.

There will be only one type of query run on this data that in a traditional SQL SELECT statement would look like this:

SELECT name FROM table WHERE ((a1-c1)+(a2-c2)+(a3-c3)+...+(a50-c50)) > 1; where a1,a2,a3,etc are values that are generated before the query is sent (not housed in the data table).

My question is this: Does anyone have any recommendations as to what type of database would handle this type of query the fastest. I have used SQL server (which is majorly slow), so I am looking for other opinions.

Would there be a way to optimize SQL server for this type of query? I have also been curious about column store databases such as MonetDB. Or perhaps a document store database such as MongoDB. Does anyone have any suggestions?

Many thanks, Brett


Solution

  • You can continue using SQL Server and use a persisted computed column that calculates the sum of all the values and index that.

    ALTER TABLE tablename ADD SumOfAllColumns AS (c1 + c2 + ... + c50) PERSISTED
    

    Then you can rearrange your query as:

    SELECT name FROM tablename WHERE SumOfAllColumns < a1+a2+a3+...+a50 - 1
    

    This query will be able to use the index on the computed column and should find the relevant rows quickly.