MapD slow on text fields

In benchmarking the following query:

SELECT * from test order by c0 asc limit 1

Out of ten tests we get:

Took: 7877 ms. (7.877 s)
Took: 15617 ms. (15.617 s)
Took: 8067 ms. (8.067 s)
Took: 15924 ms. (15.924 s)
Took: 8057 ms. (8.057 s)
Took: 15864 ms. (15.864 s)
Took: 15455 ms. (15.455 s)
Took: 15245 ms. (15.245 s)
Took: 7857 ms. (7.857 s)
Took: 15624 ms. (15.624 s)

c0 is a text field. Why is this query so slow? And is there a way to fix this or do a work-around so that a query -- that would take 0.1s in any other database -- works acceptably?

Solution

How many rows in table? What is the cardinality of c0. Is it possible to get access to your benchmark details and data.

MapD has no indexes so with the sql expressed like this the sort is fully executed for the c0 column and we have not optimized for this particular style of query plan.

To get a more performant result for your query try rewriting it as follows

select * from test where c0 in (select c0 from test group by c0 order by c0 asc limit 1) limit 1;

A little long winded but I hope you will find better performance.

Using the online 400m Twitter dataset the following query comes back in around 100ms

select * from tweets_new where county_state in (select county_state from tweets_new group by county_state order by county_state asc limit 1) limit 1;