Search code examples
sqlnode.jsrdbmsknex.js

RDBMS results return, ordering and returning sets/hashmaps instead of arrays/lists


Most/all SQL based RDBMS connector libraries I've come into contact with will return results in array form. Why? If the order is arbitrary (without a sorting SQL modifier), then could the natural data return be in the form of something like a Set or Hashmap? These data structures would, in some cases, be more computationally favorable at scale than a typical array/list return in languages like C++ (with standard template library usage), JavaScript/Node, Go, and any other language that supports associative data types or pure Sets.

In particular, do libraries such as knex.js offer such a feature in the form of a connection flag (I haven't found it yet)?

Do any of the major RDBMS systems (MySQL, PostgreSQL, ...), offer the ability to return results in a set/hashmap form?

Concretely, what I think would make sense using node.js and a library like knex.js, is to specify a flag like:

knex.forceMap('keycolumnpattern') Or, knex.forceSet()...

Again, the underlying assumption here would be that you are not imposing order on the SQL (or other) query by adding a sort directive i.e. ORDER BY

The justification for this is in environments were scaling and computational complexity are important concerns.


Solution

  • Good question.

    This is by no means a comprehensive answer, but just my opinion on this curious question.

    Usually databases return a series of rows, that most documentation refers to as "result set".

    The database

    Now, this result set is assembled when the query is executed and may take different forms. Most likely the database sends it as an "enumeration": this is, a list-like entity that produces rows when you request them. In order to save resources database will try not to materialize the whole result set at once, but to produce rows as you read them from your client application. Well, this happens as long as the query can be "pipelined".

    When the query cannot be pipelined, then the whole data set (in the database side) is materialised.

    The driver

    You client driver does not retrieve rows one by one but in groups of them by the use of buffering. Even when the query cannot be pipelined your client driver will also retrieve the rows in groups according to the "fetch size" and "buffer size".

    The client technology

    Your application can use basic driver primitive operations, or a more sophisticated ORM. It's common that ORMs will hide all inner workings of the driver and will offer you a "simple" result like an array, list, or map, i.e., hiding the "streaming" an enumeration provides.

    If you don't use an ORM, then you will probably call the driver primitives yourself and therefore you can get access to all inner, ugly details. The upside is that you can assemble the result set rows in any data structure you prefer.

    In any case, the repertoire of data structures will depend on the specific query since a "map" or a "set" will require some king of unique identifier, while a list won't.