sql arrays database data-access-layer data-access

Why is queried data returned in a result set and not in an array?

Why do most programming languages use the concept of result set when returning data from a database? Why aren't the results returned directly in a more common, immediately usable structure like an array? Why is this extra layer between querying and being able to use the results better or necessary?

Solution

An array is just a container of data. A result set is a much more powerful abstraction that encapsulates a very complex interaction between the database server and the client program making the data retrieval request.

"Immediately usable"... that's very naive. Yes, of course, often you just want the data, and often everything goes well and a result set object may seem a bit of a hindrance. But you should stop a moment and think of the complexity that's behind that data retrieval you are executing.

Data fetching

The first and most important consideration is that an array is a static structure that contains all data of all rows. While that might seem like a good solution for small queries, I assure you it is not in most cases. It assumes that fetching all data will require little time and memory, which is not always the case.

RDBMS return one row at a time... that's how things work usually. That way they can serve many clients... you can also cancel your data retrieval... or the RDBMS can take you down if you are hogging too many resources.

The result set handles the complexity of fetching one row or a page of rows or all the rows from the back-end, maybe caching the result internally. It does then allow the program access to just one row of data at a time, adding methods to navigate back and forth, without having to think what is happening behind the scenes. That is not for you to know usually, but there are many optimizations and gotchas.

Unidirectional queries

Some queries on some RDBMS are more efficient if executed unidirectionally. That is you tell the server you will never need to lookup a row of data you have already fetched. But result set objects can often cache this data internally and allow the program to navigate back to it (without disturbing the server).

Updatable queries

Some RDBMS support SELECT FOR UPDATE. Result set objects can often allow the program to modify the fetched data and then handle internally all the operations necessary to reflect those updates on the underlying database... and in many languages this is possible even if the RDBMS does not support SELECT FOR UPDATE.

Better handling of exceptions

When you ask for data, if things go well you get a stream of data that can fit in an array... if things go wrong, you get a stream of information that requires a different structure to be handled. A result set object can provide the client program with that structured information... and can maybe also provide a way of recovering.

I'm adding some more info on cursors, even though it is less relevant to this question. Fetching rows from the server is done through the use of a CURSOR. It tipically envolves 4 steps (DECLARE the cursor, OPEN it, use it to FETCH data, then CLOSE it). Declaring and opening a CURSOR allocates resources on the server which are used to remember what that specific client is asking for and what data has already been returned. FETCHing allows to navigate the result set and retrieve another row of data (not necessarily the next row). Closing the cursor tells the server you are done with that request and allows it to deallocate those resources.