Search code examples
mysqlspring-bootperformancespring-data-jpaquery-optimization

How to design good search funcionality using CriteriaQuery when the number of columns is high and customized?


Intro

I'm using spring boot (with spring-data-jpa) with Angular as the frontend. Plant table has currently ~100 columns.

What I'm doing, the starting point

The task I'm working on at the moment is to optimize the search in term of performance.

The use case I'm trying to implement

The Angular search result page/component show only 6 columns but the user has option to replace these columns with different 6 columns (customization).

Now, to me it doesn't make sense to allow the user to select from 100 columns so we decided to limit the list to smaller, let's say ~30.

This is the code:

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Plant> cq = cb.createQuery(Plant.class);
Root<Plant> plant = cq.from(Plant.class);

The user perform search and see the 6 default columns in a table, if he change the columns to others, would you go back to the server or you fetch (select) the 30 in each call.


Solution

  • That can become very subjective question depending on where your analysis point is focused:

    • From a DB perspective: If you are querying 6 or 30 columns on the same table should not make a big difference in performance. However, this can changes if additional columns require joins to additional tables. The issue would be the amount of DB hits (how many times you go to DB) if you choose to go back to the backend during customization as it will increase the time DB is processing something.

    • From Network perspective: Bigger responses will have an impact on response time, data usage, etc. So it can be good to think how often users tend to customize columns, having the 30 columns all the time will consume more bandwidth, it may add a few milliseconds or even seconds depending on how big the additional data can be as well as Network conditions. If he feature is rarely used, I would rather avoid this performance impact (personal opinion), but if it is often used, preloading the data may have a better UX impact.

    • Not performance related, but I would also analyze Security impact. Although the user can customize the columns, giving all the data (30 columns) all the time may be a security concern if there is any information that should not be shared all the time, but it could be OK if it is public-domain or low risk data

    Conclusion

    Since the original question seems to relate to the DB perspective only, I would opt for retrieving all the 30 columns in one go, given that it will lower the DB hits upon column customization without lowering the single query latency (and there is no other restriction enounced in the question).