Search code examples
mysqlsqlperformancepostgresqldatabase-performance

How does views calculated columns impact performance?


I understand from this question that SQL language does support calculated columns in views.

I have a requirement where I have a table with multiple columns, and I need to calculate a sorting column in order to simplify my queries. I am thinking of creating a view for my origin table with those sorting columns calculated. But I am afraid that could be a performance nightmare as my table grows bigger.

Does any one have an idea on how that would affect performance?

Is it possible to create index on a calculated column in a view ?


UPDATE 1: I am planning on using postgresql, but I am open to other opensource alternatives like MySQL


UPDATE 2: as N.B. suggested:

I'm not a Postgres user, but the docs here are showing how to create that view and how to index it. If you're using Postgres and are familiar with it - stick with it. All databases work nearly the same, but if you're more proficient with one - no reason to change it. As for how it affects the performance - be it a view or a query that you construct dynamically - it's the same thing. View is just a huge help when querying, and if you can index it it means some memory will be spent on index. You have to measure

I am thinking now that materialized views are the way to go for my functional requirements, I can setup a trigger to refresh the Materialized View on each and every update on my table once I confirm this point:

  • How does REFRESH MATERIALIZED VIEW work ? does it drop the data and recreate the view from scratch ? or does it do some kind of differential refresh ?

Solution

  • Disclaimer: I have used both MySQL and PostgreSQL Database on a remote server for about 8 months only, and I have a preference for PostgreSQL for your use case.

    TL;DR

    1. According to the documentation, REFRESH MATERIALIZED VIEW command will drop all data and re-populate the entire query's data if you add the WITH DATA clause.
    2. You can create indexes for materialized view. The index could be on the calculated fields that are stored in the columns.
    3. You cannot index a view (non-materialized)
    4. You can create different types of materialized views depending on your needs (see URL link below).

    Long Explanation

    A) Materialized Views types and performance

    I have a requirement where I have a table with multiple columns, and I need to calculate a sorting column in order to simplify my queries. I am thinking of creating a view for my origin table with those sorting columns calculated. But I am afraid that could be a performance nightmare as my table grows bigger.


    If the calculations are very expensive, consider consuming more memory to store the results in materialized views or tables.

    A materialized view is like a table that stores the result of a query. In the case of PostgreSQL materialized view, indexes can be created on it to speed up queries and it can be vacuumed to update the meta-data.

    The materialized view that PostgreSQL provides is a naive one because you must manually refresh the data with REFRESH MATERIALIZED VIEW command. According to the documentation, this will drop all data and re-populate the data if you add the WITH DATA clause.

    After that, you need to consider the performance needed for insert, update, delete operations:

    1. If you have no real-time requirements (i.e. a full table re-population is acceptable) then this option is fine.
    2. Else, you might want to see this website post for different setup of materialized views, some of which allows for lazy refresh of data (trigger refresh data by rows) https://hashrocket.com/blog/posts/materialized-view-strategies-using-postgresql


    The second point also applies to MySQL as well (and is actually the traditional and customized way of building materialized views). To my knowledge, MySQL does not support materialized views out-of-the-box (require plugins). The convenience provided in (1) is one of the reasons why I chose PostgreSQL.

    Is it possible to create index on a calculated column in a view ?

    It is possible to index the columns of a materialized view, just as you do for a table.

    B) Window functions in PostgreSQL

    The second reason for choosing PostgreSQL over MySQL is because the former provides extended-SQL functions (or I would like to call them OLAP functions) that help with complex queries like ranking of rows and so on.

    I shall leave it to you to explore this option (just do a Google Search on "PostgreSQL Window Functions").

    According to my latest knowledge, MySQL has no built in support for this (maybe rely on plugins or own coding?).