Search code examples
postgresqlamazon-web-servicesquery-optimizationamazon-rds

AWS RDS Postgres Performance Slow Data Transfer


I am experiencing slow (2-3min) returns from a simple query on an AWS RDS postgres materialized view. My working hypothesis is that this is a network transfer issue but I am not an expert in postgres. Some background on the setup:

  • "PostgreSQL 10.21 on x86_64-pc-linux-gnu, compiled by x86_64-pc-linux-gnu-gcc (GCC) 7.4.0, 64-bit",
  • Running this in Serverless mode on RDS, 16 Aurora Units
  • This process is run through an AWS VPN Connection
  • The table in question has about 50M rows, 30 columns and is currently 20GB in total size
  • I have placed an multicolumn index on the view (2 columns of interest). One of these "COMPOUND" has ~100K unique values and approx 1000 rows per compound, the other "GENE" has ~500 unique values with ranges 100K-2M rows per gene

When I run:

SELECT * FROM my_materialized_view where "COMPOUND" = 'cpd1234';

The query returns ~3-5seconds.

When I run:

SELECT * FROM my_materialized_view where "GENE" = 'geneXYZ';

The query takes ~2min to return 1M rows. The size of the returned data appears to 100-200MB in size.

Doing something slightly more "exotic":

SELECT * FROM my_materialized_view where "GENE" = 'geneXYZ' ORDER BY 'measured_value' LIMIT 10000; 

returns data in about 12sec.

I also ran:

EXPLAIN ANALYZE SELECT * FROM my_materialized_view where "GENE" = 'geneXYZ';

the output seemed to indicate the process was about 500ms

"Index Scan using my_materialized_view_idx on my_materialized_view (cost=0.56..1415484.25 rows=922393 width=336) (actual time=0.039..437.332 rows=979434 loops=1)" " Index Cond: (""GENE"" = 'geneXYZ'::text)" "Planning time: 0.092 ms" "Execution time: 491.306 ms"

Am I reading this wrong or is the issue with the data transfer/network latency? I am using pgadmin for the queries, but also ran the query via python boto3 and equivalent times are realized. Appreciate your thoughts on this. Thanks!


Solution

  • Two facts about your situation make it clear that you do have slow data transfer, not a slow RDBMS server.

    1. You have an index on that GENE column, and your query plan says postgreSQL is using it.
    2. Your second query, with its ORDER BY some_unindexed_column LIMIT some_number, burdens postgreSQL with a sort. It returns about 1% of the data of your first query, and completes ten times faster. That means the sort costs something, but the transfer also costs something.