I am trying to use a modified greenplum open source version for development. The greenplum version is Greenplum Database 6.0.0-beta.1 build dev (based on PostgreSQL 9.4.24).
I wanted to add pg_stat_statements extension to my database, and I did manage to install it on the database, following https://www.postgresql.org/docs/9.5/pgstatstatements.html. However, this extension doesn't work as expected. It only records non-plannable queries and utility queries. For all plannable queries which modify my tables, there is not a single record.
My question is, is pg_stat_statements compatible with greenplum? Since I am not using the official release, I would like to make sure the original one can work with pg_stat_statements. If so, how can I use it to track all sql queries in greeplum? Thanks.
Below is a example of not recording my select query.
postgres=# select pg_stat_statements_reset();
pg_stat_statements_reset
--------------------------
(1 row)
postgres=# select query from pg_stat_statements;
query
------------------------------------
select pg_stat_statements_reset();
(1 row)
postgres=# select * from test;
id | num
----+-----
1 | 2
3 | 4
(2 rows)
postgres=# select query from pg_stat_statements;
query
------------------------------------
select pg_stat_statements_reset();
(1 row)
Here's what I get from @leskin-in in greenplum slack:
I suppose this is currently not possible. When Greenplum executes "normal" queries, each of Greenplum segments acts as an independent PostgreSQL instance which executes a plan created by GPDB master. pg_stat_statements tracks resources usage for a single PostgreSQL instance; as a result, in GPDB it is able to track the resources consumption for each segment independently. There are several complications PostgreSQL
pg_stat_statements
does not deal with. An example is that GPDB uses slices. On GPDB segments, these are independent parts of plan trees and are executed as independent queries. I suppose when a query to pg_stat_statemens is made on GPDB master in current version, the results retrieved are for master only. As "normal" queries are executed in most part by segments, the results are inconsistent with actual resources' consumption by the query. In open-source Greenplum 5 & 6 there is a Greenplum-specific utility gpperfmon. It provides some of the pg_stat_statements features, and is cluster-aware (shows actual resources consumption for the cluster as a whole, and also several cluster-specific metrics).