Search code examples
postgresqlwso2wso2-dasapi-manager

WSO2 DAS Event insertion error (duplicate key value violates unique constraint) with PostgreSQL


We have encountered an error in the DAS (WSO2) using a PostgreSQL database. For some reason that we do not know is filling the PostgreSQL log with some traces:

< 2017-02-23 14:04:52.276 CET >ERROR: relation "anx__8gohuwju" does not exist at character 15
< 2017-02-23 14:04:52.276 CET >STATEMENT: SELECT 1 FROM ANX__8GOHUwJU LIMIT 1
< 2017-02-23 14:04:52.277 CET >ERROR: relation "anx__8gaxmlni" does not exist at character 15
< 2017-02-23 14:04:52.277 CET >STATEMENT: SELECT 1 FROM ANX__8GAXMlnI LIMIT 1
< 2017-02-23 14:04:52.278 CET >ERROR: relation "anx__8gicci_y" does not exist at character 15
< 2017-02-23 14:04:52.278 CET >STATEMENT: SELECT 1 FROM ANX__8GICCI_Y LIMIT 1
...
...
...
< 2017-02-23 14:04:57.335 CET >STATEMENT: INSERT INTO ANX__7LvafTvw (partition_key, timestamp, data, record_id) VALUES ($1, $2, $3, $4)
< 2017-02-23 14:04:57.337 CET >ERROR: duplicate key value violates unique constraint "anx__7lvaftvw_pkey"
< 2017-02-23 14:04:57.337 CET >DETAIL: Key (record_id)=(770425a8-b65b-37ad-ad13-6bd45014ef9a) already exists.
< 2017-02-23 14:04:57.337 CET >STATEMENT: INSERT INTO ANX__7LvafTvw (partition_key, timestamp, data, record_id) VALUES ($1, $2, $3, $4)
< 2017-02-23 14:04:57.431 CET >ERROR: duplicate key value violates unique constraint "anx__7lvaftvw_pkey"
< 2017-02-23 14:04:57.431 CET >DETAIL: Key (record_id)=(da6cfb07-4aff-3bb2-9c71-8f3e6d749762) already exists.
< 2017-02-23 14:04:57.431 CET >STATEMENT: INSERT INTO ANX__7LvafTvw (partition_key, timestamp, data, record_id) VALUES ($1, $2, $3, $4)
< 2017-02-23 14:04:57.433 CET >ERROR: duplicate key value violates unique constraint "anx__7lvaftvw_pkey"
< 2017-02-23 14:04:57.433 CET >DETAIL: Key (record_id)=(da6cfb07-4aff-3bb2-9c71-8f3e6d749762) already exists.
< 2017-02-23 14:04:57.433 CET >STATEMENT: INSERT INTO ANX__7LvafTvw (partition_key, timestamp, data, record_id) VALUES ($1, $2, $3, $4)
< 2017-02-23 14:04:57.776 CET >ERROR: duplicate key value violates unique constraint "anx__7lvaftvw_pkey"
< 2017-02-23 14:04:57.776 CET >DETAIL: Key (record_id)=(c7c32813-51c6-34dd-8a34-e76add9839b6) already exists.
< 2017-02-23 14:04:57.776 CET >STATEMENT: INSERT INTO ANX__7LvafTvw (partition_key, timestamp, data, record_id) VALUES ($1, $2, $3, $4)*

We don't know because it's occurring and if it's a bug or compatibility failure between DAS and PostgreSQL. The traces of error that appear in the log are repeated every 5 seconds without stopping filling the logs of the database and causing the system stop working.

As you can see there are two types of errors:

  • "duplicate key value violates unique constraint XXX"
  • "relation XXX does not exist at character 15"

We have tested it with several instances of DAS and different PostgresSQL and the result is the same. what's going on?

I have configured the DAS to use a PostgreSQL database for both configuration (master-datasources.xml) and events (analytics-datasources.xml). The tests we have done have been using a PostgreSQL 9.3 and 9.6 with the same result.


Solution

  • Despite the alarming nature of the logs you're seeing, there is no unexpected situation occurring between DAS and PostgreSQL as far as I could tell.

    Both these logs are expected since for data access operations, DAS does a check to see if the table exists before committing. This usually produces an error on the database-side if the table did not exist. This is normal.

    In addition, INSERT OVERWRITE operations on records will be attempted as an INSERT, and then if the record already exists (i.e. unique constraint violation), the record will be updated instead. This is due to the unavailability of a MERGE statement in PostgreSQL prior to version 9.5, and due to the relative performance shortcomings of doing UPDATE calls all of the time.

    As for your solution, uncommenting "recordMergeQuery" is perfectly acceptable, but only if you're using PostgreSQL 9.5+. Otherwise, the write operations will fail. In this occasion, we use the new INSERT ... ON CONFLICT UPDATE syntax offered by Postgres (see [2]).

    If you need any further clarification, please feel free to ask.

    Links:
    [1] https://wiki.postgresql.org/wiki/SQL_MERGE
    [2] https://github.com/wso2/carbon-analytics/blob/v1.3.6/features/analytics-data-connectors/org.wso2.carbon.analytics.datasource.rdbms.server.feature/src/main/resources/conf/analytics/rdbms-config.xml#L152