Search code examples
postgresqlpsqlproceduretimescaledb

PostgreSQL: Segmentation Fault when calling procedure with temporary tables


I apologize for the lack of a better title. I'm at a loss as to what the exact problem could be.

PostgreSQL 13.4
TimescaleDB 2.4.2
Ubuntu 18.04.6 LTS Running in Docker (Dockerfile further down)
shared_memory 16GB
shm size 2GB

The query at the bottom causes postgres to shut down with the error:

 2022-05-09 15:17:42.012 UTC [1] LOG:  server process (PID 1316) was terminated by signal 11: Segmentation fault
 2022-05-09 15:17:42.012 UTC [1] DETAIL:  Failed process was running: CALL process_wifi_traffic_for_range('2022-01-10','2022-01-12')
 2022-05-09 15:17:42.012 UTC [1] LOG:  terminating any other active server processes
 2022-05-09 15:17:42.013 UTC [654] WARNING:  terminating connection because of crash of another server process
 2022-05-09 15:17:42.013 UTC [654] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 2022-05-09 15:17:42.013 UTC [654] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

I want to process traffic data day by day, since my data lake is very big. First I load the day into the temporary table unprocessed. Then I extract the difference between the traffic reported the record before and now since the source only gives the total traffic accrued. The tables wifi_traffic_last_traffic, t_last_traffic are just to keep track of the last known traffic per client form the last time the procedure was run. ht_wifi_traffic_processed is a timescaledb hypertable, but the error also occurs when I use a normal table. It also does not help to reduce the timeframe to 1 hour instead of one day in case of memory issues. Sometimes it manages to do 1 or 2 days and the data that manages to finish is correct.

The query is a bit long, but I don't want to omit anything:

DECLARE
    f_date date = from_date::date;
    t_date date = to_date::date;
BEGIN
    SET SESSION temp_buffers = '1GB';
    
    CREATE TEMPORARY TABLE t_last_traffic (
        clientid text,
        devicecpuid text,
        traffic bigint,
        PRIMARY KEY(clientid,devicecpuid)
    );
    
    INSERT INTO t_last_traffic (
        SELECT * FROM wifi_traffic_last_traffic
    );
    
    CREATE TEMPORARY TABLE unprocessed (
        "timestamp" timestamp,
        clientid text,
        devicecpuid text,
        customer text,
        traffic bigint
    ) ON COMMIT DELETE ROWS;
    
    CREATE TEMPORARY TABLE processed (
        "timestamp" timestamp,
        clientid text,
        devicecpuid text,
        customer text,
        traffic bigint,
        PRIMARY KEY("timestamp", clientid, devicecpuid)
    ) ON COMMIT DELETE ROWS;
    
    LOOP
        RAISE NOTICE 'Processing date: %', f_date;
    
        INSERT INTO unprocessed
        SELECT wt."timestamp", wt.clientid, wt.devicecpuid, wt.customer, wt.traffic 
        FROM wifi_traffic AS wt
        WHERE wt."timestamp"
              BETWEEN
                      f_date::timestamp
                  AND
                      f_date::timestamp + INTERVAL '1 day'
        ORDER BY
            devicecpuid ASC, --Important to sort by cpuID first as to not mix traffic results.
            clientid ASC,
            wt."timestamp" ASC;
        
        RAISE NOTICE 'Unprocessed import done';
        
        INSERT INTO processed
            SELECT * FROM (
                SELECT
                    up."timestamp",
                    up.clientid,
                    up.devicecpuid,
                    up.customer,
                    wifi_traffic_lag(
                        up.traffic, 
                        lag(
                            up.traffic, 
                            1, 
                            COALESCE(
                                (
                                    SELECT lt.traffic
                                    FROM t_last_traffic lt
                                    WHERE 
                                        lt.clientid = up.clientid
                                    AND
                                        lt.devicecpuid = up.devicecpuid
                                    FETCH FIRST ROW ONLY
                                ),
                                CAST(0 AS bigint)
                            )
                        )   
                        OVER (
                            PARTITION BY 
                                up.clientid, 
                                up.devicecpuid 
                            ORDER BY 
                                up.clientid, 
                                up.devicecpuid, 
                                up."timestamp"
                        )
                    ) AS traffic
                FROM unprocessed up
                WHERE
                    up.traffic != 0
            ) filtered
            WHERE
                filtered.traffic > 0
        ON CONFLICT ON CONSTRAINT processed_pkey DO NOTHING;
        
            
        RAISE NOTICE 'Processed import done';
            
            
        INSERT INTO t_last_traffic(devicecpuid, clientid, traffic)
            SELECT up.devicecpuid, up.clientid, MAX(up.traffic)
            FROM unprocessed up
            GROUP BY up.devicecpuid, up.clientid
        ON CONFLICT ON CONSTRAINT t_last_traffic_pkey DO UPDATE SET
            traffic = EXCLUDED.traffic;
            
        INSERT INTO ht_wifi_traffic_processed
        SELECT * FROM processed;
        
        TRUNCATE TABLE unprocessed;
        TRUNCATE TABLE processed;
        
        COMMIT;
        
        RAISE NOTICE 'Finished processing for date: %', f_date;
        
        f_date = f_date + 1;
        EXIT WHEN f_date > t_date;
    END LOOP;
    
    INSERT INTO wifi_traffic_last_traffic
    SELECT * FROM t_last_traffic
    ON CONFLICT ON CONSTRAINT wifi_traffic_last_traffic_pkey DO UPDATE SET
        traffic = EXCLUDED.traffic;
        
    DROP TABLE t_last_traffic;
    DROP TABLE unprocessed;
    DROP TABLE processed;
    
    COMMIT;
END

Docker Compose:

services:
  postgres-storage:
    image: <redacted>/postgres_gis_tdb:pg13_tdb2.4.2_gis3.1.4
    restart: unless-stopped
    shm_size: 2gb
    ports:
      - '5433:5432'
    networks:
      - bigdata
    volumes:
      - /mnt/data_storage/psql_data:/var/lib/postgresql/data
      - /usr/docker-volumes/postgres-storage:/var/lib/postgresql/ssd
    environment:
      POSTGRES_USER: postgres
    env_file:
      - .env

Dockerfile:

FROM postgres:13

ENV POSTGIS_MAJOR 3
ENV POSTGIS_VERSION 3.1.4+dfsg-1.pgdg100+1


### INSTALL POSTGIS ###

RUN apt-get update \
      && apt-cache showpkg postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR \
      && apt-get install -y --no-install-recommends \
           postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR=$POSTGIS_VERSION \
           postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR-scripts \
      && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /docker-entrypoint-initdb.d
COPY ./initdb-postgis.sh /docker-entrypoint-initdb.d/10_postgis.sh
COPY ./update-postgis.sh /usr/local/bin


###         INSTALL TIMESCALEDB           ###
# Important: Run timescaledb-tune           #
# once for COMPLETELY NEW DATABASES,        #
# so no existing postgresql_data.           #
###                                       ###
RUN apt-get update \
      && apt-get install -y postgresql-common wget lsb-release
RUN echo "yes" | sh /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
RUN sh -c "echo 'deb [signed-by=/usr/share/keyrings/timescale.keyring] https://packagecloud.io/timescale/timescaledb/debian/ $(ls$
RUN wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey |  gpg --dearmor -o /usr/share/keyrings/timescale.keyr$
RUN apt-get update \
      && apt-get install -y timescaledb-2-postgresql-13 timescaledb-tools

Solution

  • It seems like the error comes from doing

    CREATE TEMPORARY TABLE unprocessed (
            "timestamp" timestamp,
            clientid text,
            devicecpuid text,
            customer text,
            traffic bigint
        ) ON COMMIT DELETE ROWS; 
    

    As well as:

    TRUNCATE TABLE unprocessed;
    

    I did this initially because a test indicated the the ON COMMIT DELETE ROWS wasn't really clearing the table after the COMMIT in the middle of the procedure. Leaving it out prevented the error from occuring and further tests showed that even without it the data was as expected. It seems to be some sort of race condition. I will post this in the postgres github as well.