I apologize for the lack of a better title. I'm at a loss as to what the exact problem could be.
PostgreSQL 13.4
TimescaleDB 2.4.2
Ubuntu 18.04.6 LTS
Running in Docker (Dockerfile further down)
shared_memory 16GB
shm size 2GB
The query at the bottom causes postgres to shut down with the error:
2022-05-09 15:17:42.012 UTC [1] LOG: server process (PID 1316) was terminated by signal 11: Segmentation fault
2022-05-09 15:17:42.012 UTC [1] DETAIL: Failed process was running: CALL process_wifi_traffic_for_range('2022-01-10','2022-01-12')
2022-05-09 15:17:42.012 UTC [1] LOG: terminating any other active server processes
2022-05-09 15:17:42.013 UTC [654] WARNING: terminating connection because of crash of another server process
2022-05-09 15:17:42.013 UTC [654] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2022-05-09 15:17:42.013 UTC [654] HINT: In a moment you should be able to reconnect to the database and repeat your command.
I want to process traffic data day by day, since my data lake is very big. First I load the day into the temporary table unprocessed. Then I extract the difference between the traffic reported the record before and now since the source only gives the total traffic accrued. The tables wifi_traffic_last_traffic, t_last_traffic are just to keep track of the last known traffic per client form the last time the procedure was run. ht_wifi_traffic_processed is a timescaledb hypertable, but the error also occurs when I use a normal table. It also does not help to reduce the timeframe to 1 hour instead of one day in case of memory issues. Sometimes it manages to do 1 or 2 days and the data that manages to finish is correct.
The query is a bit long, but I don't want to omit anything:
DECLARE
f_date date = from_date::date;
t_date date = to_date::date;
BEGIN
SET SESSION temp_buffers = '1GB';
CREATE TEMPORARY TABLE t_last_traffic (
clientid text,
devicecpuid text,
traffic bigint,
PRIMARY KEY(clientid,devicecpuid)
);
INSERT INTO t_last_traffic (
SELECT * FROM wifi_traffic_last_traffic
);
CREATE TEMPORARY TABLE unprocessed (
"timestamp" timestamp,
clientid text,
devicecpuid text,
customer text,
traffic bigint
) ON COMMIT DELETE ROWS;
CREATE TEMPORARY TABLE processed (
"timestamp" timestamp,
clientid text,
devicecpuid text,
customer text,
traffic bigint,
PRIMARY KEY("timestamp", clientid, devicecpuid)
) ON COMMIT DELETE ROWS;
LOOP
RAISE NOTICE 'Processing date: %', f_date;
INSERT INTO unprocessed
SELECT wt."timestamp", wt.clientid, wt.devicecpuid, wt.customer, wt.traffic
FROM wifi_traffic AS wt
WHERE wt."timestamp"
BETWEEN
f_date::timestamp
AND
f_date::timestamp + INTERVAL '1 day'
ORDER BY
devicecpuid ASC, --Important to sort by cpuID first as to not mix traffic results.
clientid ASC,
wt."timestamp" ASC;
RAISE NOTICE 'Unprocessed import done';
INSERT INTO processed
SELECT * FROM (
SELECT
up."timestamp",
up.clientid,
up.devicecpuid,
up.customer,
wifi_traffic_lag(
up.traffic,
lag(
up.traffic,
1,
COALESCE(
(
SELECT lt.traffic
FROM t_last_traffic lt
WHERE
lt.clientid = up.clientid
AND
lt.devicecpuid = up.devicecpuid
FETCH FIRST ROW ONLY
),
CAST(0 AS bigint)
)
)
OVER (
PARTITION BY
up.clientid,
up.devicecpuid
ORDER BY
up.clientid,
up.devicecpuid,
up."timestamp"
)
) AS traffic
FROM unprocessed up
WHERE
up.traffic != 0
) filtered
WHERE
filtered.traffic > 0
ON CONFLICT ON CONSTRAINT processed_pkey DO NOTHING;
RAISE NOTICE 'Processed import done';
INSERT INTO t_last_traffic(devicecpuid, clientid, traffic)
SELECT up.devicecpuid, up.clientid, MAX(up.traffic)
FROM unprocessed up
GROUP BY up.devicecpuid, up.clientid
ON CONFLICT ON CONSTRAINT t_last_traffic_pkey DO UPDATE SET
traffic = EXCLUDED.traffic;
INSERT INTO ht_wifi_traffic_processed
SELECT * FROM processed;
TRUNCATE TABLE unprocessed;
TRUNCATE TABLE processed;
COMMIT;
RAISE NOTICE 'Finished processing for date: %', f_date;
f_date = f_date + 1;
EXIT WHEN f_date > t_date;
END LOOP;
INSERT INTO wifi_traffic_last_traffic
SELECT * FROM t_last_traffic
ON CONFLICT ON CONSTRAINT wifi_traffic_last_traffic_pkey DO UPDATE SET
traffic = EXCLUDED.traffic;
DROP TABLE t_last_traffic;
DROP TABLE unprocessed;
DROP TABLE processed;
COMMIT;
END
Docker Compose:
services:
postgres-storage:
image: <redacted>/postgres_gis_tdb:pg13_tdb2.4.2_gis3.1.4
restart: unless-stopped
shm_size: 2gb
ports:
- '5433:5432'
networks:
- bigdata
volumes:
- /mnt/data_storage/psql_data:/var/lib/postgresql/data
- /usr/docker-volumes/postgres-storage:/var/lib/postgresql/ssd
environment:
POSTGRES_USER: postgres
env_file:
- .env
Dockerfile:
FROM postgres:13
ENV POSTGIS_MAJOR 3
ENV POSTGIS_VERSION 3.1.4+dfsg-1.pgdg100+1
### INSTALL POSTGIS ###
RUN apt-get update \
&& apt-cache showpkg postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR \
&& apt-get install -y --no-install-recommends \
postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR=$POSTGIS_VERSION \
postgresql-$PG_MAJOR-postgis-$POSTGIS_MAJOR-scripts \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir -p /docker-entrypoint-initdb.d
COPY ./initdb-postgis.sh /docker-entrypoint-initdb.d/10_postgis.sh
COPY ./update-postgis.sh /usr/local/bin
### INSTALL TIMESCALEDB ###
# Important: Run timescaledb-tune #
# once for COMPLETELY NEW DATABASES, #
# so no existing postgresql_data. #
### ###
RUN apt-get update \
&& apt-get install -y postgresql-common wget lsb-release
RUN echo "yes" | sh /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
RUN sh -c "echo 'deb [signed-by=/usr/share/keyrings/timescale.keyring] https://packagecloud.io/timescale/timescaledb/debian/ $(ls$
RUN wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | gpg --dearmor -o /usr/share/keyrings/timescale.keyr$
RUN apt-get update \
&& apt-get install -y timescaledb-2-postgresql-13 timescaledb-tools
It seems like the error comes from doing
CREATE TEMPORARY TABLE unprocessed (
"timestamp" timestamp,
clientid text,
devicecpuid text,
customer text,
traffic bigint
) ON COMMIT DELETE ROWS;
As well as:
TRUNCATE TABLE unprocessed;
I did this initially because a test indicated the the ON COMMIT DELETE ROWS
wasn't really clearing the table after the COMMIT
in the middle of the procedure.
Leaving it out prevented the error from occuring and further tests showed that even without it the data was as expected. It seems to be some sort of race condition.
I will post this in the postgres github as well.