I have logs from various linux servers being fed by rsyslog to a PostgreSQL database. The incoming timestamp is an rsyslog'd RFC3339 formatted time like so: 2020-10-12T12:01:18.162329+02:00
.
In the original test setup of the database logging table, I created that timestamp field as 'text'. Most things I need parsed are working right, so I was hoping to convert that timestamp table column from text to a timestamp datatype (and retain the subseconds and timezone if possible).
The end result should be a timestamp datatype so that I can do date-range queries using PostgreSQL data functions.
Is this doable in PostgreSQL 11? Or is it just better to re-create the table with the correct timestamp column datatype to begin with?
Thanks in advance for any pointers, advice, places to look, or snippets of code.
Relevant rsyslog config:
$template CustomFormat,"%timegenerated:::date-rfc3339% %syslogseverity-text:::uppercase% %hostname% %syslogtag% %msg%\n"
$ActionFileDefaultTemplate CustomFormat
...
template(name="rsyslog" type="list" option.sql="on") {
constant(value="INSERT INTO log (timestamp, severity, hostname, syslogtag, message)
values ('")
property(name="timegenerated" dateFormat="rfc3339") constant(value="','")
property(name="syslogseverity-text" caseConversion="upper") constant(value="','")
property(name="hostname") constant(value="','")
property(name="syslogtag") constant(value="','")
property(name="msg") constant(value="')")
}
and the log table structure:
CREATE TABLE public.log
(
id integer NOT NULL DEFAULT nextval('log_id_seq'::regclass),
"timestamp" text COLLATE pg_catalog."default" DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP),
severity character varying(10) COLLATE pg_catalog."default",
hostname character varying(20) COLLATE pg_catalog."default",
syslogtag character varying(24) COLLATE pg_catalog."default",
program character varying(24) COLLATE pg_catalog."default",
process text COLLATE pg_catalog."default",
message text COLLATE pg_catalog."default",
CONSTRAINT log_pkey PRIMARY KEY (id)
)
some sample data already fed into the table (ignore the timestamps in the messsage, they are done with an independent handmade logging system by my predecessor):
You can in theory convert the TEXT
column to TIMESTAMP WITH TIME ZONE
with ALTER TABLE .. ALTER COLUMN ... SET DATA TYPE ... USING
, e.g.:
postgres=# CREATE TABLE tstest (tsval TEXT NOT NULL);
CREATE TABLE
postgres=# INSERT INTO tstest values('2020-10-12T12:01:18.162329+02:00');
INSERT 0 1
postgres=# ALTER TABLE tstest
ALTER COLUMN tsval SET DATA TYPE TIMESTAMP WITH TIME ZONE
USING tsval::TIMESTAMPTZ;
ALTER TABLE
postgres=# \d tstest
Table "public.tstest"
Column | Type | Collation | Nullable | Default
--------+--------------------------+-----------+----------+---------
tsval | timestamp with time zone | | not null |
postgres=# SELECT * FROM tstest ;
tsval
-------------------------------
2020-10-12 12:01:18.162329+02
(1 row)
PostgreSQL can parse the RFC3339
format, so subsequent inserts should just work:
postgres=# INSERT INTO tstest values('2020-10-12T12:01:18.162329+02:00');
INSERT 0 1
postgres=# SELECT * FROM tstest ;
tsval
-------------------------------
2020-10-12 12:01:18.162329+02
2020-10-12 12:01:18.162329+02
(2 rows)
But note that any bad data in the table (i.e. values which cannot be parsed as timestamps) will cause the ALTER TABLE
operation to fail, so you should consider verifying the values before converting the data. Something like SELECT "timestamp"::TIMESTAMPTZ FROM public.log
would fail with an error like invalid input syntax for type timestamp with time zone: "somebadvalue"
.
Also bear in mind this kind of ALTER TABLE
requires a table rewrite which may take some time to complete (depending on how large the table is), and which requires a ACCESS EXCLUSIVE
lock, rendering the table inaccessible for the duration of the operation.
If you want to avoid a long-running ACCESS EXCLUSIVE
lock, you could probably do something like this (not tested):
TIMESTAMPTZ
column (adding a column doesn't rewrite the table and is fairly cheap provided you don't use a volatile default value)UPDATE public.foo SET newlog = log::TIMESTAMPTZ