Search code examples
postgresqldockerrestoredatabase-backups

Cannot restore data from pg_dump due to blank strings being treated as nonexistant data


I have a database currently on a PostgreSQL 9.3.9 server that I am backing up with pgdump in the simplest possible fashion, eg pg_dump orb > mar_9_2018.db.

One of those tables (linktags) has the following definition:

CREATE TABLE linktags (
    linktagid integer NOT NULL,
    linkid integer,
    tagval character varying(1000)
);

When attempting to restore the database on PostgreSQL 11.2 via

cat mar_9_2018.db | docker exec -i pg-docker psql -U postgres

(docker container restore) the table returns empty because of the following error -

ERROR:  missing data for column "tagval"
CONTEXT:  COPY linktags, line 737: "1185    9325"
 setval

I checked the db file and found that there are missing tabs where I would expect some sort of information, and clearly the restore process does as well.

demonstration of tabs missing with regular expressions

I also verified that the value in the database is a blank string.

So -

  • Is there an idiomatic method to backup and restore a postgres database I am missing?
  • Is my version old enough that this version of pg_dump should have some special considerations? Am I just restoring this wrong?

Edit: I did some further research and found that I was incorrect in the original checking of NULLs, it was instead blank strings that are causing the issue.

If I make an example table with null strings and then blank strings, I can see the NULLs get a newline but the blank does not null values showing a \n and blanks showing no \n


Solution

  • pg_dump has an option to use INSERT instead of COPY

    pg_dump -d db_name --inserts
    

    as the manual warns, it might make restoration slow (and much larger dump file). Even in case of some inconsistencies tables will be filled with valid rows.

    Another problem is with empty tables, pg_dump generates empty copy statement like:

    COPY config (key, value) FROM stdin;
    \.
    

    in this case you'll be getting errors on reimport like:

     ERROR:  invalid input syntax for type smallint: "  "
    CONTEXT:  COPY config, line 1, column group: "  "
    

    which doesn't happen with --insert option (no insert statement is generated).