I'm using psql
13.4 to make a pg_dump
of a database. I need only the schema (no data) for all tables except one enumerated reference table, for which I need the data. I see that pg_dump
has an --exclude-table-data
flag that supports regex patterns. But it does not have an --include-table-data
flag.
So to exclude table data for all tables except the single reference table, I thought I could use a regex with negative look ahead. For example this one: ^(?!.*(reference_table_name))
. However this doesn't seem to work due to psql's special character limitations (e.g. on *
, ?
, and .
) described here. I've tried to make the substitutions it suggests but it doesn't work.
Is what I'm trying to do even possible?
FWIW, this actually can be done with the restricted regular expressions available in this context – but whether you’d want to actually do this is another matter, because pretty it ain’t.
Namely, to exclude the data of all tables except sometable
, you have to write this:
(|[^s]*|s(|[^o]*|o(|[^m]*|m(|[^e]*|e(|[^t]*|t(|[^a]*|a(|[^b]*|b(|[^l]*|l(|[^e]*|e?*)))))))))
How does this monstrosity come about? The key to understand it is that it is a composition of copies of the following basic pattern:
(|[^s]*|s*)
This alternation will match one of the following:
So by itself, this construct is equivalent to *
and just matches anything. However, by separating the cases, it creates a hook for conditions to be added to any of them. (Any mathematicians reading this should instantly recognize this move.)
Now let’s vary the the positive case slightly:
(|[^e]*|e?*)
Now, the positive case matches the character “e” only if followed by at least one more character. So this pattern will match almost anything: anything except the string e
.
Now we take two of these patterns, one of each:
(|[^l]*|l*)
(|[^e]*|e?*)
… and we restrict the positive case of the first pattern by replacing its *
with the second pattern, i.e. we take “an ‘l’ followed by anything” and replace “anything” with “anything except just an ‘e’”:
(|[^l]*|l(|[^e]*|e?*))
So now we have a pattern which will, in total, match one the following things:
… which means this pattern matches any string except le
.
Now we iterate by taking another instance of the basic pattern:
(|[^b]*|b*)
(|[^l]*|l(|[^e]*|e?*))
… and repeating the same substitution:
(|[^b]*|b(|[^l]*|l(|[^e]*|e?*)))
By the same token as previously, this matches any string except ble
.
So. Let’s start over with a whole series of copies of the same pattern:
(|[^s]*|s*)
(|[^o]*|o*)
(|[^m]*|m*)
(|[^e]*|e*)
(|[^t]*|t*)
(|[^a]*|a*)
(|[^b]*|b*)
(|[^l]*|l*)
(|[^e]*|e?*)
Folding these together yields the pattern I gave at the start – which therefore is a pattern that matches any string except sometable
.
This is how you can do it. It’s up to you whether you’d actually want to. Maybe putting a link to this answer in a comment next to the code will make it more justifiable…