Tags: postgresql, documentation, data-warehouse

How to approach data warehouse (PostgreSQL) documentation?


We have a small data warehouse in a PostgreSQL database, and I have to document all of its tables.

I thought I could add a comment to every column and table, using a pipe "|" separator to pack several attributes into each comment. Then I could use the information schema and array functions to extract the documentation, and feed it to any reporting software to produce the desired output.

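-- Column documentation for every table in the data_warehouse schema.
-- Each comment is expected to hold: cs_name|cs_description|en_name|en_description|other
select
    columns.table_name,
    columns.ordinal_position,
    columns.column_name,
    columns.data_type,
    columns.character_maximum_length,
    columns.numeric_precision,
    columns.numeric_scale,
    columns.is_nullable,
    columns.column_default,
    (string_to_array(descr.description,'|'))[1] as cs_name,
    (string_to_array(descr.description,'|'))[2] as cs_description,
    (string_to_array(descr.description,'|'))[3] as en_name,
    (string_to_array(descr.description,'|'))[4] as en_description,
    (string_to_array(descr.description,'|'))[5] as other
from
    information_schema.columns columns
    -- match on schema as well as name, so same-named tables in other schemas are not picked up
    join pg_catalog.pg_namespace nsp on (nsp.nspname = columns.table_schema)
    join pg_catalog.pg_class klass on (klass.relname = columns.table_name and klass.relnamespace = nsp.oid and klass.relkind = 'r')
    -- classoid restricts the comments to those attached to tables and their columns
    left join pg_catalog.pg_description descr on (descr.objoid = klass.oid and descr.classoid = 'pg_catalog.pg_class'::regclass and descr.objsubid = columns.ordinal_position)
where
    columns.table_schema = 'data_warehouse'
order by
    columns.table_name,
    columns.ordinal_position;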

Is this a good idea, or is there a better approach?


Solution

  • Unless you must include descriptions of the system tables, I wouldn't try to shoehorn your descriptions into pg_catalog.pg_description. Make your own table. That way you get to keep each attribute in its own column, and you don't have to use clunky string functions.

    Alternatively, consider adding specially formatted comments to your master schema file, along the lines of javadoc. Then write a tool to extract those comments and create a document. That way the comments stay close to the thing they're commenting, and you don't have to mess with the database at all to produce the report. For example:

    --* Used for authentication.
    create table users
    (
      --* standard Rails-friendly primary key.  Also an example of
      --* a long comment placed before the item, rather than on
      --* the same line.
      id serial primary key,
      name text not null,     --* Real name (hopefully)
      login text not null,    --* Name used for authentication
      ...
    );
    

    Your documentation tool reads the file, looks for the --* comments, figures out what comments go with what things, and produces some kind of report, e.g.:

    table users: Used for authentication
      id: standard Rails-friendly primary key.  Also an example of a
          long comment placed before the item, rather than on the same
          line.
      name: Real name (hopefully)
      login: Name used for authentication
    

    You might note that with appropriate comments, the master schema file itself is a pretty good report in its own right, and that perhaps nothing else is needed.
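
The extraction tool described above could look something like the following. This is only a minimal sketch in Python, not a real SQL parser: it works line by line, and the names extract_docs and format_report, the NOT_COLUMNS keyword list, and the attachment rule (a --* comment belongs to the next line of code, or to the code on its own line) are all assumptions made for illustration.

```python
import re

DOC_MARK = "--*"
CREATE_TABLE = re.compile(r"create\s+table\s+(\w+)", re.IGNORECASE)
COLUMN = re.compile(r"(\w+)\s+\w+")  # a name followed by a type
# Assumed list of words that start a table constraint rather than a column.
NOT_COLUMNS = {"primary", "foreign", "unique", "check", "constraint"}


def extract_docs(schema_text):
    """Return {table: {"doc": str, "columns": {column: str}}} built from --* comments."""
    docs = {}
    pending = []   # doc comments collected since the last line of code
    table = None   # table currently being read, if any
    for raw in schema_text.splitlines():
        code, mark, comment = raw.partition(DOC_MARK)
        code = code.strip()
        if mark:
            pending.append(comment.strip())
        if not code:
            continue  # blank or comment-only line: keep collecting
        created = CREATE_TABLE.match(code)
        column = COLUMN.match(code) if table else None
        if created:
            table = created.group(1)
            docs[table] = {"doc": " ".join(pending), "columns": {}}
        elif code.startswith(")"):
            table = None  # end of the table definition
        elif column and column.group(1).lower() not in NOT_COLUMNS:
            docs[table]["columns"][column.group(1)] = " ".join(pending)
        pending = []  # comments apply only to the next line of code
    return docs


def format_report(docs):
    """Render the extracted documentation as an indented plain-text report."""
    lines = []
    for table, info in docs.items():
        lines.append(f"table {table}: {info['doc']}")
        for column, doc in info["columns"].items():
            lines.append(f"  {column}: {doc}")
    return "\n".join(lines)


SCHEMA = """\
--* Used for authentication.
create table users
(
  --* standard Rails-friendly primary key.
  id serial primary key,
  name text not null,     --* Real name (hopefully)
  login text not null     --* Name used for authentication
);
"""

print(format_report(extract_docs(SCHEMA)))
```

A line-based approach like this breaks on anything unusual (several columns on one line, "create table if not exists", a --* inside a string literal), so it only works if the schema file sticks to a fixed layout. Plain -- comments are ignored, which lets ordinary notes and documentation comments coexist in the same file.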