Search code examples
arrayspostgresqlcsvplpgsqlpostgresql-copy

Export an array into a CSV-file in PL/pgSQL


I have a function, which RETURNS SETOF text[]. Sample result of this function:

{080213806381,"personal data1","question 1",answer1,"question 2",answer2,"question 3","answer 3"}
{080213806382,"personal data1","question 1",answer1,"question 2",answer2,"question 3","answer 3"}

I'm forming each row with a statement like:

resultRow := array_append(resultRow, fetchedRow.data::text);

and then:

RETURN NEXT resultRow;

And here's my COPY command:

COPY( 
SELECT myFunction()
) TO 'D:\test_output.csv' WITH (FORMAT 'csv', DELIMITER E',', HEADER false)

And I have a couple of problems:

  1. Regardless the fact that values are appended to the array in the same way, some of them are double-quoted and some of them are not. This somehow depends on a presence of space character in a value. Look, for instance, at the 1st element of the array or at the answer2 and "answer 3" in each row. I want some unified behavior.
  2. After exporting in to CSV with COPY command I'm getting the same rows with all these curly braces at the beginning and the end. I dont want them in CSV.

What can I do to solve these issues?


Solution

  • You wish to export rows of varying numbers of columns. You're producing a set of arrays, but from there want to produce a CSV file.

    The immediate issue - array literals aren't CSV

    Your function returns text[] literals, i.e. PostgreSQL array literals.

    These are not CSV as commonly recognised. They're comma-separated, yes, but they follow different syntax rules. You can't reliably treat an array literal as a CSV row or vice versa.

    Don't attempt to just chop the delimiting {...} off and treat the array literal as a CSV row.

    COPY won't work well or at all

    COPY is not going to work well for you. It's designed to handle relations, i.e. uniform sets of structured rows where each column is of a well defined type and each row has the same number of columns.

    You could redefine your function to return a setof record and pad your records with nulls to always be the same width, but it'll be pretty ugly and limited, plus the CSV will then incorporate the nulls.

    What COPY will do is export a single column CSV containing array literals in a single CSV field. This certainly will not be what you want.

    Solution 1: Export client-side

    You might be better off doing this on the client side, via a script or program to generate the CSV. Have the program receive the set of arrays and then write it to CSV via a suitable library, like Python's csv module. Choose a client scripting language where the PostgreSQL driver understands arrays and can transform them to arrays in the language's format - again, like psycopg2 for Python.

    e.g. given dummy function:

    CREATE OR REPLACE FUNCTION get_rows() RETURNS setof text[] AS $$
    VALUES
    ('{080213806381,"personal data1","question 1",answer1,"question 2",answer2,"question 3","answer 3"}'::text[]),
    ('{080213806382,"personal data1","question 1",answer1,"question 2",answer2,"question 3","answer 3","q4","a4"}'::text[])
    $$ LANGUAGE SQL;
    

    a client script could be as simple as:

    #!/usr/bin/env python
    import psycopg2
    import csv
    
    with psycopg2.connect('dbname=craig') as conn:
        curs = conn.cursor()
    
        with open("test.csv","w") as csvfile:
            f = csv.writer(csvfile)
    
            curs.execute("SELECT * FROM get_rows()")
    
            for row in curs:
                f.writerow(row[0])
    

    Solution 2: Export CSV directly from a procedure

    Alternately, if the CSV document isn't too big, you could produce the entire CSV in a single procedure, perhaps using plpythonu and the csv module, or a similar CSV library for your preferred procedural language. Because the whole CSV document must be accumulated in memory this won't scale to very very large documents.