Search code examples
sqlpostgresqlmongodbcsvmongoexport

PostgreSQL COPY command from csv file problem for array data


I'm working on migrating data from MongoDB to PostgreSQL. I'm using the mongoexport command to export collections into a CSV file, and then I attempt to import the data from the CSV file into a PostgreSQL table using the COPY command.

By default, array fields in the CSV file are formatted as "[""val1"",""val2""]". When I try to execute the COPY command with this CSV file, I'm getting the following error:

ERROR:  "[" must introduce explicitly-specified array dimensions.malformed array literal: "["TRT100925T18"]" 

ERROR:  malformed array literal: "["TRT100925T18"]"
SQL state: 22P02
Detail: "[" must introduce explicitly-specified array dimensions.
Context: COPY sgmk_hazine_tanim, line 2, column isins: "["TRT100925T18"]"

To resolve this, I manually correct the array format in Notepad++ to {val1,val2}. However, when I run the COPY command again, I receive another error because the commas within the array are interpreted as new fields:

ERROR:  extra data after last expected column
CONTEXT:  COPY sgmk_hazine_tanim, line 5: "2024-09-24T05:50:00.114Z,2025-03-11T21:00:00.000Z,121,{TRT120325T12,TRT120325T20},121T2,85.640,4" 

SQL state: 22P04

Here is my table structure:

CREATE TABLE sgmk_hazine_tanim (                   
    id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    last_update timestamp NOT NULL,
    matDate date NOT NULL,
    section smallint NOT NULL,
    isins text[] NOT NULL,
    CBRTCode VARCHAR(32) NOT NULL,
    presVal VARCHAR(32) NOT NULL,
    payRate NUMERIC(8,4)
);

The sgmk_hazine_tanim.csv file for the first couple of rows looks like this:

last_update,matDate,section,isins,CBRTCode,presVal,payRate
2024-09-24T05:50:00.114Z,2025-09-09T21:00:00.000Z,111,{TRT100925T18},13T,69.782,
2024-09-24T05:50:00.114Z,2024-10-15T21:00:00.000Z,111,{TRT161024T10},15T,97.120,
2024-09-24T05:50:00.114Z,2025-04-08T21:00:00.000Z,111,{TRT090425T16},15T,80.060,
2024-09-24T05:50:00.114Z,2025-03-11T21:00:00.000Z,121,{TRT120325T12,TRT120325T20},121T2,85.640,4

And here is my COPY command:

COPY sgmk_hazine_tanim(last_update,matDate,section,isins,CBRTCode,presVal,payRate) FROM '/home/srvadmin/docker/build/aktarimlar/sgmk_hazine_tanim.csv' DELIMITER ',' CSV HEADER;

This is the mongoexport command I used:

mongoexport --uri=$URI --authenticationDatabase=$AUTH_DB --db=$DB_NAME --collection=$COLLECTION_NAME --type=csv --fields=$FIELDS --out=$OUTPUT_FILE

I considered changing the CSV delimiter from a comma to a pipe (|) and also from copy command, which might help fix the problem, but I would prefer not to deal with that. If you have any knowledge or suggestions on how to handle this issue i would greatly appreciate.


Solution

  • I solved the issue with changing the format from "[""val1"",""val2""]" to "{""val1"","val2""}" with notepad++, the copy command imported it without any problems