Search code examples
pythonpython-3.xamazon-web-servicesamazon-redshiftload

How to load data into redshift table using python?


Hi I have file in which columns are separated by '|' and rows are separated by ','. I want to load into table using the copy command. Could you suggest me how can I do this?


Solution

  • The Amazon Redshift COPY Command has a DELIMITER parameter that "specifies the single ASCII character that is used to separate fields in the input file, such as a pipe character ( | ), a comma ( , ), or a tab ( \t )."

    However, all rows must end with a newline (CR). It is not possible to specify that rows end with a comma.

    You would need to pre-process this file to convert it into a format compatible with the Amazon Redshift COPY command.

    Alternatively, since you mention Python, you could write a Python program to read the rows and then send the data to Redshift using INSERT statements. However, this is much less efficient that using a COPY command and also requires that you establish a connection to Redshift via an SQL client (eg psycopg2) or the Amazon Redshift Data API.