Search code examples
sql-serverlinuxbcp

bcp adding extra characters in CSV out


I'm working a bash script that has two commands:

/opt/mssql-tools/bin/bcp buyerhero.dbo.PeopleEmail OUT "D:/Data/people/PeopleEmail_06272020.csv" -F2  -U<<USER>> -P<<PASSWORD>> -S<<SERVER>> -t"§" -c
gsutil cp D:/Data/people/PeopleEmail_{date}.csv gs://offrs_table_files/PeopleEmail/To_Process/PeopleEmail_06272020.csv

The objective is to export out data from MSSQL and upload it to Google Big Query. The two commands above represent the preliminary.

While the commands work, I have a weird output coming from the BCP.

22§7279373929§Line§429734616§1434114003§0§1§Data§2017-07-29§963313836§§143414003§§§§§§§1
23§9097671326§Line§4973325331659§1491223321237§0§1§Data§2017-07-29§963383261§§149031237§§§§§§§1
24§9512444462§Line§2222§5431230333372§0§1§Data§2017-07-29§9852228534§§54303372§§§§§§§1

The delimiter is §. The Weird "A"'s that precede them are not in the table, but appear in the CSV.

I doublechecked the table in sql server, and they are not there. The BCP command is installed on a remote linux box.

This just started happening a few days ago. How do I troubelshoot this? Thanks!


Solution

  • Looks like the high ASCII character ( i.e > 127 ) is available in the data if self. The character that is coming up in the file is SELECT CHAR(194).

    You can remove this character from data (table PeopleEmail) by replacing  with blank (SELECT REPLACE([COLUMNNAME], CHAR(194), '')). However, i am not sure whether it has any meaning to your application or not. So, I would suggest to cross check and double confirm before doing any update on table data.

    You can also try using queryout by replacing character  with blank. Something like below

    bcp "SELECT REPLACE(COLUMN1+'§'+COLUMN2+'§'+COLUMN3+'§'+COLUMN4, CHAR(194), '') FROM [Database].[Schema].[Table]" queryout "File.csv" -U [Username] -P [Password] -S [Server]
    

    Not sure how long my answer can help you. But i would be happy if it can be helpful somehow.