I'm trying to parse a CSV file, but its header starts with a ZWNBSP character (U+FEFF, the byte order mark),
which makes my code fail. How can I parse this CSV as expected?
CSV content:
ZWNBSP"AccountId","AccountNumber","AccountName","CustomerId","CustomerName","CurrencyCode","Spend","Impressions","Clicks","Installs","VideoViews","Conversions","Sales","TimePeriod"
Code I tried:
package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

func main() {
	file, err := os.Open("/Desktop/193657270154964993.csv")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	defer file.Close()

	reader := csv.NewReader(file)
	// ReadAll already reads to EOF, so it is called once rather than in a loop.
	records, err := reader.ReadAll()
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(records)
}
Error:
bare " in non-quoted-field
I recommend using Go's own golang.org/x/text module: it has a Transformer type and BOM-aware encodings that can handle removing (and inserting) a BOM with just one extra line of code.
I created this small sample CSV, which starts with a UTF-8 BOM:
A,B
1,x
2,y
3,z
If I run this small program:
func main() {
	var r io.Reader
	r, _ = os.Open("input.csv")
	csvr := csv.NewReader(r)
	records, _ := csvr.ReadAll()
	fmt.Printf("%q\n", records)
}
it prints, with the BOM before "A":
[["\ufeffA" "B"] ["1" "x"] ["2" "y"] ["3" "z"]]
If I modify that program and add a Transformer that specifically decodes UTF-8 BOM-encoded bytes (to plain UTF-8):
import (
	...
	"golang.org/x/text/encoding/unicode"
	"golang.org/x/text/transform"
)

func main() {
	var r io.Reader
	r, _ = os.Open("input.csv")
	r = transform.NewReader(r, unicode.UTF8BOM.NewDecoder())
	csvr := csv.NewReader(r)
	records, _ := csvr.ReadAll()
	fmt.Printf("%q\n", records)
}
it prints:
[["A" "B"] ["1" "x"] ["2" "y"] ["3" "z"]]
I chose to generically declare r as io.Reader so I could use the same variable and take that transformer-line in and out, for a compact example. You could also write something more explicit and idiomatic, like:
fIn, _ := os.Open("input.csv")
defer fIn.Close()
bomDecoder := transform.NewReader(fIn, unicode.UTF8BOM.NewDecoder())
csvReader := csv.NewReader(bomDecoder)
If you need to re-encode with a BOM when you're done processing the CSV, create a writer with a BOM encoder:
fOut, _ := os.Create("output.csv")
defer fOut.Close()
t := transform.NewWriter(fOut, unicode.UTF8BOM.NewEncoder())
csvw := csv.NewWriter(t)
csvw.WriteAll(records)
% hexdump -C output.csv
00000000  ef bb bf 41 2c 42 0a 31  2c 78 0a 32 2c 79 0a 33  |...A,B.1,x.2,y.3|
00000010  2c 7a 0a                                          |,z.|
00000013