Hi, I'm trying to use Kinesis Firehose with S3, and I'm reading the resulting S3 files with Go.
However, I can't parse the JSON because the records are simply appended one after another without any delimiter.
Here's an example of the file (note that in the original the objects run together; I split them onto separate lines for formatting purposes):
{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}
{"ticker_symbol":"AZL","sector":"HEALTHCARE","change":-0.78,"price":16.51}
{"ticker_symbol":"IOP","sector":"TECHNOLOGY","change":-1.98,"price":121.88}
{"ticker_symbol":"VVY","sector":"HEALTHCARE","change":-0.56,"price":47.62}
{"ticker_symbol":"BFH","sector":"RETAIL","change":0.74,"price":16.61}
{"ticker_symbol":"WAS","sector":"RETAIL","change":-0.6,"price":16.72}
My question is: how can I parse this in Go? One solution I can think of is to split the input on }{ and put the pieces back together, but that's pretty hackish.
Or does Kinesis Firehose provide a delimiter?
------UPDATE------
Currently I have implemented the solution by replacing every }{ with },{ and then adding [ at the beginning and ] at the end, then parsing the result as a JSON array.
However, I'm still looking for alternatives, as this solution breaks if }{ ever appears inside the content of a JSON object (for example, inside a string value).
Create a simple struct to unmarshal the JSON that arrives in batches: each JSON object is unmarshalled into a struct value, and each parsed value is appended to a slice of that struct type. The slice then holds all of the parsed records.
package main

import (
	"encoding/json"
	"fmt"
)

// Ticker mirrors the fields of one record in the file.
type Ticker struct {
	TickerSymbol string  `json:"ticker_symbol"`
	Sector       string  `json:"sector"`
	Change       float64 `json:"change"`
	Price        float64 `json:"price"`
}

var jsonBytes = []byte(`{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}`)

func main() {
	var singleResult Ticker
	var result []Ticker

	// Parse one JSON object; stop instead of appending a zero value on error.
	if err := json.Unmarshal(jsonBytes, &singleResult); err != nil {
		fmt.Println(err)
		return
	}

	// Collect the parsed record in the slice.
	result = append(result, singleResult)
	fmt.Printf("%+v", result)
}
Edited:
If the data arrives as a batch of JSON objects appended to each other, then you can use a regular expression to replace every } with }, and then trim the rightmost , so that, once wrapped in [ and ], the result is a valid JSON array of objects:
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Ticker mirrors the fields of one record in the file.
type Ticker struct {
	TickerSymbol string  `json:"ticker_symbol"`
	Sector       string  `json:"sector"`
	Change       float64 `json:"change"`
	Price        float64 `json:"price"`
}

var str = `{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}
{"ticker_symbol":"AZL","sector":"HEALTHCARE","change":-0.78,"price":16.51}
{"ticker_symbol":"IOP","sector":"TECHNOLOGY","change":-1.98,"price":121.88}
{"ticker_symbol":"VVY","sector":"HEALTHCARE","change":-0.56,"price":47.62}
{"ticker_symbol":"BFH","sector":"RETAIL","change":0.74,"price":16.61}
{"ticker_symbol":"WAS","sector":"RETAIL","change":-0.6,"price":16.72}`

func main() {
	// Put a comma after every closing brace, then drop the trailing one.
	r := regexp.MustCompile("}")
	output := strings.TrimRight(r.ReplaceAllString(str, "},"), ",")

	// Wrap everything in brackets to form a JSON array.
	output = fmt.Sprintf("[%s]", output)
	fmt.Println(output)
}
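Matching on } alone with r := regexp.MustCompile("}") means you don't have to worry about whitespace between } and {, which would interfere with a literal }{ replacement. So just replace } with }, and then trim the trailing comma. Note that this assumes flat objects like the ones above: a } inside a nested object or a string value would be rewritten too.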
Also, the reason I am using MustCompile is:
When creating constants with regular expressions you can use the MustCompile variation of Compile. A plain Compile won’t work for constants because it has 2 return values.
Full Working code with json parse on Go playground
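The linked code is not reproduced here. As a rough sketch of how the replacement step could be combined with json.Unmarshal (not necessarily what the playground link contains), something like the following should work. The helper name toTickers and the package-level closingBrace variable are made up for illustration, and the sketch assumes flat objects with no } inside string values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// Ticker mirrors the fields of one record in the file.
type Ticker struct {
	TickerSymbol string  `json:"ticker_symbol"`
	Sector       string  `json:"sector"`
	Change       float64 `json:"change"`
	Price        float64 `json:"price"`
}

// closingBrace matches every literal "}" in the input.
var closingBrace = regexp.MustCompile("}")

// toTickers is a hypothetical helper: it turns back-to-back JSON objects
// into a JSON array by adding commas, then unmarshals the array.
// Assumption: no "}" occurs inside string values or nested objects.
func toTickers(s string) ([]Ticker, error) {
	withCommas := closingBrace.ReplaceAllString(s, "},")
	// Drop the comma after the last object, plus any trailing whitespace.
	withCommas = strings.TrimRight(withCommas, ", \t\r\n")

	var result []Ticker
	err := json.Unmarshal([]byte("["+withCommas+"]"), &result)
	return result, err
}

func main() {
	// Assumed sample input: two records with no delimiter.
	input := `{"ticker_symbol":"BFH","sector":"RETAIL","change":0.74,"price":16.61}{"ticker_symbol":"WAS","sector":"RETAIL","change":-0.6,"price":16.72}`

	tickers, err := toTickers(input)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", tickers)
}
```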