python, apache-spark, amazon-kinesis, amazon-kcl

Exception when Python KCL connects to a Kinesis stream


I am trying to integrate Kinesis with Spark Streaming, using Python and the KCL. Most of the time, reading from Kinesis fails with this exception:

'utf8' codec can't decode byte 0xf1 in position 940: invalid continuation byte

Can someone please tell me how to solve this problem? This is how I create the stream:

kinesisStream = KinesisUtils.createStream(
    ssc, APPLICATION_NAME, STREAM_NAME, ENDPOINT, REGION_NAME,
    INITIAL_POS, CHECKPOINT_INTERVAL,
    awsAccessKeyId=AWSACCESSID, awsSecretKey=AWSSECRETKEY)

Solution

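  • Check that the data being written to the stream is actually UTF-8 encoded.

    The error means the bytes being read are not valid UTF-8. One common cause is data encoded as Latin-1 (ISO-8859-1) being decoded as UTF-8: in Latin-1 the byte 0xf1 is the character ñ, but in UTF-8 0xf1 starts a four-byte sequence and must be followed by continuation bytes.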
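
    If the producer cannot be changed, one option is to decode records more tolerantly on the consumer side. The following is a minimal sketch, not a definitive fix: the name tolerant_decoder is hypothetical, and it assumes that any payload that is not valid UTF-8 is Latin-1, which may not be true for your data. PySpark's KinesisUtils.createStream accepts a decoder argument in the versions I am aware of, but verify that against the Spark version you run.

```python
def tolerant_decoder(raw):
    """Decode a record payload: try UTF-8 first, fall back to Latin-1.

    Hypothetical helper for illustration. The Latin-1 fallback is an
    assumption about the producer's encoding, not something the error
    message itself proves.
    """
    if raw is None:
        return None
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a code point, so this
        # fallback cannot raise, but it may produce wrong characters
        # if the data is really in some other encoding.
        return raw.decode("latin-1")


# b"ma\xf1ana" is "mañana" in Latin-1 and is not valid UTF-8.
print(tolerant_decoder(b"ma\xf1ana"))

# Assumed usage with the stream from the question (check your Spark version):
# kinesisStream = KinesisUtils.createStream(..., decoder=tolerant_decoder)
```

    Fixing the encoding at the producer is still the cleaner solution, since a fallback like this can silently turn bytes from other encodings into the wrong characters.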