python, google-app-engine, character-encoding, bulkloader

Properly encoding text in the bulk uploader


What is the proper way to encode strings for the bulk uploader? It currently bails out when it runs into an apostrophe inside my text fields.

Here's a sample CSV file:

demo,name,message
FALSE,one,"Welcome message"
FALSE,two,"If you’re having a medical emergency"

Here's my bulkloader.yaml:

transformers:
- kind: Message
  connector: csv
  connector_options:
   encoding: utf-8
   columns: from_header
  property_map:
   - property: demo
     external_name: demo
     import_transform: bool
   - property: name
     external_name: name
     import_transform: str
   - property: message
     external_name: message
     import_transform: str

When I run the loader with a sample like this (one that has apostrophes in the text), I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position x: ordinal not in range(128)

Any help is appreciated.


Solution

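  • ’ (U+2019, the right single quotation mark shown in the error as u'\u2019') isn't an ASCII character, so converting the field with str fails. You should try changing the property transform for the affected text fields to import_transform: unicode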
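
As a rough illustration of why the str transform fails, here is a minimal sketch that reproduces the same error outside the loader. It assumes the transform behaves like Python 2's str() on a unicode value, which performs an implicit ASCII encode; the helper name to_ascii_bytes is hypothetical and is not part of the bulkloader. The snippet should run under either Python 2 or Python 3.

```python
# -*- coding: utf-8 -*-
# Sample value from the CSV, with the curly apostrophe written as an escape.
text = u"If you\u2019re having a medical emergency"


def to_ascii_bytes(value):
    """Hypothetical helper: mimics the implicit ASCII encode that
    Python 2's str() applies to a unicode value."""
    return value.encode("ascii")


try:
    to_ascii_bytes(text)
    failed = False
    error_message = ""
except UnicodeEncodeError as exc:
    # U+2019 has no ASCII representation, so the encode raises.
    failed = True
    error_message = str(exc)

# Keeping the value as unicode and encoding explicitly as UTF-8 is lossless.
utf8_bytes = text.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

print(failed)
print(error_message)
print(round_tripped == text)
```

Keeping the value as unicode end to end, which is what import_transform: unicode asks for, avoids the ASCII step entirely.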