I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html ) into a GeoDjango application. However, Python raises UnicodeDecodeErrors (for example, on the non-ASCII characters in Puerto Rico place names).
The shapefile description file (*.dbf) doesn't specify what character encoding it uses; this is not defined by the spec for shapefiles. What is the correct character encoding to use?
The US Census cartographic boundary files use the IBM850 character encoding (code page 850). Python code to decode these byte strings into unicode would be as follows:
featurestring.decode("IBM850")
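For Python 3, where the `unicode` builtin no longer exists and raw field values arrive as `bytes`, the same decoding might look like the minimal sketch below. The helper name `decode_field` and the sample bytes are made up for illustration; they are not taken from the Census files.

```python
def decode_field(raw, encoding="IBM850"):
    """Decode a raw .dbf field value into text.

    Values that are already text are returned unchanged, so the
    helper is safe to call on fields a library has decoded already.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# Hypothetical sample value: byte 0xA4 is "ñ" in code page 850.
sample = b"Pe\xa4uelas"
print(decode_field(sample))
```

If the import goes through GeoDjango's `LayerMapping`, it should also be possible to pass the encoding name through its `encoding` argument instead of decoding each value by hand, though that is worth checking against the documentation for your Django version.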