postgresql sqlite pgloader

How to fix "the octet sequence #(130) cannot be decoded." in pgloader


I'm trying to migrate a database from SQLite to PostgreSQL using pgloader. My SQLite database is data.db, so I tried this:

pgloader ./var/data.db postgres://***@ec2-54-83-50-174.compute-1.amazonaws.com:5432/mydb?sslmode=require

Output:

pgloader version 3.6.1
sb-impl::*default-external-format* :UTF-8
tmpdir: #P"/var/folders/65/x6spw10s4jgd3qkhdq96bk8c0000gn/T/"
KABOOM!

2019-04-11T19:22:47.022000+01:00 NOTICE Starting pgloader, log system is ready.

FATAL error: :UTF-8 stream decoding error on #<SB-SYS:FD-STREAM for "file /Users/mackbookpro/Desktop/dev/www/Beyti/var/data.db" {1005892A93}>: the octet sequence #(130) cannot be decoded.

Date/time: 2019-04-11-18:22An unhandled error condition has been signalled: :UTF-8 stream decoding error on #<SB-SYS:FD-STREAM for "file /Users/mackbookpro/Desktop/dev/www/Beyti/var/data.db" {1005892A93}>: the octet sequence #(130) cannot be decoded.

Any idea what causes this problem? Thank you in advance.


Solution

  • This is a character encoding issue.

    In my case the culprit, "octet sequence #(130)", was the byte \x82 (130 in decimal), which stood for "é". iconv failed to convert the file, so I replaced every \x82 byte in the stream with \x65 (the ASCII character "e"), and that got me past the error.

    <bad_file xxd -c1 -p | sed s/82/65/ | xxd -r -p > good_new_file
    

    (Cheers to Natacha on IRC, freenode #gcu :) )

    Edit: French issues? I hit the same problem with #(133), which was "à". Same solution: \x85 -> \x61.

    Edit 2: A small generalization I just found: the number in the "octet sequence" that pgloader reports is the decimal value of the offending byte, so #(130) is \x82. Values above 127 fall outside ASCII, in the "extended ASCII" range of a single-byte code page, and such a byte on its own is not valid UTF-8, hence the error. I just hit #(144)? That is \x90. Replace it the same way :)
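The decimal-to-hex step described above can be sketched in shell. This is only an illustration: the function names `octet_to_hex` and `replace_byte` are hypothetical helpers, not part of pgloader, and `replace_byte` assumes `xxd` is installed. It uses the same xxd/sed round trip as the one-liner in the answer, so it rewrites every occurrence of the byte in the file, including any that are not part of text.

```shell
# Hypothetical helper: turn the decimal number from pgloader's
# "octet sequence #(N)" message into a two-digit hex byte value.
octet_to_hex() {
    printf '%02x\n' "$1"
}

# Hypothetical helper: replace every byte $1 with byte $2 (both given as
# two-digit hex) in the stream on stdin. Assumes xxd is available.
# xxd -c1 -p prints one byte per line, so the sed pattern matches whole bytes.
replace_byte() {
    xxd -c1 -p | sed "s/^$1\$/$2/" | xxd -r -p
}

octet_to_hex 130   # prints 82
octet_to_hex 133   # prints 85
octet_to_hex 144   # prints 90
```

With these in place, the fix from the answer could be written as `replace_byte "$(octet_to_hex 130)" 65 < bad_file > good_new_file`. Working on a copy of the database seems prudent, since the replacement is not limited to text columns.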