mysql utf-8 character-encoding phpbb

What is my best option for converting my phpbb2 latin1 DB to a phpbb3 utf8 DB?


I am upgrading a phpBB 2.x forum to phpBB 3.x and I'm trying to figure out what the best option is for converting to utf8 from the previous latin1 encoding. Right now I'm still just working on my phpBB2 database dump file. I used sed to update the CHARSET and SET NAMES statements in the dump file and then tried running it through iconv:

cat phpbb2.sql | sed 's/SET NAMES latin1/SET NAMES utf8/g' > tmp
mv tmp phpbb2_utf8.sql

cat phpbb2_utf8.sql | sed 's/CHARSET=latin1/CHARSET=utf8/g' > tmp
mv tmp phpbb2_utf8.sql

iconv -f latin1 -t utf8  phpbb2_utf8.sql > phpbb2_utf8_iconv.sql

This is no good: the converted file is full of garbage characters. Do you think I should just use latin1 on the new phpBB3 installation?


Solution

    1. Export the phpBB2 database to a plain .sql file.
    2. Convert that file's encoding from latin1 to UTF-8 (for example with iconv).
    3. Change all occurrences of DEFAULT CHARACTER SET, SET NAMES, etc. from latin1 to utf8.
    4. Change all occurrences of COLLATION / COLLATE from latin1_*_ci to utf8_unicode_ci.
    5. Import the converted file, then run the phpBB2 to phpBB3 converter.
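
Steps 2 to 4 could be scripted roughly as follows. This is only a sketch: the function name convert_dump and the file names are made up for illustration, and the sed patterns assume the dump spells its declarations as SET NAMES latin1, CHARSET=latin1, CHARACTER SET latin1 and COLLATE latin1_*_ci (check your own dump, since mysqldump output varies by version and options).

```shell
#!/bin/sh
# Sketch only: convert_dump is a hypothetical helper, not part of phpBB or MySQL.
# Usage: convert_dump <latin1 dump> <utf8 output>
convert_dump() {
    src=$1
    dst=$2

    # Step 2: re-encode the file's bytes from Latin-1 to UTF-8.
    # Stop here if iconv fails, so a broken file is never post-processed.
    iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst.tmp" || return 1

    # Steps 3 and 4: rewrite charset and collation declarations.
    # The patterns below are assumptions about how the dump is written.
    sed -e 's/SET NAMES latin1/SET NAMES utf8/g' \
        -e 's/CHARSET=latin1/CHARSET=utf8/g' \
        -e 's/CHARACTER SET latin1/CHARACTER SET utf8/g' \
        -e 's/\(COLLATE[ =]\)latin1_[a-z0-9_]*_ci/\1utf8_unicode_ci/g' \
        "$dst.tmp" > "$dst" || return 1

    rm -f "$dst.tmp"
}
```

It would be called as convert_dump phpbb2.sql phpbb2_utf8.sql, always on the original dump: running iconv over a file that is already UTF-8 double-encodes every non-ASCII character, which is one common source of garbage output. MySQL's latin1 is based on Windows-1252 rather than strict ISO-8859-1, so if the forum contains characters such as curly quotes or the euro sign, CP1252 may be the more appropriate value for iconv's -f option.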