I'm trying to update a table that was working fine a minute ago, but now I've hit an error that I've got in the past that's extremely inconvenient.
09:32:57 Copying rows caused a MySQL error 1300:
Level: Warning
Code: 1300
Message: Invalid utf8mb4 character string: '94C494'
Is there an option or something so that I can ignore these warnings? After all it is just a warning, so MySQL still has the ability to move forward and copy the rows anyway.
Other solutions would be fine, too, like a way to find the offending value in the table, or to remove the invalid pieces, or just something so I can alter my table would be amazing.
My Googling on how to find/replace broken utf8 sequences in MySQL bring to me to these links (that aren't so helpful)
I've even tried searching for all columns where the hex contains the invalid sequence and still somehow with no luck
select * from `notifications`
where hex(`Description`) like '%94C494%'
or hex(`Title`) like '%94C494%'
or hex(`NotificationID`) like '%94C494%'
or hex(`ToUserID`) like '%94C494%'
or hex(`FromUserID`) like '%94C494%'
or hex(`Link`) like '%94C494%'
or hex(`Icon`) like '%94C494%';
MySQL is version 5.7.18-15-57-log and pt-online-schema-change 3.0.8
Even stranger, I decided to humor myself and search ALL columns (not just the utf8mb4 ones) and I got rows! But the only rows I got were from my Binary columns? Why would having invalid utf8 sequences matter in a binary column? Now I think it might be a bug with the tool
select * ,hex(`notificationid`), hex(`notificationbid`), hex(`fromuserid`), hex(`touserid`), hex(`title`), hex(`description`), hex(`read`), hex(`datetimeadded`), hex(`link`), hex(`icon`), hex(`shown`), hex(`_linkdescriptionsha256`), hex(`_touseridlinkdescription+sha3-224`)
from `notifications`
where hex(`notificationid`) like '%94C494%'
or hex(`notificationbid`) like '%94C494%'
or hex(`fromuserid`) like '%94C494%'
or hex(`touserid`) like '%94C494%'
or hex(`title`) like '%94C494%'
or hex(`description`) like '%94C494%'
or hex(`read`) like '%94C494%'
or hex(`datetimeadded`) like '%94C494%'
or hex(`link`) like '%94C494%'
or hex(`icon`) like '%94C494%'
or hex(`shown`) like '%94C494%'
or hex(`_linkdescriptionsha256`) like '%94C494%'
or hex(`_touseridlinkdescription+sha3-224`) like '%94C494%';
After working with Percona support on this issue, we eventually led to the creation of this ticket: https://jira.percona.com/browse/PT-1528
The trick for ignoring collation issues (or working around this bug) is by adding the --charset binary
flag to the pt-online-schema-change
command.
Seems to happen when the primary key is a binary column and the charset is either set to utf8(mb4) or inferred as one of those from the MySQL settings.