regex, oracle-sqldeveloper, oracle12c

regular expression to find unreadable characters in the file name


I have a huge file containing 4.1 million records and need to find file names such as "Clock Accuracy – SM111.ppt", which contain unreadable characters. Another example is "241395 - AnsprÃ¼che.doc".

How can I match these using a regular expression? I am using an Oracle 12c database.


Solution

  • This looks a lot like a problem with the character encoding of your file. The file appears to be UTF-8-encoded but is being read with a single-byte encoding such as Windows-1252: Ã¼ stands for ü, which makes Ansprüche.doc make sense, – encodes the en dash (–), and so on.

    So you need to open the file using UTF-8 as its encoding; the correct characters should then appear (unless the file was corrupted by mixing several encodings at once).
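
The diagnosis above can be checked with a minimal Python sketch. It assumes the names are UTF-8 text that was decoded as Windows-1252 (cp1252); that choice of single-byte encoding is an assumption, and the names `SUSPECT`, `repair` and `is_mojibake` are illustrative, not part of the original answer. The regular expression is only a heuristic first pass; the encode/decode round trip is what confirms a hit.

```python
import re

# Assumption: the names are UTF-8 text that was wrongly decoded as Windows-1252.
# Characters that a UTF-8 lead byte (0xC2-0xF4) turns into under cp1252.
LEAD = bytes(range(0xC2, 0xF5)).decode("cp1252")
# Characters that a UTF-8 continuation byte (0x80-0xBF) turns into;
# the few bytes cp1252 leaves undefined are skipped.
CONT = bytes(range(0x80, 0xC0)).decode("cp1252", errors="ignore")

# Heuristic pattern: one "lead" character followed by "continuation" characters.
SUSPECT = re.compile("[" + re.escape(LEAD) + "][" + re.escape(CONT) + "]+")


def repair(name: str) -> str:
    """Undo the misdecoding if the round trip succeeds, else return name unchanged."""
    try:
        return name.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name


def is_mojibake(name: str) -> bool:
    """True if the name matches the pattern and the round trip changes it."""
    return SUSPECT.search(name) is not None and repair(name) != name
```

Calling `is_mojibake` on each file name flags the garbled ones, and `repair` returns the presumably intended spelling. Names that already contain a correct ü or en dash fail the round trip and are left untouched.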