oracle-database utf-8 data-conversion latin1

How to determine if characters in an Oracle DB field are within the UTF8 charset but outside of LATN-1 with SQL?

I currently have the need to write a SQL query to determine the number of rows in my UTF8 Oracle database that are not compatible with another system that uses LATIN-1.

For example, Über should not return a result, but 翻译 should

I have tried queries such as:

select decode(convert(convert('Über test', 'WE8ISO8859P1'), 'UTF8'), convert('Über test', 'UTF8'), 1, 0) from dual;

However, this does not give me the result that I need. Can anyone provide a SQL-only solution to this problem? Thanks

Solution

I think I have figured it out:

select * from (select asciistr(convert('test string goes here', 'UTF8')) as str from dual) where regexp_like(str, '.*\\([1-9A-F]|0[1-9A-F]).*');

Using http://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29 as a reference, the LATIN-1 block of unicode ends at \00FF.

For example,

SQL> select * from (select asciistr(convert('翻译', 'UTF8')) as str from dual) where regexp_like(str, '.*\\([1-9A-F]|0[1-9A-F]).*');

STR
------------------------------
\7FFB\8BD1

If someone could double-check this from a logical standpoint, I would appreciate it.