java oracle utf-8 substring

How to take a substring of a UTF-8 string in Java by byte length?


Suppose I have the following string: "Rückruf ins Ausland". I need to insert it into a database column that has a max size of 10. I did a normal substring in Java and it extracted the string "Rückruf in", which is 10 characters. When I try to insert this value I get the following Oracle error:

java.sql.SQLException: ORA-12899: value too large for column "WAEL"."TESTTBL"."DESC" (actual: 11, maximum: 10)

The reason for this is that the database uses the AL32UTF8 character set, so the ü takes 2 bytes, and the column limit is evidently counted in bytes rather than characters.
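The mismatch between character count and byte count can be reproduced on the Java side. This is only a minimal sketch: the class `ByteLengthDemo` and the helper `utf8Length` are hypothetical names made up for illustration, and it assumes that UTF-8 in Java matches what the database stores for AL32UTF8.

```java
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {

    // Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String cut = "R\u00fcckruf in";              // "Rückruf in", the 10-character substring
        System.out.println(cut.length());       // 10 characters
        System.out.println(utf8Length(cut));    // 11 bytes, because the ü needs 2 bytes
    }
}
```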

I need to write a function in Java that takes this substring while accounting for the fact that the ü takes 2 bytes, so the returned substring in this case should be "Rückruf i" (9 characters, 10 bytes). Any suggestions?


Solution

  • If you want to trim the data in Java, you must write a function that trims the string according to its byte length in the db charset, something like this test case:

    package test;
    
    import java.nio.charset.Charset;
    
    public class TrimField {
    
        public static void main(String[] args) {
            // UTF-8 is the db charset
            System.out.println(trim("Rückruf ins Ausland", 10, "UTF-8"));  // Rückruf i
            System.out.println(trim("Rüückruf ins Ausland", 10, "UTF-8")); // Rüückruf
        }
    
        /**
         * Drops characters from the end of value until its encoding in the
         * given charset fits into numBytes bytes.
         */
        public static String trim(String value, int numBytes, String charset) {
            // Resolve the charset once; an unknown name throws an unchecked exception.
            Charset cs = Charset.forName(charset);
            while (value.length() > 0 && value.getBytes(cs).length > numBytes) {
                int end = value.length() - 1;
                // Do not leave half of a surrogate pair at the end of the string.
                if (end > 0 && Character.isLowSurrogate(value.charAt(end))
                        && Character.isHighSurrogate(value.charAt(end - 1))) {
                    end--;
                }
                value = value.substring(0, end);
            }
            return value;
        }
    
    }
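The loop above re-encodes the whole string on every pass, which can get slow for long values. A possible alternative, shown here only as a sketch, is to let a `CharsetEncoder` write into a `ByteBuffer` of exactly the allowed size and then check how many chars it consumed before the buffer filled up. The class name `ByteTruncate` and the method `truncateUtf8` are hypothetical, and the sketch assumes the db charset corresponds to Java's UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ByteTruncate {

    /**
     * Hypothetical helper: returns the longest prefix of value whose UTF-8
     * encoding fits into maxBytes bytes. maxBytes must not be negative.
     */
    public static String truncateUtf8(String value, int maxBytes) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                // Replace unencodable input (e.g. a lone surrogate) instead of stopping there.
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);

        // The output buffer is exactly as large as the column allows.
        ByteBuffer out = ByteBuffer.allocate(maxBytes);
        CharBuffer in = CharBuffer.wrap(value);

        // Encoding stops when the next character no longer fits into the buffer;
        // a character is never written partially.
        encoder.encode(in, out, true);

        // in.position() is the number of chars that were fully encoded.
        return value.substring(0, in.position());
    }
}
```

With the strings from the question, `truncateUtf8("Rückruf ins Ausland", 10)` should give "Rückruf i" and `truncateUtf8("Rüückruf ins Ausland", 10)` should give "Rüückruf", the same results as the `trim` test case.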