java android sqlite android-sqlite string-length

For the same string, will SQLite's length() ever return a different value than Java's String.length()?


Given the same string data:

  1. SQLite calculates length() on its TEXT column.
  2. The TEXT column is read into a Java String (using the Android Room database), then Java calls String.length() on it.

Is there any chance that these yield two different values?

I have done a rough test using English and non-English characters. Both yield the same value.

But I am not sure whether there are edge cases I have missed.


Solution

  • There are cases where the lengths differ. Java uses UTF-16 for its internal string representation, so characters with code points above U+FFFF need a surrogate pair (two char values) to be stored in memory. Java's String.length() counts those char values (UTF-16 code units), not characters, so it does not account for this.

    A simple example using the 💩 emoji character:

        class HelloWorld {
            public static void main(String[] args) {
                System.out.println("💩".length());
            }
        }
    

    This will print 2.

    On the other hand, the SQLite documentation states:

    For a string value X, the length(X) function returns the number of characters (not bytes) in X prior to the first NUL character.

    So it counts characters, not bytes or UTF-16 code units:

    sqlite> select length('💩'); 
    

    This returns 1.

    This is not exclusive to emojis. The same applies to any character with a "high" code point (above U+FFFF), which includes some Asian characters.

    Tested with SQLite 3.28.0 and OpenJDK version "1.8.0_252". I think it should hold true for your stack as well.
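
If the Java side needs a count that lines up with SQLite's length(), counting code points instead of UTF-16 code units is one option. The following is a minimal sketch, not something taken from the question's code: the class name LengthCompare and its helper methods are hypothetical, and it assumes well-formed text without embedded NUL characters (SQLite's length() stops at the first NUL). The sample strings are written as escapes so the result does not depend on the source file encoding.

```java
public class LengthCompare {

    // Number of UTF-16 code units, which is what String.length() reports.
    public static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points. Assumption: for well-formed text
    // without embedded NUL characters, this should agree with SQLite's length().
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello",          // plain ASCII
            "\uD83D\uDCA9",   // U+1F4A9, the emoji used above
            "\uD842\uDFB7"    // U+20BB7, a CJK ideograph above U+FFFF
        };
        for (String s : samples) {
            System.out.println(s + " length()=" + utf16Units(s)
                    + " codePoints=" + codePoints(s));
        }
    }
}
```

For the two non-ASCII samples, length() reports 2 while the code point count is 1. For "hello" both report 5, which would explain why a rough test with common characters shows no difference.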