Given the same String data, consider two length calculations: the SQLite length() calculation on its TEXT column, and Java's String.length() after the TEXT column is read into a Java String (using the Android Room database).
Is there any chance that these yield two different values?
I have done a rough test using English and non-English characters. Both yield the same value.
But I am not sure whether there are any edge cases I have missed.
There are cases where the lengths differ. Java uses UTF-16 for its internal string representation, so characters outside the Basic Multilingual Plane (code points above U+FFFF) are stored in memory as a surrogate pair of two char values. Java's String.length() counts UTF-16 code units, not characters, so each such character counts as 2.
A simple example using the 💩 emoji character:
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("💩".length());
    }
}
This will print 2.
On the other hand, the SQLite documentation states:
For a string value X, the length(X) function returns the number of characters (not bytes) in X prior to the first NUL character.
It specifies that it counts characters, so the same emoji counts once:
sqlite> select length('💩');
This will return 1.
This is not exclusive to emojis. The same holds for any character with a code point above U+FFFF, which includes some rarer CJK ideographs. Most commonly used Asian characters fit in a single UTF-16 code unit and give the same length in both.
Tested with SQLite 3.28.0 and OpenJDK version "1.8.0_252". I think it should hold true for your stack.
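If you need a count on the Java side that lines up with what SQLite reports, one option is to count code points instead of UTF-16 code units. The following is only a minimal sketch: the class name LengthCompare and the helper sqliteLikeLength are hypothetical names made up for illustration, and the NUL handling is based solely on the documentation sentence quoted above. I have not verified how Room passes strings containing NUL to SQLite.

```java
public class LengthCompare {
    // Number of UTF-16 code units; this is what String.length() reports.
    static int utf16Length(String s) {
        return s.length();
    }

    // Hypothetical helper approximating SQLite's length(X) for text:
    // Unicode code points before the first NUL character (assumption
    // taken from the quoted SQLite documentation).
    static int sqliteLikeLength(String s) {
        int nul = s.indexOf('\0');
        int end = (nul >= 0) ? nul : s.length();
        // codePointCount treats a surrogate pair as a single code point.
        return s.codePointCount(0, end);
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDCA9";   // U+1F4A9, the emoji used above
        String rareCjk = "\uD840\uDC00"; // U+20000, a CJK ideograph outside the BMP
        String plain = "hello";          // BMP only, both counts agree
        String withNul = "ab\0cd";       // embedded NUL character

        for (String s : new String[] {emoji, rareCjk, plain, withNul}) {
            System.out.println(utf16Length(s) + " vs " + sqliteLikeLength(s));
        }
    }
}
```

For strings that contain only BMP characters and no NUL, both numbers agree, which matches the rough test in the question.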