
Why is using isEmpty preferable to a comparison with an empty string literal in Swift?


The String type in Swift has a property named isEmpty that indicates whether a string has no characters.

I'd like to know if there's any difference between using isEmpty and checking the equality to an empty string literal. In other words, is myString.isEmpty any better than myString == ""?

I did some research and came across the following recommendations:

  1. The String struct reference in Apple's developer documentation (as well as the Swift Language Guide) recommends using isEmpty instead of checking the string's length:

    To check whether a string is empty, use its isEmpty property instead of comparing the length of one of the views to 0. Unlike with isEmpty, calculating a view’s count property requires iterating through the elements of the string.

  2. An answer by Rob Napier to a slightly different question on Stack Overflow (from 2015) states the following:

    The empty string is the only empty string, so there should be no cases where string.isEmpty() does not return the same value as string == "". They may do so in different amounts of time and memory, of course. Whether they use different amounts of time and memory is an implementation detail not described, but isEmpty is the preferred way to check in Swift (as documented).

  3. Some blog posts also recommend using isEmpty, especially instead of checking whether the string's length is 0.

None of these sources, however, says anything against comparing with an empty literal.

It seems totally reasonable to avoid constructions like myString.count == 0 because of obvious performance and readability drawbacks. I also get the fact that myString.isEmpty is more readable than myString == "".
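
For reference, the three checks being compared can be sketched as follows. This is only an illustration: the function name emptinessChecks and the sample strings are made up for the example, and it shows nothing about performance, only that the three expressions agree on their result.

```swift
// Hypothetical helper (not part of the standard library): evaluates the three
// emptiness checks discussed above on the same string.
func emptinessChecks(_ s: String) -> (isEmpty: Bool, equalsLiteral: Bool, countIsZero: Bool) {
    let viaIsEmpty = s.isEmpty       // the documented, preferred check
    let viaLiteral = (s == "")       // equality against an empty string literal
    let viaCount = (s.count == 0)    // needs to count the characters first
    return (viaIsEmpty, viaLiteral, viaCount)
}

// Made-up sample values, including a whitespace-only and a non-ASCII string.
let samples = ["", " ", "a", "héllo"]

for s in samples {
    let checks = emptinessChecks(s)
    // The three checks are expected to agree on every input.
    precondition(checks.isEmpty == checks.equalsLiteral)
    precondition(checks.equalsLiteral == checks.countIsZero)
    print("\(s.debugDescription): isEmpty = \(checks.isEmpty)")
}
```

Note that a string containing only a space is not empty under any of the three checks.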

Still, I'm curious whether the comparison with an empty string literal is bad. Does it really have any memory or performance implications? Perhaps the Swift compiler is so smart these days that it produces the same binary code for myString.isEmpty and myString == ""? My gut feeling is that the difference should be negligible or even non-existent, but I don't have proof.

I realize it's not really a practical question, but I'd be grateful if someone could share some insight into how these two alternatives work at a lower level and whether there are any differences. Thank you all in advance.


Solution

  • As a note, isEmpty is the preferred/recommended way to check whether a collection is empty, as all Collection types guarantee that isEmpty returns in O(1) (or at least this holds for the standard library collections). The equality operator makes no such guarantee, so if you're only interested in whether the collection has any elements (e.g. to launch a processing operation), then isEmpty is definitely the way to go.

    Now, to see what's happening under the hood when using isEmpty vs when comparing to an empty string, we can use the generated assembly.

    func testEmpty(_ str: String) -> Bool { str.isEmpty }
    

    results in the following assembly code:

                         _$s3CL29testEmptyySbSSF:
    0000000100002c70         push       rbp
    0000000100002c71         mov        rbp, rsp
    0000000100002c74         mov        rax, rsi
    0000000100002c77         shr        rax, 0x38
    0000000100002c7b         and        eax, 0xf
    0000000100002c7e         movabs     rcx, 0xffffffffffff
    0000000100002c88         and        rcx, rdi
    0000000100002c8b         bt         rsi, 0x3d
    0000000100002c90         cmovae     rax, rcx
    0000000100002c94         test       rax, rax
    0000000100002c97         sete       al
    0000000100002c9a         pop        rbp
    0000000100002c9b         ret        
    

    while

    func testEqual(_ str: String) -> Bool { str == "" }
    

    generates this:

                         _$s3CL29testEqualySbSSF:
    0000000100002cd0         push       rbp
    0000000100002cd1         mov        rbp, rsp
    0000000100002cd4         movabs     rcx, 0xe000000000000000
    0000000100002cde         test       rdi, rdi
    0000000100002ce1         jne        0x100002cec
    
    0000000100002ce3         cmp        rsi, rcx
    0000000100002ce6         jne        0x100002cec
    
    0000000100002ce8         mov        al, 0x1
    0000000100002cea         pop        rbp
    0000000100002ceb         ret        
    
    0000000100002cec         xor        edx, edx                                    ; XREF=_$s3CL29testEqualySbSSF+17, _$s3CL29testEqualySbSSF+22
    0000000100002cee         xor        r8d, r8d
    0000000100002cf1         pop        rbp
    0000000100002cf2         jmp        imp___stubs__$ss27_stringCompareWithSmolCheck__9expectingSbs11_StringGutsV_ADs01_G16ComparisonResultOtF
                            ; endp
    

    Both listings were generated in Release mode, with all optimizations enabled. It seems that for the isEmpty call the compiler is able to take some shortcuts, since it knows the internal structure of String.

    But we can take that away by making our functions generic:

    func testEmpty<S: StringProtocol>(_ str: S) -> Bool { str.isEmpty }
    

    produces

                         _$s3CL29testEmptyySbxSyRzlF:
    0000000100002bd0         push       rbp
    0000000100002bd1         mov        rbp, rsp
    0000000100002bd4         push       r13
    0000000100002bd6         push       rax
    0000000100002bd7         mov        rax, rsi
    0000000100002bda         mov        rcx, qword [ds:rdx+8]
    0000000100002bde         mov        rsi, qword [ds:rcx+8]
    0000000100002be2         mov        r13, rdi
    0000000100002be5         mov        rdi, rax
    0000000100002be8         call       imp___stubs__$sSl7isEmptySbvgTj
    0000000100002bed         add        rsp, 0x8
    0000000100002bf1         pop        r13
    0000000100002bf3         pop        rbp
    0000000100002bf4         ret        
                            ; endp
    

    while

    func testEqual<S: StringProtocol>(_ str: S) -> Bool { str == "" }
    

    produces

                         _$s3CL29testEqualySbxSyRzlF:
    0000000100002c00         push       rbp
    0000000100002c01         mov        rbp, rsp
    0000000100002c04         push       r14
    0000000100002c06         push       r13
    0000000100002c08         push       rbx
    0000000100002c09         sub        rsp, 0x18
    0000000100002c0d         mov        r14, rdx
    0000000100002c10         mov        r13, rsi
    0000000100002c13         mov        rbx, rdi
    0000000100002c16         mov        qword [ss:rbp+var_28], 0x0
    0000000100002c1e         movabs     rax, 0xe000000000000000
    0000000100002c28         mov        qword [ss:rbp+var_20], rax
    0000000100002c2c         call       _$sS2SSysWl
    0000000100002c31         mov        rcx, qword [ds:imp___got__$sSSN]
    0000000100002c38         lea        rsi, qword [ss:rbp+var_28]
    0000000100002c3c         mov        rdi, rbx
    0000000100002c3f         mov        rdx, r13
    0000000100002c42         mov        r8, r14
    0000000100002c45         mov        r9, rax
    0000000100002c48         call       imp___stubs__$sSysE2eeoiySbx_qd__tSyRd__lFZ
    0000000100002c4d         add        rsp, 0x18
    0000000100002c51         pop        rbx
    0000000100002c52         pop        r13
    0000000100002c54         pop        r14
    0000000100002c56         pop        rbp
    0000000100002c57         ret        
                            ; endp
    

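    The results are similar: the isEmpty version compiles to less assembly code and makes a single call to the isEmpty getter, whereas the == version first has to set up an empty string on the stack and then call the generic == operator. Less work per call suggests that isEmpty is at least as fast, although instruction count alone does not prove it.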
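
To check whether the difference is measurable in practice, a rough micro-benchmark along the following lines can be used. This is a minimal sketch, not a rigorous benchmark: the function names, the timed helper and the input data are all made up for illustration, and the numbers depend heavily on the machine and on building with optimizations enabled.

```swift
import Dispatch

// Hypothetical helper: counts the empty strings using isEmpty.
func countEmptyUsingIsEmpty(_ strings: [String]) -> Int {
    var count = 0
    for s in strings where s.isEmpty { count += 1 }
    return count
}

// Hypothetical helper: counts the empty strings by comparing with "".
func countEmptyUsingLiteral(_ strings: [String]) -> Int {
    var count = 0
    for s in strings where s == "" { count += 1 }
    return count
}

// Hypothetical helper: runs `body` once and returns its result together with
// the elapsed time in nanoseconds.
func timed(_ body: () -> Int) -> (result: Int, nanoseconds: UInt64) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = body()
    let end = DispatchTime.now().uptimeNanoseconds
    return (result, end - start)
}

// Made-up workload: every third string is empty.
let inputs: [String] = (0..<1_000_000).map { $0 % 3 == 0 ? "" : "item \($0)" }

let viaIsEmpty = timed { countEmptyUsingIsEmpty(inputs) }
let viaLiteral = timed { countEmptyUsingLiteral(inputs) }

// Both approaches must find the same strings; only the timings may differ.
precondition(viaIsEmpty.result == viaLiteral.result)
print("isEmpty: \(viaIsEmpty.nanoseconds) ns")
print("== \"\":  \(viaLiteral.nanoseconds) ns")
```

A single run like this is noisy, so it is worth repeating the measurement several times and comparing the spread rather than trusting one pair of numbers.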