When I append one string to another in Swift with the += operator, what actually happens under the hood? If I do str1 += str2, is memory allocated at the end of str1 and str2 copied into it, or is a new buffer allocated for the combined length of the two strings and both copied across? I wrote a linear approach to appending strings together and a bisecting (doubling) approach and timed the two. The bisecting approach is only about 2 times faster than the linear approach. Is there any mechanism in the append process that would account for the times of these two approaches being so similar?
import Foundation

var start = NSDate().timeIntervalSince1970
let _ = stringMe(400000000, str: "Developers! ")
var end = NSDate().timeIntervalSince1970
var duration = end - start
print("stringMe takes: \(duration)")

start = NSDate().timeIntervalSince1970
let _ = stringMe2(400000000, str: "Developers! ")
end = NSDate().timeIntervalSince1970
duration = end - start
print("stringMe2 takes: \(duration)")

// Linear approach: append str to the result n times.
func stringMe(n: Int, str: String) -> String {
    var string = ""
    for _ in 0..<n {
        string += str
    }
    return string
}

// Bisecting approach: keep doubling the string, then recurse to fill in
// whatever remains once another doubling would overshoot n.
func stringMe2(n: Int, str: String) -> String {
    var string = str
    var currentWritten = 1
    while currentWritten < n {
        if currentWritten * 2 > n {
            string += stringMe2(n - currentWritten, str: str)
            break
        }
        currentWritten *= 2
        string += string
    }
    return string
}
I don't know how much memory your computer has, but we are looking at roughly 4.8 GB of string data in the final concatenations (400,000,000 copies of a 12-character piece). At that scale heap management and paging dominate the cost, and they are less likely to cause page faults when you grow your string sequentially, for the reasons below.
Let's posit that each string concatenation allocates a new chunk of memory, copies both strings into it to produce the result, and then frees the memory of the originals so it can be reused later.
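As a rough illustration of that posited model (a toy simulation only; the Heap type, its freed list and the concat method are invented for this sketch and are not what the Swift standard library actually does):
// Toy simulation of the posited behaviour of `lhs += rhs`:
// allocate a fresh chunk for the combined length, copy both operands,
// then hand the old left-hand buffer back to the allocator.
struct Heap {
    var freed: [Int] = []                    // sizes of buffers returned to the allocator

    mutating func concat(lhsSize: Int, rhsSize: Int) -> Int {
        let resultSize = lhsSize + rhsSize   // new allocation for the combined length
        freed.append(lhsSize)                // the old lhs buffer becomes reusable
        // (for `string += str` only the old lhs buffer is released; `str` stays alive)
        return resultSize
    }
}

var heap = Heap()
print(heap.concat(3 * 12, rhsSize: 12))      // a 3-piece string += one 12-byte piece -> 48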
With sequential concatenation, after the first few appends the memory freed by the previous two operations is large enough to hold the next result, so the allocator can keep recycling it. That will not cause page faults until the string gets very close to the memory limit (if it ever does).
Exponential copying, on the other hand, never finds a reusable chunk large enough for the next result, so it keeps requesting fresh memory until it reaches the virtual-memory thrashing threshold: the point where it has to page out portions of both sources and page parts of the destination in and out.
Sequential concatenation never needs more memory than the source and the destination plus one instance of the appended piece, i.e. 2n + 1 times the piece (approx. 9.6 GB). Over the next two operations it will have released chunks of n and n + 1 pieces, which gives it enough room to hold at least one copy of the n + 3 result.
Exponential concatenation needs one copy of the source plus a destination twice as large, i.e. n + 2n in memory for the copy operation (approx. 14.4 GB). On the following operations it will have released about 3n in total, but the chunk needed for the result will be 4n; then, with 7n released, it will need 8n, and so on, never finding a large enough chunk.
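To make that progression concrete, here is a small sketch under the same posited model (sizes counted in multiples of the 12-byte piece; the step ranges are just illustrative) that compares what each step needs with what has already been freed:
// Sequential: append #k builds a k-piece string and frees the (k-1)-piece buffer,
// so before append #k the previous two appends have freed (k-2) + (k-3) pieces.
print("Sequential appends:")
for k in 3...8 {
    let needed = k
    let freedByPreviousTwo = (k - 2) + (k - 3)
    print("  append #\(k): needs \(needed) pieces, previous two appends freed \(freedByPreviousTwo)")
}

// Doubling: growing from m to 2m pieces needs a 2m-piece destination, while
// everything freed so far (1 + 2 + ... + m, including the m-piece source this
// step frees) only adds up to 2m - 1 pieces, and the largest single freed chunk is m.
print("Doubling steps:")
var m = 1
while m <= 64 {
    let needed = 2 * m
    let totalFreed = 2 * m - 1
    print("  \(m) -> \(2 * m) pieces: needs \(needed), total freed \(totalFreed), largest free chunk \(m)")
    m *= 2
}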
As the exponential copy progresses it causes more and more paging, whereas the sequential method may never even reach that threshold.
In conclusion, your numbers are too large to give a meaningful comparison of the two algorithms, because you are running into memory issues that interfere with the results.
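If you want to compare the algorithms rather than the allocator, one sanity check (same functions, just a much smaller, arbitrarily chosen n whose working set of about 12 MB comfortably fits in RAM) would be:
let smallN = 1_000_000   // ~12 MB of string data instead of ~4.8 GB

var t0 = NSDate().timeIntervalSince1970
let _ = stringMe(smallN, str: "Developers! ")
var t1 = NSDate().timeIntervalSince1970
print("stringMe(\(smallN)) takes: \(t1 - t0)")

t0 = NSDate().timeIntervalSince1970
let _ = stringMe2(smallN, str: "Developers! ")
t1 = NSDate().timeIntervalSince1970
print("stringMe2(\(smallN)) takes: \(t1 - t0)")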