
Is it perfectly correct that this computed property will work like a "pointer" in current Swift?


Have a data source singleton with

class hugestuff ..{

   var dataSources: [String: ThingWithIncrediblyLargeArrays] ..

   var currentThing: String

}

ThingWithIncrediblyLargeArrays is indeed a struct. A typical use might be enormous tables of values to display in a grid on a screen. Each ThingWithIncrediblyLargeArrays - that is to say each entry in the .dataSources dictionary - could be a different screen, say.

struct ThingWithIncrediblyLargeArrays {
   var rawUnsorted: [[String]]
   var liveSorted: [[String]]
   var swapValues: [[String]]
}

So throughout the app, you're accessing

  cell.data = hugestuff.dataSources[currentThing].unsortedRows[3489412]

and so on.

And indeed sometimes changing things ...

  hugestuff.dataSources[currentThing].searchRows =
       hugestuff.dataSources[currentThing].unsortedRows.filter{blah}

This is all fantastic but purely as sugar it would be more convenient and sturdy if

  cell.data = hugestuff.hot.unsortedRows[3489412]

  hugestuff.hot.searchRows = hugestuff.hot.unsortedRows.filter{blah}

Well now, I believe that to achieve that, if you do this

var hot: ThingWithIncrediblyLargeArrays {
    get { dataSources[currentThing] }
    set { dataSources[currentThing] = newValue }
}

it all works.

In other words, the Way to do a "pointer or macro like thing", in this situation with a dictionary in Swift, is as above.

In fact, does that safely and correctly work in the intended way? To wit,

(A)

x = hugestuff.hot.enormousArray[13]

will not pointlessly copy the array during the process and

(B)

hugestuff.hot.searchRows = .. some new or modified array

will indeed "actually change" that array, i.e., it won't just instead incorrectly change some transient copy of the array somewhere.

To repeat, in (B),

hugestuff.hot.searchRows = hugestuff.hot.other.sortwhatever

is intended to be the same as

hugestuff.dataSources["electrons"].searchRows = 
   hugestuff.dataSources["electrons"].other.sortwhatever

In other words I am hoping that hugestuff.hot.searchRows = .. in fact actually changes hugestuff.dataSources["electrons"]

Is this perfectly correct?

I mention "current" Swift because, of course, Swift dramatically changed how arrays work at one point, and, Swift uses the ingenious ("But ask on SO before you try anything tricky") copy-only-on-need approach for big arrays (or indeed any arrays).


BTW, here is a fantastically useful answer to a related question: "Is it absolutely the case that Swift >will not< deep copy a large array, when one 'guard let' the array?"


Coda, just for anyone browsing this, as well as the pure macro-like syntax sugar aspect ("it is much quicker to just type .hot"), a really useful thing is you can ..

var hot: .. {
    get { dataSources[currentThing] ?? some empty value }

If dataSources["sales"] say does not exist yet (for whatever reason: whether by design, eg, you are building them lazily, streaming, processing on other cores, whatever, or indeed just a plain programming mistake) you can handle all that in there.
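A minimal sketch of that fallback idea, for illustration only. The type and property names follow the question, but the default-initialised struct, the `final class HugeStuff` wrapper, and the "sales" key are assumptions made so the snippet is self-contained. Note that with this setter, writing through `hot` while the key is missing creates the entry.

```swift
// Hypothetical stand-in for the struct in the question, with empty defaults.
struct ThingWithIncrediblyLargeArrays {
    var unsortedRows: [[String]] = []
    var searchRows: [[String]] = []
}

final class HugeStuff {
    var dataSources: [String: ThingWithIncrediblyLargeArrays] = [:]
    var currentThing = "sales"

    var hot: ThingWithIncrediblyLargeArrays {
        // Missing key: hand back an empty value instead of crashing.
        get { dataSources[currentThing] ?? ThingWithIncrediblyLargeArrays() }
        // Store under the current key, creating the entry if it is absent.
        set { dataSources[currentThing] = newValue }
    }
}

let hugeStuff = HugeStuff()

// There is no "sales" entry yet, so the getter returns the empty fallback.
let rowCountBefore = hugeStuff.hot.unsortedRows.count

// Writing through hot stores a modified copy of the fallback under "sales".
hugeStuff.hot.searchRows = [["a", "b"]]
let entryExists = hugeStuff.dataSources["sales"] != nil
```

Whether silently creating the entry on write is desirable depends on the app; a setter that ignores writes for unknown keys would be an equally plausible choice.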


Solution

  • If ThingWithIncrediblyLargeArrays is a class, hot will trivially act like a "pointer". After all, classes are reference types, and hot will return a reference.

    From now on I will assume ThingWithIncrediblyLargeArrays is a struct.

    Array is copy-on-write. For the array storage to be copied, it has to be written to while the storage is not uniquely referenced.

    In both cases A and B, hot's getter returns a copy of ThingWithIncrediblyLargeArrays, but the arrays in the copies will share the same storage, because you are not writing to hugestuff.hot.enormousArray.
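One rough way to check the storage-sharing claim for cases A and B without Instruments is to compare the base address of the array's element buffer before and after. This is only a sketch: `storageAddress` is a hypothetical helper, and the returned pointer is used purely for comparison, never dereferenced.

```swift
struct ThingWithIncrediblyLargeArrays {
    var enormousArray = Array(repeating: 1, count: 1000)
    var searchRows = ["Row1", "Row2", "Row3"]
}

final class HugeStuff {
    var dataSources = ["Foo": ThingWithIncrediblyLargeArrays()]
    var currentThing = "Foo"

    var hot: ThingWithIncrediblyLargeArrays {
        get { dataSources[currentThing]! }
        set { dataSources[currentThing] = newValue }
    }
}

// Hypothetical helper: address of the array's element buffer.
// Only compared for equality, never dereferenced outside the closure.
func storageAddress(_ array: [Int]) -> UnsafePointer<Int>? {
    array.withUnsafeBufferPointer { $0.baseAddress }
}

let hugeStuff = HugeStuff()

// Case A: reading through hot yields an array that shares the same buffer.
let viaHot = storageAddress(hugeStuff.hot.enormousArray)
let viaDictionary = storageAddress(hugeStuff.dataSources["Foo"]!.enormousArray)
let readSharesStorage = (viaHot == viaDictionary)

// Case B: replacing searchRows through hot reaches the dictionary entry...
hugeStuff.hot.searchRows = ["Row X"]
let writeReachedDictionary = hugeStuff.dataSources["Foo"]!.searchRows == ["Row X"]

// ...and leaves the buffer of the untouched enormousArray where it was.
let bigArrayUntouched =
    storageAddress(hugeStuff.dataSources["Foo"]!.enormousArray) == viaDictionary
```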

    However, hot is not completely like a pointer. If you do:

    hugeStuff.hot.enormousArray[someIndex] = something
    

    hot returns a copy of ThingWithIncrediblyLargeArrays, which shares the array storage with dataSources[currentThing]. Then you modify the enormousArray of the copy returned by hot. At that point the array storage can no longer be shared, so the whole array storage is copied.

    You can think of this as:

    var temp = hugeStuff.hot
    temp.enormousArray[someIndex] = something
    hugeStuff.hot = temp
    

    The array storage is not uniquely referenced here: temp and hugeStuff.dataSources[currentThing] share the storage, so the write on the second line triggers the copy.
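If element-level writes like this matter, one possible workaround is to mutate the dictionary entry directly instead of going through a get/set pair. The sketch below uses a hypothetical `mutateHot` helper that is not part of the original code. The assumption, worth confirming in Instruments, is that in recent Swift versions the dictionary subscript is mutated in place, so the storage stays uniquely referenced and is not copied.

```swift
struct ThingWithIncrediblyLargeArrays {
    var enormousArray = Array(repeating: 1, count: 1000)
}

final class HugeStuff {
    var dataSources = ["Foo": ThingWithIncrediblyLargeArrays()]
    var currentThing = "Foo"

    // Hypothetical helper: passes the stored value to the closure as inout,
    // so no temporary copy of the struct is created by a getter.
    // The closure must not touch dataSources itself while it runs,
    // otherwise Swift's exclusivity checking will trap.
    func mutateHot(_ body: (inout ThingWithIncrediblyLargeArrays) -> Void) {
        // Do nothing when there is no entry for the current key.
        guard dataSources[currentThing] != nil else { return }
        body(&dataSources[currentThing]!)
    }
}

let hugeStuff = HugeStuff()

// Element-level write applied to the stored value itself.
hugeStuff.mutateHot { $0.enormousArray[0] = 42 }
let updated = hugeStuff.dataSources["Foo"]!.enormousArray[0]
```

The plain one-liner `hugeStuff.dataSources[hugeStuff.currentThing]!.enormousArray[0] = 42` expresses the same idea without a helper.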

    A good way to determine whether array storages are copied is to use the "Allocations" tool in Instruments. Try it with this code:

    struct ThingWithIncrediblyLargeArrays {
        var enormousArray = Array(repeating: 1, count: 1000)
        var searchRows = ["Row1", "Row2", "Row3"]
    }
    
    class HugeStuff {
        var dataSources = ["Foo": ThingWithIncrediblyLargeArrays()]
        var currentThing = "Foo"
        
        var hot: ThingWithIncrediblyLargeArrays {
            get { dataSources[currentThing]! }
            set { dataSources[currentThing] = newValue }
        }
    }
    
    let hugeStuff = HugeStuff()
    // run each of these loops separately:
    
    // case A
    //for i in 0..<1000 {
    //    let x = hugeStuff.hot.enormousArray[i]
    //    print(x)
    //}
    
    // case B
    //for i in 0..<1000 {
    //    hugeStuff.hot.searchRows = ["Row \(i)"]
    //    print("x")
    //}
    
    // case C
    //for i in 0..<1000 {
    //    hugeStuff.hot.enormousArray[i] = i
    //    print("x")
    //}
    

    You can filter by "array", and you will see something like this in Instruments for case C:

    [screenshot of the Allocations instrument, filtered by "array"]

    The name _TtGCs23_ContiguousArrayStorageSi_ is the mangled name for _ContiguousArrayStorage<Int>. You can demangle these using the swift demangle command.