Search code examples
c++qtforeachcopy-on-write

What is a "container detachement" in the C++11 ranged based loop over QList? Is it a performance only problem?


This question contains some proposals for working around the problem, I would like to understand more in depth that exactly the problem is:

 QList<QString> q;
 for (QString &x: q) { .. }
  1. Is it so that unless the container is declared const, Qt makes a copy of the list and then iterates over that copy? This is not among the best, but would be bearable if the list is small (say 10-20 QString's).
  2. Is it performance only problem or it can be some deeper problem? Let's assume we do not add/remove elements while the loop is running.
  3. Is the modification of the value in the loop (assuming it is a reference) something that still works or it is fundamentally broken?

Solution

  • Copy-on-write (=Implicit sharing) concept

    It is important to understand that copy-on-write (= implicit shared) classes externally behaves like "normal" classes that perform a deep copy of their data. They only postpone this (potentially) expensive copy operation as long as possible. A deep copy is made (=detaching), only if the following sequence occurs:

    • The list is implicitly shared, i.e. the object is copied by value (and at least 2 instances still exist)
    • A non-const member function is accessed on the implicitly shared object.

    Your questions

    1. Only if the container is shared (by another copy on write instance of this list), a copy of the list will be made (as a non-const member is invoked on the list object). Note that the C++ range loop is just a short hand for a normal iterator based for loop (see [1] for the exact equivalence which depends on the exactly used C++ version):

      for (QList<QString>::iterator& it = q.begin(); x != q.end(); ++it)
      {
          QString &x = *it;
          ...
      }
      

      Note that the begin method is a const member function if and only if the list q itself is declared const. If you would write it in full yourself, you should use constBegin and constEnd instead.

      So,

      QList<QString> q;
      q.resize(10);
      QList<QString>& q2 = q; // holds a reference to the same list instance. Modifying q, also modifies q2.
      for (QString &x: q) { .. }
      

      doesn't perform any copy, as list q isn't implicitly shared with another instance.

      However,

      QList<QString> q;
      q.resize(10);
      QList<QString> q2 = q; // Copy-on-write: Now q and q2 are implicitly shared. Modifying q, doesn't modify q2. Currently, no copy is made yet.
      for (QString &x: q) { .. }
      

      does make a copy of the data.

    2. This is mostly a performance issue. Only if the list contain some special type with a weird copy constructor/operator, this may be not the case, but this would probably indicate a bad design. In rare cases, you may also encounter the Implicit sharing iterator problem, by detaching (i.e. deep copy) a list when an iterator is still active.

      Therefore, it is good practice to avoid unneeded copies in all circumstances by writing:

      QList<QString> q = ...;
      for (QString &x: qAsConst(q)) { .. }
      

      or

      const QList<QString> q = ...;
      for (QString &x: q) { .. }
      
    3. Modifications in the loop aren't broken and work as expected, i.e. they behave as if the QList doesn't use implicit sharing, but performs a deep copy during a copy constructor/operator. For example,

      QList<QString> q;
      q.resize(10);
      QList<QString>& q2 = q;
      QList<QString> q3 = q;
      for (QString &x: q) {x = "TEST";}
      

      q and q2 are identical, all containing 10 times "TEST". q3 is a different list, containing 10 empty (null) strings.

    Also check the Qt documentation itself about Implicit Sharing, which is used extensively by Qt. In modern C++, this performance optimization construct could be (partially) replaced by newly introduced move concept.

    Improve understanding by inspecting source code

    Every non-const function calls detach, before actually modifying the data, f.ex. [2]:

    inline iterator begin() { detach(); return reinterpret_cast<Node *>(p.begin()); }
    inline const_iterator begin() const noexcept { return reinterpret_cast<Node *>(p.begin()); }
    inline const_iterator constBegin() const noexcept { return reinterpret_cast<Node *>(p.begin()); } 
    

    However, detach only effectively detaches/deep copy the data when the list is actually shared [3]:

    inline void detach() { if (d->ref.isShared()) detach_helper(); }
    

    and isShared is implemented as follows [4]:

    bool isShared() const noexcept
    {
        int count = atomic.loadRelaxed();
        return (count != 1) && (count != 0);
    }
    

    i.e. more than 1 copy (= another copy except for the object itself) exists.