Search code examples
c++listiteratorpush-back

List items and iterators: do either push back or iterators create copies of original data?


Though I've a lot of experience as C Embedded Programmer, I'm actually not so skilled in C++. I'm implementing a parser in which ItemsToBeFound are pushed within a list.

It happens that some of these items (let's say master items) carry more information, so that some other items (slave items) need to refer to master items to retrieve those info. This link is done at population phase, without iterators.

This is the sequence of the actions:

  1. I save the pointer to the master item, and push it to the list
  2. I set specific pointer field pReferenceItem of the slave item so that it points to the master item
  3. Later, I start the iteration through the list
  4. I save parsed information to master item
  5. When it comes to process slave item, I try accessing master item's field. But it is unchanged!
  6. The pointer to master item during iteration is different from the original one, so the pointer contained in slave item points to the original (unchanged) data

Any suggestion about some special qualifiers I could use in order to force the iteration through the original data? Any obvious mistake in the way I issue lists and/or iterators?

Please note that I'm sure about what asserted in step #6 above, because I've watched pointers values using the debugger.

Here's the simplified code showing the described issue:

#include <list>
#include <iostream>

std::list<ItemToBeFound> myList;

class ItemToBeFound
{
private:
public:
    /* ... */
    int specialDataToBeRetreived;
    ItemToBeFound *pReferenceItem;
    /* ... */

    ItemToBeFound( ItemToBeFound *ref )
    {
        pReferenceItem = ref;
        specialDataToBeRetreived = 777;
    }

    bool isMaster( void )
    {
        return ( pReferenceItem == NULL );
    }
}


void PopulateList( void )
{
    ItemToBeFound *masterItem = new ItemToBeFound( NULL /* no refItem */ );
    myList.push_back( *masterItem );

    ItemToBeFound *slaveItem = new ItemToBeFound( masterItem );
    myList.push_back( *slaveItem );
}

int main( void )
{
    // ...
    PopulateList();
    // ...

    /* Let's iterate and do something */
    std::list<ItemToBeFound>::iterator it = myList.begin();
    while( it != itemsList.end() )
    {
        if( it->isMaster() )
        {
            it->specialDataToBeRetreived = 42; /* In the real program I get the value after a parsing activity */
        }
        else /* it is a slave item */
        {
            it->specialDataToBeRetreived = it->referenceItem->specialDataToBeRetreived;

            /* But it doesn't work! */
            std::cout << "Updated special value: " << it->specialDataToBeRetreived << std::endl;
        }

        it++;
    }

    // ...

    return 0;
}

In the "slave" branch of the conditional, during iteration, I expect the specialDataToBeRetreived I set in reference object, but I keep obtaining 777 (the example value I set in the constructor):

Updated special value: 777

And the reason why I get this value is that, as shown by the debugger, the reference item in iteration phase has an address different from the original one, pointed by the slave item.


The questions

  • Why did this change occur?
  • I was expecting that the pointer to my object was just copied to the list, like in many list implementation I've seen in my life. Does the push_back method of the std::list template perform a copy of the pointed object?

Solution

  • Iterators do not copy anything. In C terms, they behave almost like pointers within array.

    However, you do copy elements here:

    void PopulateList( void )
    {
      ItemToBeFound *masterItem = new ItemToBeFound( NULL /* no refItem */ );
      myList.push_back( *masterItem ); //copy of masterItem pointee
    
      ItemToBeFound *slaveItem = new ItemToBeFound( masterItem ); //initialized with masterItem, which does not point to the object stored in the list!
      myList.push_back( *slaveItem ); //copy of slaveItem pointee, also stores masterItem, but not object from the list
    
    //slaveItem is leaked here, masterItem is only kept via slaveItem copy in the list
    }
    

    If you initialize objects like this, your slave objects never store pointers to objects in the list, but rather to the objects created with new.


    Let's go step by step through the code (I'll be a rubber debugger here):

    ItemToBeFound *masterItem = new ItemToBeFound( NULL /* no refItem */ );
    

    You allocate an object on the heap (they don't have names, but let's call it m1) and a pointer on the stack called masterItem, so derefenced masterItem is exactly m1.

      myList.push_back( *masterItem );
    

    std::list::push_back() creates a copy of the argument given to it (I ignore move semantics - they are not used here). Notice what is the argument - dereferenced masterItem or m1 as we called it before.
    The copy of the object m1 created by push_back will be called m2

    So now we have the following (with a->b meaning a points to b):

    masterItem->m1
    myList->m2
    

    Next line:

      ItemToBeFound *slaveItem = new ItemToBeFound( masterItem );
    

    Here, you initialize slaveItem with new object on the heap - s1.

    masterItem->m1
    myList->m2
    slaveItem->s1->masterItem->m1 (s1 points to masterItem, which points to m1)
    

    Last one:

    myList.push_back( *slaveItem );
    

    Again, push_back() creates a copy of its argument on the heap (this copy will be called s2)

    masterItem->m1
    slaveItem->s1->masterItem->m1 
    myList->m2
    myList->s2->masterItem->m1
    

    Eventually, you have:

    • 4 objects on the heap: m1, m2, s1 and s2 (all of type ItemToBeFound)
    • 2 objects on the stack: masterItem and slaveItem (both of pointer type)
    • one object in the global area: myList (can be treated as stack object here)
    • m2 and s2 are stored in myList and can be iterated over
    • Both s1 and s2 have pointers to m1
    • No slave object points to m2

    Later on, when you run mainFunction(), object m2 gets filled with data, but object s2 is trying to retrieve data from m1, which wasn't filled with any valueable data.


    It's hard to propose any solution without more info. One I'd use if I couldn't reorganize class structure would be to use std::list<std::shared_ptr<ItemToBeFound>> instead, which will keep your pointers safe, and only store std::weak_ptr<ItemToBeFound> referenceItem in slave objects (to avoid any circular dependecies and declare ownership clearly).

    You could have std::list<ItemToBeFound*> and manage memory manually, but that's against modern C++ spirit and should be used with great care - flow can take more branches in C++ than in C and it's more difficult to track all of them to delete everything.

    You could technically store in slave objects pointers to objects stored in the list (or even iterators to those), but that feels like a weird solution. You don't control those objects and you don't own them, so you shouldn't keep pointers to those.