c++ optimization vector dynamic-allocation

Optimizing unnecessary string copying in vector<string>


Here is minimal code that describes the problem:

struct A {
  vector<string> v;
  // ... other data and methods
};
A obj;
ifstream file("some_file.txt");
char buffer[BIG_SIZE];
while( <big loop> ) {
  file.getline(buffer, BIG_SIZE-1);
  // process buffer; which may change its size
  obj.v.push_back(buffer);  // <------- can be optimized ??
}
...

Here the string is created twice: the first time to construct the actual string object from buffer, and the second time when it is copy-constructed into the vector. Demo

The push_back() operation happens millions of times, and each time I pay for one extra allocation that is of no use to me.

Is there a way to optimize this? I am open to any suitable change. (I am not treating this as premature optimization, because push_back() is called so many times throughout the code.)


Solution

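  • Well, you get two allocations, but not necessarily two copies of the string data: one of them creates the string, while the other may store just a pointer to that same data inside the vector's element. Note that this depends on the compiler and its standard library: an implementation with reference-counted (copy-on-write) strings shares the buffer, while other compilers/settings might indeed create two full strings. With C++11, the temporary string is moved into the vector rather than copied. Look at this code for the demo.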

    One way to optimize it would be to use char* instead of string as the template parameter (don't forget to manually delete[] each pointer before destroying the vector!). This way you'll get rid of one of the allocations (the biggest one). Alternatively, just use your own implementation of vector: you'll then be able to control every aspect of memory allocation.
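
If you would rather keep vector<string>, here is a rough sketch of avoiding the extra copy. It is not from the answer above and it assumes a C++11 compiler: emplace_back constructs the string directly inside the vector from the char buffer, and std::move hands over an existing string's buffer instead of copying it. The names read_lines_emplace, read_lines_move, load_demo and kBufferSize are made up for illustration, and an in-memory stream stands in for some_file.txt.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Assumed buffer size; the question's BIG_SIZE is not specified.
constexpr std::streamsize kBufferSize = 4096;

// Variant 1 (hypothetical helper): keep the question's char buffer and let
// emplace_back build the std::string in place, so no temporary is copied.
std::size_t read_lines_emplace(std::istream& in, std::vector<std::string>& out) {
    char buffer[kBufferSize];
    std::size_t count = 0;
    // Note: a line longer than the buffer sets failbit and ends the loop.
    while (in.getline(buffer, kBufferSize)) {
        // ... process buffer here ...
        out.emplace_back(buffer);  // constructs the element from const char*
        ++count;
    }
    return count;
}

// Variant 2 (hypothetical helper): read into a std::string and move it into
// the vector, which transfers the character buffer instead of copying it.
std::size_t read_lines_move(std::istream& in, std::vector<std::string>& out) {
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {
        // ... process line here ...
        out.push_back(std::move(line));
        line.clear();  // moved-from string is valid but unspecified; reset it
        ++count;
    }
    return count;
}

// Usage sketch with made-up input in place of the real file.
std::vector<std::string> load_demo() {
    std::istringstream file("alpha\nbeta\n\ngamma");
    std::vector<std::string> v;
    v.reserve(4);  // optional: avoids regrowing the vector if the count is known
    std::size_t n = read_lines_emplace(file, v);
    assert(n == v.size());
    return v;
}
```

Either variant leaves a single string construction per line. Whether that is measurably faster than the original loop depends on the compiler and library, so it is worth profiling before and after.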