(C++) Performance traps of std::vector and how to avoid them

For most C++ developers, std::vector is the default choice among STL containers.
Thanks to its contiguous memory layout, it is highly cache-friendly and offers fast O(1) random access.

However, using std::vector without understanding its internal mechanics can lead to unexpected costs. You might face performance degradation from memory reallocations and element shifting, or worse, your program could fall into the abyss of undefined behavior due to iterator invalidation.

This post will explore the internal behaviors of std::vector and show you how to leverage them to write faster, more robust code.

1. Memory Reallocation

STL containers are designed to grow automatically to accommodate new elements, freeing the programmer from manual memory management. For std::vector and std::string, this growth is achieved through a process known as reallocation.

Reallocation involves the following four steps. (Note: capacity refers to the maximum number of elements the container can store without needing to reallocate memory.)

When an element is added to a container that is already at full capacity, a new, larger block of memory is allocated. (Typically, the capacity increases to a power of two.)
All existing elements are moved (or copied, for older C++ standards or non-movable types) from the old memory block to the new one.
The original elements in the old block are destroyed (their destructors are called).
The old memory block is deallocated.

To avoid these reallocation costs, you can call the vector::reserve member function. Calling reserve(n) ensures the container's capacity is set to hold at least n elements.

There are two primary techniques for using reserve:

If you know the exact or approximate number of elements you plan to store, call reserve upfront before you begin inserting them.

const int num_elem = 1000;
std::vector<int> vec;

// Reserve memory upfront to prevent reallocations
vec.reserve(num_elem);

for (int i = 0; i < num_elem; ++i) {
    vec.push_back(i); // No reallocations will occur in this loop
}

If the final count is unknown, you can adopt another strategy:
1. First, reserve a generous, pre-estimated maximum capacity.
2. Once all the data has been inserted, call the shrink_to_fit member function to release the excess, unused memory.

 const int max_possible_elem = 1000;
 std::vector<int> vec;

 // Reserve a generous amount of memory
 vec.reserve(max_possible_elem);

// Insert a variable number of elements (e.g., based on runtime conditions)
int elem_to_add = rand() % max_possible_elem;
for (int i = 0; i < elem_to_add; ++i) {
    vec.push_back(i);
}

vec.shrink_to_fit();

2. Element Shifting

Both std::vector and std::string store their elements in a single, contiguous chunk of dynamically allocated memory.

Consequently, when a new element is inserted or an existing one is erased, all elements following that position must be shifted forward or backward. This "shifting" operation has a time complexity of O(n), which can significantly impact your program's performance.

2-1. What about the shifting cost in std::deque?

A std::deque is typically implemented as a collection of individually allocated, fixed-size arrays (often called memory blocks or chunks). While element shifting still occurs upon insertion or erasure, it is confined to the specific memory block(s) where the operation takes place.

2-2. Erasing an element in O(1) when order doesn't matter

As we've discussed, erasing an element from the middle of a std::vector is an O(n) operation. This is because every element after the erased position must be shifted forward.

However, if you don't need to preserve the order of elements in the vector, you can erase a middle element in O(1) time with the following technique:

template<typename T>
void quick_remove_at(std::vector<T> &v, size_t idx)
{
    if (idx < v.size()){
        v[idx] = std::move(v.back());
        v.pop_back();
    }
}

2-3. How does the erase-remove idiom ensure O(n) time complexity?

When you need to delete elements from a vector that meet a specific condition, the erase-remove idiom is the canonical way to do it:

vector<int> vec {1,2,3,99,5,99,7,8,9,99};
vec.erase(std::remove(v.begin(), v.end(), 99), vec.end());

The time complexity of the erase-remove idiom is O(n). How is this possible?

The key is that std::remove doesn't actually remove any elements from the container. (Notice that std::remove is a generic algorithm that only takes iterators as parameters; it has no knowledge of the container itself. The actual removal is performed by the container's member function, vector::erase.)

std::remove works by overwriting elements to be removed with the next element that should be kept.

Let's trace its execution on the example vector. In the table below, values that need to be overwritten are marked in bold, and changes are shown from top to bottom as the operation progresses.

v[0]	v[1]	v[2]	v[3]	v[4]	v[5]	v[6]	v[7]	v[8]	[v9]
1	2	3	99	5	99	7	8	9	99
1	2	3	99	5	99	7	8	9	99
1	2	3	5	5	99	7	8	9	99
1	2	3	5	5	99	7	8	9	99
1	2	3	5	7	99	7	8	9	99
1	2	3	5	7	8	7	8	9	99
1	2	3	5	7	8	9	8	9	99
1	2	3	5	7	8	9	8	9	99

After std::remove completes its O(n) pass, the elements to be removed are now located at the end of the range. std::remove returns an iterator to the first of these "garbage" elements (the new logical end). This is the iterator that vector::erase uses to perform a final deletion.

Similarly, std::unique also operates inO(n) time. It's crucial to remember that std::unique only removes consecutive duplicate elements**.** For this reason, you must sort the elements beforehand to ensure that all identical items are grouped together before you call it.

References

C++ reference for std::vector
- en.cppreference.com/w/cpp/container/vector
C++ reference for std::unique
- https://en.cppreference.com/w/cpp/algorithm/unique
# Inside STL: The deque, design
Effective STL by Scott Meyers (Addison-Wesley Professional, 2001)
C++ 17 STL Cookbook by Jacek Galowicz (acorn+PACKT 2019)

(C++) Performance traps of std::vector and how to avoid them

1. Memory Reallocation

2. Element Shifting

2-1. What about the shifting cost in std::deque?

2-2. Erasing an element in O(1) when order doesn't matter

2-3. How does the erase-remove idiom ensure O(n) time complexity?

References

Comments

More from this blog

Unit Testing: Why It Matters

(Optimization) Lazy evaluation : Postpone until the result is actually needed

(C++) Iterator invalidation : The silent killer in C++ loops

(C++) Why should we be conservative about using std::set and std::map?

Command Palette

1. Memory Reallocation

2. Element Shifting

2-1. What about the shifting cost in std::deque?

2-2. Erasing an element in O(1) when order doesn't matter

2-3. How does the erase-remove idiom ensure O(n) time complexity?

References

Comments

More from this blog