17. Vectors in C++

1. Introduction

Vectors are similar to arrays. They can store large amounts of data of the same type. However, unlike arrays, vectors can grow and shrink in size during program execution. As you can expect based on your knowledge of arrays, this re-sizing does not come for free. We'll discuss this later. However, the idea is that the size of a vector does not need to be known in advance.

Here is an example of a problem where a vector is a convenient tool.

Problem 1. The user enters integers one by one. Input ends when the user enters -9. After entering -9, the user provides one more positive integer, q. The program must add q to each of the previously entered integers and print the updated integers in the same order they were received.

You may notice that this is the same problem as Problem 1 from the chapter on linked lists. We solved that problem using a linked list. In this chapter, we will solve it using a vector.

Before we can solve this problem, we need to learn how to use vectors in C++. Vectors are part of the C++ standard library. To use them, we must include the appropriate header file.

#include <vector>

2. Declaring and initializing

A vector is declared by specifying the type of data it will store. The declaration of a vector of long integers is done as follows:

std::vector<long> v;

The above declaration creates an empty vector. The computer is asked to give us a vector that can store values of type long, and we agree to call it v.

It is also possible to declare a vector with a specific number of elements. The following declaration creates a vector of 10 elements of type long:

std::vector<long> v(10);

The elements of the vector are indexed starting from 0. The first element is v[0], the second is v[1], and the last is v[9]. This is the same indexing convention as for arrays.

If we want all elements to be initialized to a specific value, say 0, we can write:

std::vector<long> v(10, 0);

More generally, the declaration std::vector<long> v(n, x); creates a vector of n elements, each initialized to the value x.

Finally, a vector can be initialized from a list of values:

std::vector<long> v = {3, 1, 4, 1, 5, 9};

This creates a vector of 6 elements with the given values. The compiler flag -std=c++11 is required for this syntax, just as it was required for nullptr in the chapter on linked lists.

3. Accessing elements

The elements of a vector can be accessed using the [] operator, in the same way as arrays. The following code declares a vector of 5 elements, sets the first element to 10, and prints it.

std::vector<long> v(5);
v[0] = 10;
std::cout << v[0] << std::endl;

The [] operator does not check whether the index is within the valid range. Accessing an element outside the valid range using [] results in undefined behavior, in the same way that accessing an invalid memory address using a pointer can cause a program to crash or behave unexpectedly.

The .at() method provides an alternative way to access elements. Unlike [], the method .at() checks whether the index is within the valid range. If the index is out of range, the program will terminate with an error message rather than behaving unpredictably.

std::vector<long> v(5);
v.at(0) = 10;
std::cout << v.at(0) << std::endl;

For a beginner, .at() is the safer choice. However, [] is faster, since it skips the range check. We use [] when we are confident that the index is within range, and .at() when we are not.

4. Size, and adding and removing elements

The number of elements currently stored in a vector is called its size. The size of a vector can be obtained using the .size() method.

std::vector<long> v = {3, 1, 4, 1, 5, 9};
std::cout << v.size() << std::endl;

The above code will print 6, since the vector has 6 elements.

A new element can be added to the end of a vector using the .push_back() method. The following code creates an empty vector and adds three elements to it.

std::vector<long> v;
v.push_back(3);
v.push_back(1);
v.push_back(4);
std::cout << v.size() << std::endl;

After the three calls to .push_back(), the vector contains the elements 3, 1, and 4, in that order. The call to .size() will print 3.

The last element of a vector can be removed using the .pop_back() method.

std::vector<long> v = {3, 1, 4, 1, 5, 9};
v.pop_back();
std::cout << v.size() << std::endl;

After the call to .pop_back(), the vector contains 5 elements: 3, 1, 4, 1, and 5. The element 9 has been removed.

We can now solve the problem from the introduction. The solution reads integers from the user using .push_back(), and then prints the updated values.

// vectors.cpp
// compile with
// c++ vectors.cpp -o vectors -std=c++11
// execute with
// ./vectors
#include <iostream>
#include <vector>
int main(){
   std::vector<long> v;
   long userInput;
   std::cin >> userInput;
   while(userInput != -9){
      v.push_back(userInput);
      std::cin >> userInput;
   }
   long q;
   std::cin >> q;
   for(long i = 0; i < v.size(); ++i){
      std::cout << v[i] + q << " ";
   }
   std::cout << std::endl;
   return 0;
}

Notice that unlike the linked list solution, there is no need to allocate or free memory manually. The vector takes care of this on our behalf.

5. Iterating over a vector

In the solution above, we used an index-based loop to iterate over the vector. This is the most straightforward way to iterate over a vector, and it will be the most common approach in this course.

std::vector<long> v = {3, 1, 4, 1, 5, 9};
for(long i = 0; i < v.size(); ++i){
   std::cout << v[i] << " ";
}

C++ also provides a more compact syntax for iterating over a vector, called a range-based for loop. The following code prints the same output as the code above.

std::vector<long> v = {3, 1, 4, 1, 5, 9};
for(long x : v){
   std::cout << x << " ";
}

The range-based for loop declares a variable x that takes the value of each element of the vector in order. The loop body is executed once for each element. Note that x is a copy of the element, not the element itself. Modifying x inside the loop does not modify the vector.

Remark: You may encounter code that iterates over a vector using an iterator, with syntax such as v.begin() and v.end(). Iterators are a more general concept that applies to all standard library containers, including the sets and maps from earlier chapters. We will not use iterators directly in this course, but it is useful to recognize the syntax when reading code written by others.

6. Vectors and memory

A vector is not magic. Internally, a vector stores its elements in a block of memory that is allocated on the heap, in the same way that we have been allocating memory manually using new. When we call .push_back() and the vector needs more space, it requests a new, larger block of memory from the computer, copies all existing elements to the new block, and releases the old block. This is the re-sizing cost that we mentioned in the introduction.

The vector keeps track of two numbers: its size, which is the number of elements currently stored, and its capacity, which is the size of the memory block that has been allocated. The capacity is always at least as large as the size. When the capacity is exceeded, the vector increases its capacity by some amount. The exact policy depends on the implementation. In 2026, when these notes were last updated, the most common compilers gcc and clang would double the capacity when .push_back() is called on a vector whose size equals its capacity. Since the capacity grows geometrically, most calls to .push_back() are fast, but occasionally one call will trigger a re-allocation. The current capacity of a vector can be obtained using the .capacity() method.

std::vector<long> v;
for(long i = 0; i < 10; ++i){
   v.push_back(i);
   std::cout << "size=" << v.size();
   std::cout << " capacity=" << v.capacity() << "\n";
}

We encourage the reader to compile and run the above code and observe how the capacity grows in steps while the size grows by one at each iteration.

When a vector goes out of scope, it automatically releases the memory it has allocated. The programmer has no way to release it manually, nor any reason to. This is the main practical advantage of vectors over manual memory management: the programmer does not need to remember to free the memory, and memory leaks of the kind we discussed in the chapter on pointers cannot occur.

7. Comparing vectors

Vectors support the comparison operator <. Two vectors are compared lexicographically: the first elements are compared first, and if they are equal, the second elements are compared, and so on. If all elements of the shorter vector are equal to the corresponding elements of the longer vector, then the shorter vector is considered smaller.

std::vector<long> a = {1, 2, 3};
std::vector<long> b = {1, 2, 4};
if(a < b){
   std::cout << "a is smaller\n";
}

The above code will print a is smaller, since the first two elements of a and b are equal, and the third element of a is smaller than the third element of b.

This means that vectors can be used as keys in a std::map and ssm::map or stored in a std::set and ssm::set, since both require the < operator to be defined on their elements. This turns out to be useful in dynamic programming, where we sometimes need to map a vector of parameters to a previously computed result.

8. Practice problems

Problem 2.

In this problem, we will implement a building block for a primitive database where each record consists of a key, which is a vector of integers of fixed dimension, and a value, which is a string. The distinguishing feature of this database is that it supports efficient ranking of records by any chosen coordinate of the key. This first problem asks you to implement a struct that will make this possible.

Create appropriate structs and methods that solve the problem presented below. You must use the keyword const whenever possible.

Implement a struct FlaggedVector that wraps a std::vector<long> and contains a distinguished coordinate called flag. The struct should support the following:

  • (a) operator[] (both const and non-const versions) that provides access to the elements of the wrapped vector.
  • (b) size that returns the size of the wrapped vector.
  • (c) resize that resizes the wrapped vector.
  • (d) operator< that compares two FlaggedVector objects. The comparison should sort primarily by the flag-th coordinate, with full lexicographic order as a tiebreaker. If flag is out of range, fall back to lexicographic order directly.

Problem 3.

This problem continues the development of the primitive database introduced in Problem 2.

Create appropriate structs and methods that solve the problem presented below. You must use the keyword const whenever possible.

Implement a struct Record that represents a single entry in the database. It should contain:

  • (a) A key of type std::vector<long>.
  • (b) A value of type std::string.
  • (c) operator< that compares two Record objects lexicographically by key.

Problem 4.

This problem continues the development of the primitive database introduced in Problems 2 and 3. We now implement a basic database without ranking support.

When a search operation fails to find the requested key, it signals failure by returning \(-1\) as the first component of the returned pair.

Create a class MultiRankDB with the following private attributes:

long dim;
ssm::set<Record> recordStorage;

Your class must contain the following methods:

  • (a) format with one argument m of type long. The method resets the database and sets the key dimension to m. If m is negative, the call is ignored. This operation is destructive — all existing records are erased regardless of the previous value of dim.
  • (b) dimension should return the current value of dim.
  • (c) size should return the number of records currently stored.
  • (d) insert with two arguments: key of type const std::vector<long>& and text of type const std::string&. The method inserts a record with the given key and text. If a record with that key already exists, its text is updated. If key.size() != dim, the call is ignored. The cost should be \(O(\log n)\), where \(n\) is the number of records currently stored.
  • (e) erase with one argument key of type const std::vector<long>&. The method removes the record with the given key, if it exists. The cost should be \(O(\log n)\).
  • (f) find with one argument key of type const std::vector<long>&. The method returns a std::pair<long, std::string> where the first component is the rank of the record in recordStorage and the second is its text. If the key is not found, the first component is \(-1\). The cost should be \(O(\log n)\).

Problem 5.

This problem extends the class MultiRankDB from Problem 4 by adding support for ranking records by a single chosen coordinate. When a search operation fails to find the requested key, it signals failure by returning \(-1\) as the first component of the returned pair. When sortAndFind fails, it returns the vector <-1>, by which we mean a vector of length 1 whose single element is \(-1\). This convention is unambiguous because a valid key always has length dim.

Add the following private attribute to MultiRankDB:

std::vector<ssm::set<FlaggedVector>> keyRankings;

The attribute keyRankings is a vector of length dim, where the \(k\)-th component stores all keys sorted primarily by their \(k\)-th coordinate, using FlaggedVector with flag set to \(k\). Update format, insert, and erase accordingly to maintain keyRankings, and add the following methods:

  • (a) findRankK with two arguments: key of type const std::vector<long>& and k of type long. The method returns a std::pair<long, std::string> where the first component is the rank of the record with respect to coordinate \(k\), and the second is its text. If the key is not found or \(k\) is out of range, the first component is \(-1\). The cost should be \(O(\log n)\).
  • (b) sortAndFind with two arguments k and j of type long. The method returns a std::pair<std::vector<long>, std::string> where the first component is the key of the \(j\)-th record in the \(k\)-coordinate ordering, and the second is its text. If \(k\) is outside \([0, \mathtt{dimension}()-1]\) or \(j\) is outside \([0, \mathtt{size}()-1]\), the first component is <-1>.

Problem 6.

This problem extends the class MultiRankDB from Problem 5 by adding support for retrieving the rankings of a record with respect to all coordinates simultaneously. When the key is not found, the method signals failure by returning the vector <-1>, by which we mean a vector of length 1 whose single element is \(-1\). This convention is unambiguous because a valid ranking vector always has length dim.

Add the following method to MultiRankDB:

  • (a) findAllRanks with one argument key of type const std::vector<long>&. The method returns a std::pair<std::vector<long>, std::string> where the first component is a vector of length dim containing the rank of the record with respect to each coordinate, and the second is its text. If the key is not found, the first component is <-1>. The cost should be \(O(\mathtt{dim} \cdot \log n)\).

Problem 7.

This problem concludes the development of the primitive database built in Problems 2--6.

Write a main function that tests and exercises all aspects of the database, including formatting, inserting, erasing, and all search operations: find, findRankK, findAllRanks, and sortAndFind.

Problem 8. The first two elements of user input are positive integers m and n. After m and n are received from the input, the user provides a positive integer k that is strictly smaller than \(n\) and a vector called forbidden vector whose length is \(k\) and whose all terms are from the set \(\{0\), \(1\), \(\dots\), \(m-1\}\). Create a recursive algorithm (a dynamic programming algorithm will be asked in the next problem) that counts all sequences of length \(n\) whose elements are from the set \(\{0\), \(1\), \(\dots\), \(m-1\}\) and that do not contain the forbidden vector as a contiguous subarray.

Problem 9. Use memoization to improve the efficiency of the solution of problem 8.