Sets and Maps Simplified

1. Introduction

Sets and maps are used for big quantities of data. The fundamental operations on sets and maps are very fast, even when their sizes are large. If \(N\) denotes the total number of elements, then then the fundamental operations have complexity \(O(\log N)\).

1.1. ssm vs std

This document will introduce simplified sets and maps that are called ssm::set and ssm::map. The standard C++ has data structures std::set and std::map that will also be covered in these notes. However, they are more difficult to use and will be introduced after ssm::set and ssm::map.

Simplified sets and maps are easier to use and easier to learn. Simplified sets and maps also have an additional function that standard sets and maps don't have. This function allows us to search for the \(k\)-th smallest element. The search of the \(k\)-th smallest element has complexity \(O(\log N)\).

The simplified sets and maps are not part of the standard C++. They are open source, however, and you can download the necessary code from this website.

2. Simple set (ssm::set)

We will first learn how to use ssm::set.

2.1. Download

Create the file simple_set_and_map.cpp on your own computer. In the new file, you should put the code from the following link:

Download link: ssm set and map

2.2. Operations on ssm::set

In the beginning of our program, we need to add the directive:

#include "simple_set_and_map.cpp"

2.2.1. Declaration

A set of integers myNums can be declared in the function main with the command

ssm::set<long> myNums;

2.2.2. Insert

The elements 17, -19, 5, 11, 9, and 51 are inserted with the commands

myNums.insert(17); myNums.insert(-19); myNums.insert(5);
myNums.insert(11); myNums.insert(9); myNums.insert(51);

2.2.3. Size

The total number of elements is obtained with the method size().

std::cout<<myNums.size()<<"\n";

Remark. If you try to insert an element that is already in the set, then nothing will happen. If we insert 17 again, and try to print the size of the set, we will see that the size remains 6.

myNums.insert(17); std::cout<<myNums.size()<<"\n";

2.2.4. Erase

The command myNums.erase(-19); will remove the element -19 from the set. If the element was not in the set, then the method erase() would do nothing. The following code will remove the element -19 and print the new size, which is 5.

myNums.erase(-19); std::cout<<myNums.size()<<"\n";

2.2.5. \(k\)-th smallest element

The command std::cout<<myNums[0]<<"\n" prints the smallest element of the set. The element myNums[1] is the second smallest, etc. Since our set has 5 elements, the largest element is myNums[4] and the second largest is myNums[3].

long secondLargest=myNums[3]; std::cout<<secondLargest<<"\n";

2.2.6. Printing all elements

All elements of the set can be printed with the following loop.

for(long i=0;i<myNums.size();++i){
   std::cout<<myNums[i]<<" ";
}
std::cout<<"\n";

2.2.7. Search

If x is an integer, then myNums.find(x) returns the number of elements in the set that are strictly smaller than x. In other, words myNums.find(x) returns the position of the element x. In the case that x is not in the set, then myNums.find(x) returns -1.

long x,i;
x=17; i=myNums.find(x);
std::cout<<"The index of "<<x;
std::cout<<" is "<<i<<"\n";
x=27; i=myNums.find(x);
std::cout<<"The index of "<<x;
std::cout<<" is "<<i<<"\n";

2.2.8. Program with all commands

The code below summarizes all the function that we introduced.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::set<long> myNums;
   myNums.insert(17); myNums.insert(-19); myNums.insert(5);
   myNums.insert(11); myNums.insert(9); myNums.insert(51);
   std::cout<<myNums.size()<<"\n";
   myNums.insert(17); std::cout<<myNums.size()<<"\n";
   myNums.erase(-19); std::cout<<myNums.size()<<"\n";
   long secondLargest=myNums[3]; std::cout<<secondLargest<<"\n";
   for(long i=0;i<myNums.size();++i){
      std::cout<<myNums[i]<<" ";
   }
   std::cout<<"\n";
   long x,i;
   x=17; i=myNums.find(x);
   std::cout<<"The index of "<<x;
   std::cout<<" is "<<i<<"\n";
   x=27; i=myNums.find(x);
   std::cout<<"The index of "<<x;
   std::cout<<" is "<<i<<"\n";
   return 0;
}

2.3. Comparison operator

The elements of the sets do not have to be integers. We can declare a set of strings with the command

ssm::set<std::string> myWords;

long and std::string can be replaced by other classes in the declaration of ssm::set. We can have the declaration

ssm::set<CustomClass> mySet;

as long as the class or structure CustomClass has the comparison operator called operator< whose declaration is

int operator<(const CustomClass& ) const;

The comparison operator is necessary because the set is implemented using balanced binary search trees. The implementation of the tree would is possible only if the comparison operator exists.

2.4. Practice problems

Problem 1.

The user input consists of positive integers. A negative number or 0 is the sign that the input is over. Create the program that prints the total number of different integers that the user has entered. Your code should replace the text // ??? // below.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::set<long> allNumbers;
   long x;
   std::cin>>x;
   while(x>0){
      allNumbers.insert(x);
      std::cin>>x;
   }
   // ??? //
   return 0;
}

Problem 2.

The user input starts with red numbers. The red numbers are positive integers. When the user inserts 0 or a negative number, the user signals that the input of the red numbers is over. After the red numbers, the user inserts the green numbers. The green numbers are also positive integers. A negative number or 0 is the signal that the input of the green numbers is over. Create a C++ program that counts the total number of red numbers that are not green.

Problem 3.

The input consists of an array of words that may contain repetitions. The word endOfInput denotes the end of input array and is not to be considered a member of the array. The array is followed by a single word v. Write a program whose output is Yes or No depending on whether \(v\) is a member of the input array.

Your code should replace the text // ??? // below.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::set<std::string> allWords;
   std::string wordFromUser;
   std::cin>>wordFromUser;
   while(wordFromUser!="endOfInput"){
      allWords.insert(wordFromUser);
      std::cin>>wordFromUser;
   }
   std::string v;
   std::cin>>v;
   // ??? //
   return 0;
}

Problem 4.

The user input consists of positive integers. A negative number or \(0\) is a sign that the input is over. Create the program that reads the numbers from the user input and counts the total number of elements from the input that appear odd number of times.

Problem 5.

The user input consists of positive integers. A negative number or \(0\) is a sign that the input is over. Create the program that reads the numbers from the user input and prints all elements from the input that appear odd number of times.

3. Simple map (ssm::map)

Map is another data structure based on balanced search trees. The elements of maps are pairs. The first component is key and the second component is value. Keys must be unique. Keys and values do not have to be of the same data type. The datatype for keys must have the comparison operator called operator<.

In order to use simple maps, we need to add the following line to the beginning of our program.

#include "simple_set_and_map.cpp"

3.1. Operations (ssm::map)

3.1.1. Declaration

We will illustrate the maps on one example. The keys of the map will be names of the mountains. The values will be the corresponding heights. We declare the map mHeights in the function main() with the command

ssm::map<std::string,double> mHeights;

3.1.2. Insert

We will insert several elements with the commands

mHeights.insert("Mont Blanc", 4807.81); mHeights.insert("Olympus Mons", 21900); 
mHeights.insert("Mount Everest", 8848.86); mHeights.insert("Mauna Loa", 4169); 
mHeights.insert("Denali", 6190); mHeights.insert("Aconcagua", 6960.8);

3.1.3. Size

The total number of elements is obtained with the method size().

std::cout<<mHeights.size()<<"\n";

Remark. Nothing will happen if we try to insert an element that has the same key as an element that is already in the map. If we try to insert ("Aconcagua", 5000), then the old height 6960.8 will remain. The size remains 6.

mHeights.insert("Aconcagua", 5000); std::cout<<mHeights.size()<<"\n";

3.1.4. Erase

The command mHeights.erase("Mont Blanc"); will remove the element ("Mont Blanc", 4807.81) from the mapt. If the element was not in the set, then the method erase() would do nothing. The following code will remove the element ("Mont Blanc", 4807.81) and print the new size, which will become 5.

mHeights.erase("Mont Blanc"); std::cout<<mHeights.size()<<"\n";

3.1.5. \(k\)-th smallest element

For an integer \(k\in\{0\), \(1\), \(\dots\), mHeights.size()\(-1\}\), the function call mHeights[k] returns a pair. The first component is the key and the second component is the value of the \(k\)-th smallest element. The \(k\)-th smallest element is defined as the element for which there are exactly \(k\) elements with smaller keys. We can store the value of the method mHeights[k] in a variable of type std::pair<std::string,double>. This is an example of how the method is used.

std::pair<std::string,double> pos3=mHeights[3]; 
std::cout<<pos3.first<<" "<<pos3.second<<"\n";

3.1.6. Printing all elements

All elements of the map can be printed in alphabetical order with the following loop.

for(long i=0;i<mHeights.size();++i){
   std::cout<<mHeights[i].first<<" "<<mHeights[i].second<<"\n";
}
std::cout<<"\n";

The previous loop has one inefficiency. The method mHeights[i] has complexity \(O(\log N)\). Although, this is not a huge number, there is no need to call the method twice. Each call returns both the key and the value. Yet, in the previous listing we called the method twice. The first time we read the key, the second time we read the value. A more efficient program would call the method only once and remember both key and value. This is the improved program.

for(long i=0;i<mHeights.size();++i){
   std::pair<std::string,double> kvPair=mHeights[i];
   std::cout<<kvPair.first<<" "<<kvPair.second<<"\n";
} 

3.1.7. Search

If x is a string, then mHeights.find(x) returns the number of elements in the map whose keys are strictly smaller than x. In other, words mHeights.find(x) returns the position of the element x. In the case that x is not in the map, then mHeights.find(x) returns -1.

std::string x; long i;
x="Olympus Mons"; i=mHeights.find(x);
std::cout<<"The index of "<<x;
std::cout<<" is "<<i<<"\n";
x="Kilimanjaro"; i=mHeights.find(x);
std::cout<<"The index of "<<x;
std::cout<<" is "<<i<<"\n";

If we wanted to search for the mountain and print its height, then there are two ways: An inefficient way and an efficient way.

std::cout<<"Inefficient search for the value\n";
x="Olympus Mons"; i=mHeights.find(x);
std::cout<<"The height of "<<x<<" is ";
std::cout<< mHeights[i].second<<"\n";

The previous code called two methods of complexity \(O(n\log n)\). The command i=mHeights.find(x); called a method of complexity \(O(n\log n)\). The second method of the same complexity was called with mHeights[i].second. However, while executing the method find(x), the program already had an easy way to see the height of the mountain. The problem is, the return value of the method find(x) contained only the index, i.e. the position of the mountain x in the map. There is an alternative to the method find(). The alternative method is called findIKV, which stands for Index, Key, Value. This alternative method will return the index (just as the regular find), as well as the key and the value of the object. The return value has 3 components. These three components are stored using the data structure std::pair. The pair has 2 components, but the function somehow fit 3 values. Well, the cost is that the pair is ugly. The first component of the return pair is the index, the second component is another pair. This second pair has the key and the value. This is the improved code.

std::cout<<"Efficient search for the value\n";
x="Olympus Mons"; 
std::pair<long,std::pair<std::string,double> > ikv;
ikv=mHeights.findIKV(x);
std::cout<<"The index of "<<x<<" is ";
std::cout<< ikv.first<<"\n";
std::cout<<"The key of "<<x<<" is ";
std::cout<<ikv.second.first<<"\n";
std::cout<<"The value of "<<x<<" is ";
std::cout<<ikv.second.second<<"\n";

3.1.8. Program with all commands

The code below summarizes all the function that we introduced.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::map<std::string,double> mHeights;
   mHeights.insert("Mont Blanc", 4807.81); mHeights.insert("Olympus Mons", 21900); 
   mHeights.insert("Mount Everest", 8848.86); mHeights.insert("Mauna Loa", 4169); 
   mHeights.insert("Denali", 6190); mHeights.insert("Aconcagua", 6960.8);
   std::cout<<mHeights.size()<<"\n";
   mHeights.insert("Aconcagua", 5000); std::cout<<mHeights.size()<<"\n";
   mHeights.erase("Mont Blanc"); std::cout<<mHeights.size()<<"\n";
   std::pair<std::string,double> pos3=mHeights[3]; 
   std::cout<<pos3.first<<" "<<pos3.second<<"\n";
   for(long i=0;i<mHeights.size();++i){
      std::pair<std::string,double> kvPair=mHeights[i];
      std::cout<<kvPair.first<<" "<<kvPair.second<<"\n";
   } 
   std::string x; long i;
   x="Olympus Mons"; i=mHeights.find(x);
   std::cout<<"The index of "<<x;
   std::cout<<" is "<<i<<"\n";
   x="Kilimanjaro"; i=mHeights.find(x);
   std::cout<<"The index of "<<x;
   std::cout<<" is "<<i<<"\n";
   std::cout<<"Inefficient search for the value\n";
   x="Olympus Mons"; i=mHeights.find(x);
   std::cout<<"The height of "<<x<<" is ";
   std::cout<< mHeights[i].second<<"\n";
   std::cout<<"Efficient search for the value\n";
   x="Olympus Mons"; 
   std::pair<long,std::pair<std::string,double> > ikv;
   ikv=mHeights.findIKV(x);
   std::cout<<"The index of "<<x<<" is ";
   std::cout<< ikv.first<<"\n";
   std::cout<<"The key of "<<x<<" is ";
   std::cout<<ikv.second.first<<"\n";
   std::cout<<"The value of "<<x<<" is ";
   std::cout<<ikv.second.second<<"\n";
   return 0;
}

3.2. Practice problems

Problem 6.

Each item that bakery sells has name of type std::string and price of type double. The user inputs the names and prices. The input ends when the user provides the word endOfInput instead of a name for an item. Create a program that checks whether chocolateDonut is in the map. If it is, then the program should print its price. If chocolateDonut is not in the map, then the program should print Not found.

Your code should replace the text // ??? // below.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::map<std::string, double> bakery;
   std::string itemFromUser;
   double priceFromUser;
   std::cin>>itemFromUser;
   while(itemFromUser!="endOfInput"){
      std::cin>>priceFromUser;
      bakery.insert(itemFromUser,priceFromUser);
      std::cin>>itemFromUser;
   }
   // ??? //
   return 0;
}

Problem 7.

Each item that bakery sells has name of type std::string and price of type double. The user inputs the names and prices. The input ends when the user provides the word endOfInput instead of a name for an item. After the items are inserted in the bakery, the user inserts a shopping list. The shopping list is an array of items that ends with the word endOfShoppingList. Create a program that prints the total price that someone has to pay for all the items on the shopping list. If any of the items on the shopping list is not in the bakery, then the program should print the number -1 instead of the total price.

Your code should replace the text // ??? // below.

#include<iostream>
#include "simple_set_and_map.cpp"
int main(){
   ssm::map<std::string, double> bakery;
   std::string itemFromUser;
   double priceFromUser;
   std::cin>>itemFromUser;
   while(itemFromUser!="endOfInput"){
      std::cin>>priceFromUser;
      bakery.insert(itemFromUser,priceFromUser);
      std::cin>>itemFromUser;
   }
   // ??? //
   return 0;
}