Dynamic Programming

1. Introduction

Dynamic programming is a technique that improves the speed of certain recursive algorithms. Recursive algorithms often have exponential complexity. If \(N\) is the input value, the number of operations in a typical recursive algorithm is \(2^N\). Even when \(N\) is a reasonable number, the number \(2^N\) is not.

There are some good news. A lot of recursive algorithms are making the very same call multiple times. Actually, they are making the same calls huge number of times. The idea behind the dynamic programming is to prevent the repeat evaluation of the same functions on the same sets of data.

There are two approaches: memoization and tabulation.

Memoization uses a data structure that saves the results from all previous calls of the function. Then before a function is executed, we first check our database. If the database contains the result, then we just read the database and avoid calling the function. Memoization is achieved by first writing the recursive function, and then modifying the recursive function. The recursion is modified by adding the map in which we store the values of previous successful calls to the function.

Tabulation is an improvement to memoization. The algorithm is re-written to completely avoid recursion. Recursion starts with target value of the parameter and goes down to smaller values. The tabulation reverses this direction. We start by evaluating the function on small values, then store the results. We then increase the values on which the function is evaluated.

When the problems are easy, tabulation is actually more natural then memoization. With harder problems, the recursive algorithms are more elegant. When problems are hard, then recursion offers easier solutions.

2. Knapsack problem

2.1. Formulation of the problem

The pirates have found an island full of valuable items. They can't take all of them. The maximal weight that they can carry is \(M\). The items have values and weights. The pirates need an algorithm to decide which items should be chosen. They want to take the most valuable load whose total weight is smaller than or equal to \(M\). This is the precise formulation of the problem that specifies the input values and the constraints.

Problem 1. The user input consists of two positive integers \(n\) and \(M\) and two sequences \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\) and \(w[0]\), \(w[1]\), \(\dots\), \(w[n-1]\) of positive integers. The numbers \(w[0]\), \(w[1]\), \(\dots\), \(w[n-1]\) are the weights of the items \(0\), \(1\), \(\dots\), \(n-1\). The numbers \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\) are the values of the items. The number \(M\) is the maximal weight that can fit in a bag.
  • (a) Create a program that calculates the largest value that can fit in the bag.
  • (b) Create a program that prints the items that need to be selected to achieve this largest value.

2.2. Examples of input values

2.2.1. Example of sequences with 3 terms. Don't be greedy!

In our first example, the total number of items is 3. The maximal weight that can fit in the bag is \(M=20\). The values of the items and their weights are given in the following table:

\[\begin{array}{|c|c|c|c|}\hline \mbox{item}& 0 & 1 & 2 \\ \hline x& 30 & 16 & 17 \\ \hline w& 15 & 10 & 10 \\ \hline\end{array}\]

The input in our program will look like this:

3 20
30 16 17 
15 10 10

The maximal value that can fit in the bag is 33. This value is obtained when we skip the item 0 and take the items 1 and 2. Their values are \(x[1]=16\) and \(x[2]=17\). The total value is \(16+17=33\). The total weight of these two items is \(w[1]+w[2]=20\). This weight fits in the bag.

Notice that a greedy approach would give a wrong solution. If you are greedy and analyze items individually, then you will be impressed with the item 0. Its very high value 30 is hard to ignore. In addition, the ratio \(\frac{\mbox{value}}{\mbox{weight}}\) is 2. It is better than the same ratios for the items 1 and 2, which are 1.6 and 1.7. Nevertheless, the item 0 should be avoided. When the capacity of the bag is 20, it is best to save the space for items 1 and 2. In this situation, the items 1 and 2 together are better than the item 0.

2.2.2. Example of sequences with 8 terms

Our next example has 8 items. The maximal weight that can fit in the bag is \(M=25\). The values of the items and their weights are given in the following table:

\[\begin{array}{|c|c|c|c|c|c|c|c|c|}\hline \mbox{item} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7\\ \hline x& 1 & 50 & 10 & 22 & 1 & 20 & 20& 1\\ \hline w& 2 & 20 & 4 & 6& 6& 12 & 6 &2\\ \hline\end{array}\]

The input in our program will look like this:

8 25
1 50 10 22  1 20 20  1
2 20  4  6  6 12  6  2

The maximal value that can fit in the bag is 62. This value is obtained when we take the items 3, 5, and 6. Their values are \(x[3]=22\), \(x[5]=20\), \(x[6]=20\). The total value is \(22+20+20=62\). The total weight of these three items is \(w[3]+w[5]+w[6]=24\) and this weight fits in the bag.

2.2.3. Example of sequences with 10 terms

Here is another example. Let us consider the following input values:

10 90
15 15 15 25 25 35 45 45 45 50
20 20 20 10 10 10 30 30 30 40

Let us put the above input values in a table.

\[\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|}\hline \mbox{item}& 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7& 8 & 9\\ \hline x& 15 & 15 & 15 & 25& 25& 35 & 45& 45 & 45 & 50\\ \hline w& 20& 20 & 20 & 10 & 10 & 10 & 30 & 30 & 30 & 40\\ \hline\end{array}\]

The sequences are of length \(n=10\). The maximal weight is \(M=90\). The best value is 175. There are multiple ways to pick items whose value is 175. The items 3, 4, and 5 should be picked. In addition to these three items, we should take any two out of the items \(\{6,7,8\}\).

2.3. Recursive solution to part (a)

We will first focus on part (a) of the problem. We will not determine the sequence of items that need to be taken. We will only determine the maximal value that fits in the backpack. The solution is easier when we do not have to print the best sequence. We will later improve the solution. The improvement will give us the sequence in addition to the best value.

Therefore, after this first step, when we ran our program and insert the input values from 2.2.1, 2.2.2, and 2.2.3, we will only receive the outputs 33, 62, and 175.

We will create the function called bestValue. Its input will be two pointers x and w that will contain the addresses of the beginning memory locations of the sequences x and w. In addition, the input will contain the integers k, and L. The function will return the answer to the following problem:

Result of the function bestValue: The maximal value that fits in a bag of capacity L if the only items allowed are: 0, 1, ..., k-1.

Clearly, when we pass the values k=n and L=M, we will get the answer to the problem (a).

But n and M are not the only values that we will be giving to the function. The function will make recursive calls to itself. There will be a lot of recursive calls. In these recursive calls, the arguments k and L will take a diverse range of values that will have all sorts of combinations of numbers that are smaller than n and M.

2.3.1. Description of the algorithm

Once we receive the values k and L we will focus on only one out of the many decisions that we have to make. We will decide which one out of the two choices we are going to make.

  • Choice 1. Take the last item. The last item has the k-1. If it fits in the bag, then we are going to take it.
  • Choice 2. Skip the last item, whose name is k-1.

Observe that we must make either of the choices 1 or 2. Sometimes, the choice 1 is not available. This happens when w[k-1]>L. Otherwise, if w[k-1]<=L, then we have a choice to make.

Then we make recursive calls. If the first choice is made, then the recursive call will have the arguments k-1 and L-w[k-1]. The recursive call will return the best value that can fit in the bag. We will label this result by b1.

If the second choice is made, then the recursive call will have the arguments k-1 and L. The return value will be denoted by b2.

Once we receive the results b1 and b2, we then have to decide which is better. We need to add x[k-1] to b1, because if we made the choice 1, then the item k-1 will be added to the bag. The value b1 does not include the value of this last item. Hence, the function returns the maximum of {b1+x[k-1], b2}.

Of course, we need to think of basic cases when recursive calls should not be made. They are in the situations when L is 0 or a negative number, and when n is 0.

2.3.2. Implementation

Problem 2. Create the recursive solution to the part (a) of the knapsack problem in which only the maximal value is calculated.

2.4. Improvement of solution to part (a) using memoization

We will save the value whenever the function is executed. The arguments x and w are always the same. The arguments k and L differ from one call to another. However, it will turn out that the same combination of k and L is fed to the function numerous times.

We will use the map called knowledge. The keys to the map will be the pairs (k,L). The values will be the results of the function.

Whenever the function is called, it first checks the map. If the result is saved in the map, the function will immediately return the value from the storage. The function will not perform the computations.

Problem 3. Use memoization to improve the recursive solution to the part (a) of the knapsack problem in which only the maximal value is calculated.

2.5. Recursive solution to part (b)

Our next task is to improve the function bestValue from subsection 2.5. We want to be able to find out the items that need to be chosen to get the maximal value.

The old function returned the best value that can fit in the bag. The new function will return additional data. This additional data will make it possible to recover the entire sequence of items.

The problem can be solved with minimal intervention. We can add very little data to the object that the function returns. We will add only the index of the last item. We will now analyze one example and see how this minimal addition to the return object makes it easy to reconstruct the sequence.

We will analyze the example 2.2.2 again. When the old function is called with the values k=8 and L=25, we receive the result 62. The improved function will return the pair (62,6). The first component of the pair is the best value. The second component means that the item 6 is the last item that needs to be taken.

The function did not return all the items that we need to take. It return only the last item. However, this is a great start. By calling this function repeatedly, we can figure out the rest. The knowledge that we must take the item 6 is very helpful. In our next call to the function, we will reduce the capacity of the bag by w[6]. We will thus behave as if we have a bag of capacity 25-w[6]=25-6=19 and the sequence of length 6.

We now make the recursive call with parameters k=6 and L=19. The function returns the pair (42,5). We know that the next item to take is the item with the number 5. We prepare for the next call by reducing the capacity to 19-w[5]=19-12=7 and the length to k=5. When the parameters to the function are k=5 and L=7, the result is (22,3). Therefore, 3 is the index of another item that we should take. We will call the function again. The parameters will be k=3 and L=7-w[3]=7-6=1. The function will not be able to find any items to put in the bag. In this case the function will return the pair (0,-1). The first component 0 means that the maximal value is 0. The second component -1 is a signal that the bag is empty.

Problem 4. Create the recursive solution to the part (b) of the knapsack problem in which the sequence of items needs to be printed.

2.6. Improvement of solution to part (b) using memoization

The modifications to the code are analogous to the modifications that we did for recursive solution. The keys of the map knowledge are the same as before. The keys are the pairs (k,L). However, the values need to be modified. The values of the map must be the results of the functions. The results are now of type BestAdvice. The obtained code is listed as the solution to the problem below.

Problem 5. Use memoization to improve the recursive solution to the part (b) of the knapsack problem in which the sequence of items needs to be printed.

2.7. Performance comparison

2.7.1. Asymptotic analysis

The recursive solution has complexity \(O(2^n)\). Each function makes one or two recursive calls. Although some of the functions make only one calls to the recursion, the complexity is still \(O(2^n)\). There are pairs (n,M) for which \(2^n\) calls will be made. These situations occur when \(M\) is greater than or equal to the sum of all of the weights. In this case, the evaluation of b1 is never skipped.

The dynamic programming solution makes at most (n+1)(M+1) calls. All possible values for k belong to the set \(\{0,1,\dots, n\}\). All possible values for \(L\) belong to the set \(\{0, 1,\dots, M\}\). Thus the complexity of the solution that uses dynamic programming is \(O(nM)\).

2.7.2. Comparison on real test cases

We will count the number of times that the recursion was called. We will use a global variable recCounter. The declaration and initialization to 0 should be done by placing the command long recCounter=0; before the function bestValue. Then, the first command of the function should be ++recCounter;. Just as we are about to finish the execution of main(), we print the value of recCounter.

The variable recCounter should be added both to the recursive solution and the solution using dynamic programming. This is the modified code for the recursive solution:

#include<iostream>
long recCounter=0;
long bestValue(long* x, long* w, long k, long L){
  ++recCounter;
   if(k<1){return 0;}
   if(L<1){return 0;}
   if(w[k-1]>L){return bestValue(x,w,k-1,L);}
   long b1=bestValue(x,w,k-1,L-w[k-1]);
   long b2=bestValue(x,w,k-1,L);
   b1+=x[k-1];
   if(b1>b2){
      return b1;
   }
   return b2;
}
long* getSequenceFromInput(long n){
   long* a=new long[n];
   for(long i=0;i<n;++i){
      std::cin>>a[i];
   }
   return a;
}
int main(){
   long n, M;
   std::cin>>n>>M;
   long* x=getSequenceFromInput(n);
   long* w=getSequenceFromInput(n);
   long b=bestValue(x,w,n,M);
   std::cout<<b<<"\n";
   delete[] x; delete[] w;
   std::cout<<recCounter<<"\n";
   return 0;
}

We will not write the modified code for the dynamic programming solution. The modifications are done in the same way.

The test case 1 that has sequences of length 3 has similar performance. The number of calls is 11 in the case of standard recursion. The dynamic programming makes 10 calls. In the test case 2, the standard recursion is executed 240 times. The dynamic programming reduces this number to 88. In the test case 3, the dynamic programming offers more substantial improvement. It has 118 calls, while the standard recursion makes 815 calls.

We will few more additional test cases with bigger numbers n and M.

We will recycle the sequences x and w from the example 2.2.3. In our first test case we take n=35 and M=315 (we keep the ratio \(M/n=9\).)

35 315
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50
15 15 15 25 25 35 45 45 45 50
15 15 15 25 25
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 

The standard recursion on a standard desktop computer took one minute to finish. There were more than 17 billion calls to the function (the precise number was 17,676,798,207). The same test case runs instantaneously if the recursion is improved with memoization. The number of calls reduces to 1686.

If we add one more term and make n=36 and M=324, then our input turns into:

36 324
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50
15 15 15 25 25 35 45 45 45 50
15 15 15 25 25 35
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10

The standard recursion now take two minutes to finish. The number of calls exceeds 38 billion (the exact number is 38,683,248,912). The improved recursion is instantaneous again. It makes 1758 recursive calls.

At this point, whenever n increases by 1, the number of calls gets doubled. The execution time also doubles.

Let us now take n=100, M=900 and consider the input shown below.

100 900
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
15 15 15 25 25 35 45 45 45 50 
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40
20 20 20 10 10 10 30 30 30 40

The dynamic programming solution was virtually instantaneous. There were 14,344 calls.

However, if you try giving this test case to the standard recursion, it will take \(2^{64}\) minutes. A year has 366 days at the most; a day has 24 hours; and an hour has 60 minutes. Therefore, when we convert \(2^{64}\) minutes into years we obtain

\[\frac{2^{64}}{366\cdot 24\cdot 60} > 10^{13}.\]

Our galaxy Milky Way will collide with the neighboring galaxy Andromeda in 4.5 billion years. If you want to see the results of the standard recursive algorithm, then you would first need to re-schedule the collision of the galaxies. Those things can ruin recursions.

2.8. Tabulation for part (a)

Tabulation is a non-recursive method for solving the problem. The idea is the same as in the recursive method. In the case when \(L\geq w[k-1]\), the formula for the bestValue(k,L) is

bestValue(k,L) = max{x[k-1]+bestValue(k-1,L-w[k-1]), bestValue(k-1,L)}

Instead of using a function to provide us with the results bestValue(k,L), we will use a matrix. We will make a matrix bV of the format \((n+1)\times (M+1)\) and put the numbers bestValue(k,L) at the positions bV[k][L].

The matrix will be stored in one-dimensional sequence seqBV of length \((n+1)(M+1)\). The entry bV[k][L] will be kept at bV[k*(M+1)+L]. We evaluate the entries in the matrix by going from top to bottom and from left to right.

Problem 6. Use tabulation to solve the part (a) of the knapsack problem.

2.9. Tabulation for part (b)

We will use the same method as in subsection 2.5. For each pair (k,L) we will make it possible to determine the last item that needs to be taken if the capacity is L and only the first k items are allowed to be taken. In addition to the matrix bV we will maintain another matrix lastItem. The matrix lastItem will be updated at the same time as the matrix bV.

Problem 7. Use tabulation to solve the part (b) of the knapsack problem.

3. Problems

Problem 8. The user input consists of positive integers \(M\) and \(n\), and a sequence \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\) of positive integers. The person with \(M\) dollars decided to go to store and to spend all of the money. The person has a shopping list of items whose prices are \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\). Create the program that determines whether the person can choose a subset of items from the list whose total price is exactly equal to \(M\). If it is possible to spend all the money, then the program should also print the list of the items that need to be bought in order to spend the money.

Problem 9.

The user input consists of two positive integers \(n\) and \(M\) and two sequences \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\) and \(w[0]\), \(w[1]\), \(\dots\), \(w[n-1]\) of positive integers. The numbers \(w[0]\), \(w[1]\), \(\dots\), \(w[n-1]\) are the weights of the items \(0\), \(1\), \(\dots\), \(n-1\). The numbers \(x[0]\), \(x[1]\), \(\dots\), \(x[n-1]\) are the values of the items. The number \(M\) is the maximal weight that can fit in a backpack. However, the backpack must not contain the items whose indices are adjacent integers. In other words, if the item \(i\) is in the backpack, then the items \(i-1\) and \(i+1\) must not be in the backpack. Create the program that calculates the largest value that can fit in the backpack and prints a list of items that need to be selected to achieve this largest value. Your program should have polynomial complexity in \(M\cdot n\) (If your solution involves a recursion, then you should use data structures based on balanced search trees to make the recursion more efficient.)

Example:

Input:
7 12
20 30 30 10 40 50 55 
 1 10  1  1  7  1 11
Output:
Best value that backpack can hold is: 100
Items to take: 5 2 0

Problem 10. For given \(n\) and \(\alpha\), determine the smallest number \(k\) for which \(n\) can be expressed as a sum of \(k\) positive integers each of which is a perfect \(\alpha\)-th power. Print the number \(k\) and the representation of \(n\) as the sum of \(k\) perfect \(\alpha\)-th powers. Create the program that uses data structures based on balanced binary search trees to achieve the polynomial complexity.

Problem 11.

The user provides a positive integer \(m\) and a sequence of m two dimensional vectors (x[i],y[i]), where \(i\in\{0\), \(1\),\(\dots\), \(m-1\}\). Each coordinate of each of the vectors is non-negative. The user input also includes the pair (x,y) of non-negative integers that represent the coordinates of a target point. The goal is to start at (0,0) and reach the target (x,y) with fewest possible jumps. Each jump must be one of the vectors (x[i],y[i]). Create a program that calculates the number of jumps on the minimal path and prints the jumps in a minimal path.

Example: Input:
5 17 17 17 18 0 1 1 1 6 7 
18 23
Output: 
5
<0,1>, <0,1>, <6,7>, <6,7>, <6,7>