## Forming Cricket Teams (Balanced Partition problem)

*Problem Statement*: You have been given N players, with a corresponding strength value S for each player (Minimum strength being 0 and maximum being K). You are supposed to divide them into two teams such that the difference in the total strengths of both teams is minimized.

This is an everyday problem that we tackle while forming equally matched teams. *You might try a greedy approach here but in many cases it will fail*. It is actually known as a Balanced Partition problem and utilizes the method we used while solving the integer knapsack (1/0) problem. Great! So you know half of it! But let us give it a quick look.

We defined a term P(i,j) which specified that I would find a subset of elements in the first ‘i’ elements which sum up to ‘j’. Now we can define the term recursively, as,

P(i,j) = maximum {P(i-1,j), P(i-1, j-Si)}

While you can refer to the link given to the earlier post, what this formula specifies is that we either have a subset that sums up to ‘j’ in the first ‘i-1′ elements, or that we have a subset in the first ‘i-1′ elements whose sum when added to the ith value Si gives us ‘j’.

Now that we are done refreshing our memory, we’ll move on to the new problem.

We have now generated a 2D array giving us the possibility of a sum (j) being formed using some (<i) elements at each step. Now we need to divide this into 2 sets Team 1 and Team 2 such that the different of their totals is minimized.

For this, we’ll first find the mean value of the list of N elements. If a subset’s sum hits this particular value, we have found a solution with difference 0. Otherwise, we keep trying to find a solution where the sum of strengths of teams is closest to the mean. For this, we can once again check for all ‘i’<=’Mean’,

Minimum {Mean-i : P(n,i)=1}

where 1 indicates we have such a subset. What we are doing is that we are checking for every value less than the Mean it can be formed from any subset of the N elements. Checking this one by one, we land up with the required value closest to the Mean.

There it is. You have your solution! Divide you teams and game on! :)

(Retrieving the values for the subsets is as simple as maintaining back-pointers while creating the 2D array and filling it up.)

Doubts? Ask away!

PS: Need to be spoonfed (euphemism for “Me.Want.Code.”) ? Write it yourself! :P If you just can’t, drop me an email and I’ll mail it to you!

## Santa Claus is coming to town. (Integer Knapsack 1/0)

I hope you remember the US Pizza problem we talked about a while back. That wasn’t a 1/0 problem but the one we will discuss now certainly is.

: Santa Claus is coming to your city and needs to figure out the gifts he has to take. Since North Pole is a long way off, he can only carry gifts weighing a total of ‘C’ in his bag. He has a total of ‘N’ gifts to select from. You are supposed to help him select the gifts in a way that would use the capacity to the greatest use, i.e., the total weight of the gifts selected should be closest to C and the most number of gifts should be selected.Problem Statement

This is a dynamic programming problem and you should recognize it as soon as you see that the problem has optimal substructures in the fact that its solution can be built starting from 1 to i gifts. Also, this is a 1/0 knapsack problem since you can either select a gift (1) or leave it behind (0).

Let us define a term

M(i,j) as the optimal way of selecting gifts in a way such that selecting 1..i gifts brings the total weight up to exactly j. So your value of i varies from 1 to N and the value of j varies from 0 to C. Hence, the time complexity of the problem is equal to O(NC). (Basically, M(i,j) in a 2D array stores the number of gifts)At every point, there are

two waysin which the value of M(i,j) can be determined.

1. Check if there was a subset of gifts from number 1 to i-1 which formed a subset with total weight equal to j. You have the value M(i-1,j) as one of your candidate values.

2. Check the M(i-1, j-Wi) value such that adding the ith gifts weight gives us j. (i.e, weight of ith gift is Wi). This value plus 1 (because we pick up this ith gift as well) is another candidate value.

Now you simply have to take the maximum of these two values and place the value at M(i,j). In this way, you end up finding M(n,C) in the generated 2D array which is the required answer that our Santa Claus requires.

Here, you should notice how we built the solution for the problem in a manner similar to what we might do for a balanced partition problem. I will discuss this problem in the next post.

Stay tuned for more Dynamic Programming solutions. :)

## Minimizing multiplications in Chain Matrix Multiplications

*Be patient and read every line. You’ll end up understanding a great problem of Dynamic Programming.*

*Problem Statement: *Parenthesize a Matrix Chain multiplication matrix in such a way that the number of multiplications required are minimized.

*First things first, how many multiplications are involved in multiplying 2 matrices?*

If your 1st matrix is of dimensions mxn and your second matrix is of dimensions nxq, you will need to perform mxnxq multiplications in this matrix multiplication.

So if your matrix multiplication chain is as follows,

A x B x C x D , we have the following scenarios,

A = 50 x 20

B = 20 x 1

C = 1 x 10

D = 10 x 100

So you see how we minimized the cost of multiplications there. Now we want to develop an algorithm for the same. You can check all possibilities but as you will realize that you will have very high complexity with such a Brute Force method.

As we discussed a parenthesis problem in my previous post, your mind should jump to find an optimal substructure here. Now you can see all these parenthesizations as a binary tree with these matrices as the leaves and the root element being the final product of all matrices. For example, you can view ((AxB)xC)xD) as,

Hopefully you understood everything till here. Now for a tree to be optimal, its subtrees should also be optimal. So here we come across our optimal substructure. Hence, subproblems are of the form,

(That is the representation of any one node in some binary tree except for the leaf nodes)

Thus we define this sub-problem as,

C ( i, j ) = minimum cost of multiplying

Size of this subproblem = j-i

Hence, minimum possible size is when i=j (single matrix), so C ( i, i) = 0

For j>i, we consider optimal subtree for C ( i, j ). Now, we can divide this from Ai to Ak and Ak to Aj for some k>i and k<j. The cost of the subtree is then the cost of these two partial products plus the cost of combining these. So we land with the recurrence,

C (i, j) = min { C(i, k) + C(k+1, j) + m(i-1) x mk x mj }

where i<=k<=j.

Coding this is pretty simple now and I need not (and should not) spoonfeed it to you. All you need to do is generate this 2D matrix and another one which holds details about parenthesizations and you’ll end up with your answer. :)

PS: If someone requires the code, sorry won’t give you that. If you need the algorithm, drop a comment or a mail. That I can write down.

## Counting possible parenthesizations in Matrix Chain multiplication

*Problem Statement:* Given the number of matrices involved in a chain matrix multiplication, find out the number of ways in which you can parenthesize these matrices in the multiplication, i.e., you have to find the number of ways you can introduce brackets in this chain matrix multiplication. (forming groups of 2).

For example,

A x B x C can be grouped in 2 ways,

(A x B) x C

A x (B x C)

You should remember that while matrix multiplication is not commutative, it is associative. Hence both the multiplications mean the same.

How do we find the number of such parenthesizations?It becomes simple when you realize that you can derive solutions for this problem using solutions to smaller subproblems which you calculated earlier. Dynamic Programming :)So for,

N=1, P(N) = 1

N=2, P(N) = 1 (easy to figure out)

N=3, P(N) = 2 (again easy)

So what we are doing here is finding out the previous solution and figuring out the different ways you can introduce one more matrix in the picture to find the next. So your function looks like,

I guess you understand how this is Dynamic Programming. And also the fact that this is a very simple permutations question. :) Not really a programming problem.

Why did I write this post?There is a genre of DP problems known as Optimal Matrix Chain multiplication. You are supposed to find the way (out of these P(n) ways) in which the number of multiplications required will be the least.We will solve this problem in the next post.

Till then, adios! :)

## Shortest Path in Directed Acyclic Graphs

*Problem Statement :* Given a DAG, find the shortest path to a particular point from the given starting point.

This one is a famous problem that involves dealing with graphs. I assume everyone knows what a graph is. If you don’t, this is not the place for you to be. Please visit this wiki link for knowing more about Graphs. Let us get down to talking about DAGs (Directed Acyclic Graphs). *Directed* means that each edge has an arrow denoting that the edge can be traversed in only that particular direction. *Acyclic* means that the graph has no cycles, i.e., starting at one node, you can never end up at the same node.

Now that we are clear with what a DAG is, let us think about the problem. If we have the following DAG, what kind of algorithm would you use to solve this problem.

One thing you can do is apply an O(N*N) algorithm and find the shortest distance by calculating all the distances from each node to every other node. As you can guess, a better method exists. You should notice here that there is subproblem quality that exists here. If a particular node can be reached using two nodes, you can simply find the minimum of the values at the two nodes plus the cost for reaching to the node in question. So you are extending your previous solutions. Yes, Dynamic Programming. :)

The reason you are able to find such an optimal substructure here is because a DAG can be linearized (Topologically sorted). For example, the above DAG is linearized as follows,

Now if we want to find the minimum distance from start S to A. All we need to do is see the minimum distances to reach the nodes from which A can be reached. In this case, these nodes are S and C. Now the minimum of these minimum distances for previous nodes (till S=0, till C=2) added with the cost of reaching from these nodes (from S = 1, from C=4), is our answer.

So let me define the recursive relation for this case,

MinDist (S to A) = minimum ( MinDist(S to S) + Dist(S to A), MinDist(S to C) + Dist(C to A) )

Generalize this recurrence and you have a simple Dynamic Programming algorithm ready to solve the problem for you in linear time! Cheers! :)

## Edit Distance using Dynamic Programming

In the previous post, I discussed a recursive method for finding the *Edit Distance *(once again, refer *previous post* for details) between two strings. Some of you might have noticed that the recursive function (in this case) could just as easily be utilized in a Dynamic Programming method.

However, we would be increasing the space complexity by using a 2D array (as is the case with many Dynamic Programming problems). The time complexity would reduce to O( length of String1 * length of String2 ). Since the whole logic has already been explained in the previous post, here is the algorithm.

int EditDistance( String1, String2 )

{

m = String1.length

n = String2.length

For i=0 , i<=m , i++

V[i][0] = i

For i=0 , i<=n , i++

V[0][i] = i

//cells in first row and column of temporary 2D array V set to corresponding number

For i=1 , i<=m , i++

{

For j=1 , j<=n , j++

{

If ( String1[i-1] == String2[j-1] )

V[i][j]=V[i-1][j-1]

Else

V[i][j] = 1 + min( min(V[i][j-1], V[i-1][j]), V[i-1][j-1])

}

}

RETURN V[m][n] //last element of 2D array (last row, last column)

}

So the logic remains the same. The only difference is that we avoid the use of system stack and instead store previous results in a 2D array. Neat! :)

## Derangement

You must have heard of arrangement, but today we’ll talk about Derangement. The exact opposite. Derangement is the concept of arranging a set of given items in such a manner so that no object is in its original place. For example, a question might say,

If you are given 4 cards with A, B, C, D written on them and they are placed in this specific order. Now you are supposed to count the number of ways these cards can be arranged so that no card is in its original place.

The answer is 9. And the different arrangements are as follows:

BADC, BCDA, BDAC,CADB, CDAB, CDBA,DABC, DCAB, DCBA

So how do you go about solving such problems? Well, of course you use Derangement. So let us discuss how we count derangements. I’ll just pick up a short explanation from Wikipedia since I am too lazy to explain it myself :P

Suppose that there arenpersons numbered 1, 2, …,n. Let there benhats also numbered 1, 2, …,n. We have to find the number of ways in which no one gets the hat having same number as his/her number. Let us assume that first person takes the hati. There aren− 1 ways for the first person to choose the numberi. Now there are 2 options:

**Person***i*does not take the hat 1. This case is equivalent to solving the problem with*n*− 1 persons*n*− 1 hats: each of the remaining*n*− 1 people has precisely 1 forbidden choice from among the remaining*n*− 1 hats (*i’*s forbidden choice is hat 1).**Person***i*takes the hat of 1. Now the problem reduces to*n*− 2 persons and*n*− 2 hats

*subfactorial of n*( denoted as !n ). Now there is a simple recurrence relation we can see from the explanation above,

## Maximum Sum Path in a Triangle

*Problem Statement:* You have been given a triangle of numbers (as shown). You are supposed to start at the apex (top) and go to the base of the triangle by taking a path which gives the maximum sum. You have to print this maximum sum at the end. A path is formed by moving down 1 row at each step (to either of the immediately diagonal elements)

The first method that comes to everyone’s mind is **brute-force**. But in a second or two, you would discard the idea knowing the huge time complexity involved in traversing each possible path and calculating the sum. What we need is a smarter approach.

I hope you remember about the *optimal substructure *I talked about earlier relating to Dynamic Programming. Can you see one such optimal substructure here? You can *build* a solution as you drop down each row. You can calculate the best path till the particular node in the triangle as you move down and finally you’ll have your maximum sum in the last row somewhere. Here is the algorithm :

Take another similar array (or your preferred data structure), let us say MAX_SUM[][]

For each ELEMENT in particular ROW and COLUMN

{

If ( ELEMENT is FIRST ELEMENT of ROW)

{

MAX_SUM[ROW][COLUMN] = ELEMENT + FIRST element of (ROW-1)

}

Else If (ELEMENT is LAST ELEMENT of ROW)

{

MAX_SUM[ROW][COLUMN] = ELEMENT + LAST element of (ROW-1)

}

Else

{

MAX_SUM[ROW][COLUMN] = ELEMENT + maximum( element at [ROW-1][COLUMN-1], element at [ROW-1][COLUMN])

//recursive formula calculating max_sum at each point from all possible paths till that point

}

}

Once you have this array, you just need to find the maximum value in the last row of this new array and that will be your maximum sum! :)

For the example I took, the array MAX_SUM would look like, (and my answer would be 23)

What I am doing is for each element, checking the maximum of the possible paths to this node and adding the current elements value to it. If you still don’t understand, leave a comment and I’ll explain each and every step in excruciating detail. :P

**TASK FOR YOU:*** Find the elements of this path.*

## Longest Increasing Subsequence

*Problem Statement:* You are given an array with integers (negative, positive, zero). You are supposed to find the length of the longest increasing subsequence in the array. The elements of the subsequence are not necessarily contiguous.

Once again, as in the last problem, you cannot afford to try a brute force method and be called an even bigger fool. What you *can* do is look at the problem carefully and find the optimal substructure.* You can find the length of the longest increasing subsequence till each position and use this for the later positions.*

The way it would work is, at each position ‘i’ in input array A, go through the positions j<i in array B and find the largest value for which A[i]>=A[j]. Now you put the value of B[i] = the value you found + 1.

Did you understand what we did? Since we are supposed to find the length of the longest increasing subsequence, we found the element smaller than this whose corresponding value in array B (length of longest increasing subsequence till that point) is the highest. This value + 1 would then be the length of the longest increasing subsequence till ‘i’.

For example, for the input array A as follows,

-1, 4, 5 ,-3, -2, 7, -11, 8, -2

We would get the array B as follows,

1, 2, 3, 1, 2, 4, 1, 5, 2

Once again, traversing this particular array B and finding the largest value would give us the length of the longest increasing subsequence. :)

If the solution isn’t apparent to anyone, please refer the video link, here,

http://people.csail.mit.edu/bdean/6.046/dp/

** Task for you : **Find the elements that form this longest increasing subsequence. Pretty easy. :)

*PS: Many variations of this problem exist (strictly increasing, decreasing, strictly decreasing). Also, it is posed in different ways, some of which you can find at (http://www.uvatoolkit.com) by simply typing the phrase LIS as the search parameter. Enjoy! :)*

## Maximum Value Contiguous Subsequence (Maximum Contiguous Sum)

This is one of the first examples of Dynamic Programming that I’ll take up on this blog. It is a very famous problem with multiple variations. The reader should note that in problems involving DP, the concept is much more important than the actual problem. You should have a crystal clear understanding of the logic used so that you can use it in other problems which might be similar.

As we talked earlier in my (overview of Dynamic Programming), we need to find an optimal substructure (which is basically recursive in nature) in the problem so as to apply an iterative solution to the problem. For this particular problem, we’ll keep building the solution as we traverse the array (which complies with our thinking of solving simpler sub-problems to solve the bigger complex ones).

*Problem Statement**: *You are given an array with integers (positive, negative, zero) and you are supposed to find the maximum contiguous sum in this array. Hence, you have to find a sub-array which results in the largest sum.

Example, If the given array is,

5, -6, 7, 12, -3, 0, -11, -6

The answer would be, 19 (from the sub-array {7,12} )

If you go on to form all the possible sets and then finding the one with the biggest sum, your algorithm would have a mammoth-like time complexity. More importantly, you would be called a fool. :P So we use dynamic programming and finish this in time O(n) and people will think we are smart. ;)

Assume your input array is called A. What you need to do is form another array, say B, whose each value ‘i’ is calculated by using the recursive formula,

Maximum(A[i], B[i-1]+A[i])

What you are doing effectively is that you are choosing whether to extend the earlier window or start a new one. You can do this since you are only supposed to select continuous elements as part of your subsets.

The way B looks for the example I took earlier, is,

5, -1, 7, 19, 16, 16, 5, -1

So all you need to do now is traverse the array B and find the largest sum (since B has the optimized sums over various windows/subsets).

That should be easy to understand. If not, refer to video tutorial at:

*http://people.csail.mit.edu/bdean/6.046/dp/*

** Task for you : **Find the elements that form this maximum contiguous sum. Pretty easy. :)