## Find ‘k’ smallest numbers from a million numbers (and take the girl with you!)

Your mum just told the “aunty next door”, the obese mother of your 6-year-crush “girl next door” that you are extremely smart and good in Math. WTF, mom?! And aunty being the crook she is, gives you her shopping list which has 1000 products with their prices and asks you to pick the least-values 10 products. Your crush is right there. If only you were in the Matrix, right?

Scale out the problem to finding the k smallest numbers from a million numbers and throw in a computer to solve the problem with code. What now? How do you go about it?

Well, you can** sort all elements in O(n log n) time** and pick the first k numbers. But can’t you try and be a little smarter? Even the girl knows basic sorting!

Maybe try and find a **worst case O(n) complexity algorithm**? Oh, stop staring! So here is the magic mantra: **Hoare’s Quick Select **algorithm.

In this algorithm,** we use the technique of Partitioning from Quick Sort** to get what we want. Partitioning basically takes a pivot element at random, puts it in its right place and puts all numbers lesser that the pivot to one side and the rest on the other. *So if we are able to get the “k”th smallest number as the pivot, we are done!** *(as we can then pick the whole partition with k elements as our answer)

1. Start with a random pivot. Iteration number = 1 for the first time.

2. Partition the array. Takes O(n/iteration number) time.

3. If the pivot is placed at the kth position (relative to original array), we are done! Else, first increment Iteration number. Then : If it is placed before the kth position, go to step 2 with the right hand partition as the array. Otherwise go to step 2 with the left partition.

Got the point? This can be a very helpful method to solve problems such as k largest or smallest element, etc.!

Now to **prove time complexity O(n) in worst case**. Since we process just one partition each time (as opposed to processing both partitions like we do in Quick Sort), here is how the effort looks –

**Effort = n + n/2 + n/4 …**

We know that **1/2+1/4+1/8+… = 1**, (can prove it too)

Hence,** Effort = n + n = 2n = O(n)**

Tada! 🙂

## Find the element which occurs > n/2 times in an array with n elements

Guess what? The Election Commissioner just called upon you to count the **10 billion votes** and find out the person who has majority votes! You are flabbergasted! You had a weekend of good sleep planned out with snack breaks every 2 hours and now you are stuck with this grunt work! **However,** **the EC ensures you that the one who wins will have more than half the total number of votes!**

If you think of **keeping a counter** for each of the (assume **(5 billion – 1) candidates**) and then go on to find the one with the maximum number of votes by incrementing counts, well you can forget the next 3000 weekends and say goodbye to your sleep.

And if you have the brilliant idea of **sorting the votes in some order** and getting the answer, you might just want to find one of those supercomputer’s access key.

What you should notice is that more than half the votes are for the winner. This can be solved using **Moore’s Voting algorithm in O(n) time with constant space**. Here is what you do.

1. Take two variables,current_candidateandcount(which corresponds to count of current candidate)

2. Set count to 0 initially.

3. Now everytime you find the count to be 0 (even at the beginning), make the current element as thecurrent_candidate.

4. Move forward in the array. If you find element the same as current candidate, increment count. If not, decrement count.

5. If count is 0, go to step 3. Else, move forward in the array

6. After traversing the whole array you’ll be left with the majority vote as thecurrent_candidate.

Bam! Want an example?

**B B C G H B B B T**

*Start with* candidate = (none), count = 0

First is a B, candidate = B, count = 1

moving forward, another B, candidate = B, count = 2

now a C, candidate = B, count = 1

and a G, candidate = (none), count = 0

now an H, candidate = H, count = 1

now a B, candidate = (none), count = 0

and another B, candidate = B, count = 1

and one more B, candidate = B, count = 2

and finally T,** candidate = B**, count = 1

Hence, you arrive on the answer candidate –> B !

Weekend plans to sleep might still pan out, eh? Cheers! 🙂

## Generate maximum number from array of digits which is divisible by 2, 3 and 5

Whoa! Blogging 2 questions in one day! I guess I am actually in the mood to write tonight! Lets get started.

Let us take an example array, {9, 6, 3, 4}

So how should we go about solving for this example? **Well, you don’t!** Because if you notice, apart from 3, we are looking for divisibility by both 2 and 5. Which means that the final number should end in a “0”. And this array does not have a zero!

Moving on, here is the new array, {9, 6, 3, 4, 0}

Now what? Divisibility of 3 means that the sum of digits of the number should be divisible by 3. **Note that you don’t have to use up all of the digits.** So what do you do?

You can either go for an **exponential approach by forming the power set of the array** and shortlist the ones which have a sum divisible by 3 and then go on to see which of these has the most number of digits and even then if you have a tie, you’ll go for the one with the biggest digit.. and so on. **But wait… I thought we were smart! Then why do all this?!**

You can **instead** do the following –

1. Check if the array has atleast one 0 (zero). If not, we can’t form a number. Go back to bed.

2. Since you are still awake, go on to sort the array in descending order.

3. Now sum up the digits (and keep repeating Steps 3-6 till sum>=3). If your sum%3 == 0, we are done! This sequence is the biggest number. Sleep peacefully.

4. If not, there are 2 possibilities, sum%3 = 1 or 2 (let us call this remainder “R” which can have value 1 or 2)

5. We’ll first try to remove a number from this array which satisfies number%3==R starting from the smallest number in the array. As soon as you find such a number, remove it!

6. If not (life is such), go on to remove 2 numbers starting from the smallest number such that numbers%3==(3-R). Once you have removed 2 such numbers, you might have your desired number! Loop back to Step 3 and check the sum!

7. In the end you might be left with an array with only a 0. In that case your answer is either Aryabhata’s creation or “Cannot be formed” depending on the question’s requirement.

To quickly run through the example, {9, 6, 3, 4, 0}

In descending order, this is, {9, 6, 4, 3, 0} which sums to 22

22%3 = 1

Hence start trying to find a number%3=1 from the smallest,

3%3 = 1? No!

4%3 = 1? Yes!

Remove 4 and we are left with {9,6,3,0} or 9630 –> *Which is the panacea for our metaphorical insomnia*! Have fun! 🙂

## Given N numbers, find pairs with difference K

*Problem Statement :* You have been given N numbers and a value K. Find the number of pairs in this given list of numbers whose difference is exactly equal to K.

This is a very simple one. Trying a brute force over all the elements is not being efficient. So what we’ll do is that we’ll sort the input array and apply a fine-tuned brute force as follows.

Once the array is sorted, for every number at position j, you start checking numbers backward from j to 0 (let us say using variable i, where i<j) and skip to checking for j+1 if the difference of the jth and ith element exceeds K. If you find their difference to be exactly K, you increment your count.

A very simple implementation of sorting. 🙂

## Boggle Solver

A facebook recruiter asked me to write the code for this particular problem. While I was able to suggest the algorithm correctly, I messed up my code. So here is the *problem statement*:

**Given a 4×4 array with randomly selected letters in it and a dictionary of valid words, find which words can be made from this array by starting from any of the 16 letters and moving in any direction (the only constraint being that you cannot come to a node twice while making a word)**

The first thing you should see here that this is clearly a Graph problem. The second noticeable aspect is that the maximum length of a word from such an array (with the constraint) can be 16 and the minimum length can be 1.

Since you have to look for all possible combinations starting from every point, Depth First Search comes to mind (why not Breadth First Search? Think about it). What you should do is* apply DFS starting at each of the 16 nodes in the array.* At every step, you add the newly formed word to a data structure meant to store strings (not allowing duplicates).

Once you have this data structure ready with you, all you need to do is search for each word in the dictionary that you have been provided. 🙂

Clear enough?

PS: Please code the DFS in the 4×4 array. It is not as simple as you assume it to be. 🙂

PPS: Does anyone have a better solution?

## Shuffling Numbers

I recently talked about generating random numbers. What if you are given a set of numbers and asked to shuffle the numbers randomly. What are the different methods that you can think of? There is basically one method you should remember, because finally every algorithm will take you to the same basic thing, which is,

Generate a random number between 1 and n (included), where n is size of the given set. Now put the number at that position in your array in the first place. Repeat the process for the n-1 remaining places. The randomness of the shuffling here will be decided by your random number generator’s randomness.

(A method suggested by Fisher-Yates and multiple others as well. Big deal!)

That is what most algorithms to shuffle numbers are all about. You can ofcourse fine tune your method to give you your shuffled sequence in O(n). Some old algorithms tend to run for O(n*n), but those are just too stupid and you can easily figure out how to fine tune them to O(n).

Yeah, that was as simple a post as they come!

**PS: I don’t have a lot of time right now to write a full-blown algorithm for some problem right now. Sorry for the gap.**

## Edit Distance using Dynamic Programming

In the previous post, I discussed a recursive method for finding the *Edit Distance *(once again, refer *previous post* for details) between two strings. Some of you might have noticed that the recursive function (in this case) could just as easily be utilized in a Dynamic Programming method.

However, we would be increasing the space complexity by using a 2D array (as is the case with many Dynamic Programming problems). The time complexity would reduce to O( length of String1 * length of String2 ). Since the whole logic has already been explained in the previous post, here is the algorithm.

int EditDistance( String1, String2 )

{

m = String1.length

n = String2.length

For i=0 , i<=m , i++

V[i][0] = i

For i=0 , i<=n , i++

V[0][i] = i

//cells in first row and column of temporary 2D array V set to corresponding number

For i=1 , i<=m , i++

{

For j=1 , j<=n , j++

{

If ( String1[i-1] == String2[j-1] )

V[i][j]=V[i-1][j-1]

Else

V[i][j] = 1 + min( min(V[i][j-1], V[i-1][j]), V[i-1][j-1])

}

}

RETURN V[m][n] //last element of 2D array (last row, last column)

}

So the logic remains the same. The only difference is that we avoid the use of system stack and instead store previous results in a 2D array. Neat! 🙂