Tuesday, March 15, 2011

K-Way Merge

The classic merge takes as input some sorted lists and at each step outputs element with next smallest key thus producing sorted list that contains all the elements of the input lists.

We can solve this problem using two ways.

Let the sorted arrays be A1, A2,...Ak

Way 1:
Merge (A1, A2), (A3, A4),...
Then there are k/2 sorted arrays now.
Repeat the above procedure to get the full array sorted. It is not always practical to have whole sequence in memory because of its considerable size nor to constraint sequence to be finite as only K first elements may be needed. Thus our algorithm must produce monotonically increasing (according to some comparison logic) potentially infinite sequence. This way wont work as the memory we can use is limited

Way 2:
Form a priority min heap from the first element of all the k arrays. Extract the min element. It is the min element among all the arrays. Now take the next element from the array from which the min element has come, and put it in the min heap. Repeat the procedure.

  • The queue will hold non empty sequences.
  • The priority of a sequence is its next element.
  • At each step we dequeue out of the queue sequence that has smallest next element. This element is next in the merged sequence.
  • If dequeued sequence is not empty it must be enqueued back because it may contain next smallest element in the merged  sequence.

Thus we will have queue of size that doesn’t exceed m (number of sequences) and thus making total running time O(n log m).  It works well with infinite sequences and cases where we need only K first elements and it fetches only bare minimum out of source sequences.

No comments: