Search code examples
c++arrayssortingpseudocode

Create the Union of two String Arrays while having no duplicates


I just wrote my exam and one of the questions wanted us to get the union of A[] size N and B[] size N and remove duplicate (A and B can have duplicates within themselves) and put it into Z[] size 2N. We were just asked for pseudocode for this but it is a c++ class.

We were asked to make two versions of this one without using Heap space (no creating new arrays or data structures other than constant time local variables) and the other one was no restrictions. Also, both to be the fastest running time possible.

The one with restrictions I was only able to make it O(N^2) with a nest for loop and just iterating through Z[] as I was putting elements into Z from A and B.

This one was the one I was more interested in your guy's opinion/what would you do (for the one with no restriction) :

I got the following (Running time O(NlogN)):

Create an Array E size 2N Put everything from A and B into E - O(N)

Merge Sort E // Use ascii to sort - O(NlogN)

String previous

for loop from i = 0 to sizeOfE  { - O(N)

if previous does not equal E[i] then add to Z[] and the string previous equals E[i] - O(1)

}

Is this the fastest way/is it correct? How would you have done this problem?


Solution

  • For the first option, I would scan A to find a value for which half the values are below and half above. Then I would put that value in the middle, and put the lower values on the left and the higher on the right, and also throw out any matching values. This can be done without creating a new array. Then I would repeat on the two subarrays. This is a modified quick sort, so it is O(nlog(n))

    Then I would repeat with B.

    Then I would merge the two into Z, an O(n) operation, but the merge of the two would omit a value if a value from A matches the value from B.

    So the whole thing is O(nlog(n)).

    For the one with no restrictions, I would do a straight merge sort on A and then B, except when merging, you omit a value if it matches the previous inserted value, and you do it throughout the merge sort. This is a bit quicker since duplicates are deleted sooner in some cases. Overall the merge sort is also O(nlog(n)).