Tags: c++, list, linked-list, find-occurrences

Union of two linked lists using a multiplicity function in C++


I need to build the union of two unordered multisets read from input, using the "multiplicity" definition: the multiplicity of an element, also called its absolute frequency, is the number of occurrences of the element 'x' in the unordered multiset 's'. In multiSetUnion, the multiplicity of an element is the maximum of its multiplicities in the two multisets.

I have already correctly implemented the int multiplicity(const Elem e, const MultiSet& s) function (it returns the number of occurrences in the multiset).

The multisets are singly linked lists.

Here is the algorithm I came up with:

for as long as the first list isn't empty
   if elem of the first list (multiset) is not in the second list (multiset)
      add elem in unionlist
   if elem of the first list (multiset) is in the second list (multiset)
      if multiplicity of elem is bigger in the first list than in the second one
         add elem in unionlist as many times as its multiplicity in list1
      if multiplicity of elem is bigger in the second list than in the first one
         add elem in unionlist as many times as its multiplicity in list2  
analyze the second element of the first list 

Here is my implementation of the algorithm, but it fails when neither of the two lists is empty, and I have no idea why:

MultiSet multiset::multiSetUnion(const MultiSet& s1, const MultiSet& s2)
{
    if (isEmpty(s1) && isEmpty(s2))
        return emptySet;
    if (isEmpty(s1) && !isEmpty(s2))
        return s2;
    if (!isEmpty(s1) && isEmpty(s2))
        return s1;
    MultiSet s3 = emptySet;    
    MultiSet aux2 = s2;            //THE FUNCTION DOESN'T WORK FROM HERE ON
    for (MultiSet aux1 = s1; !isEmpty(aux1); aux1 = aux1->next) { 
        if (!isIn(aux1->elem, aux2))
            insertElemS(aux1->elem, s3);
        if (isIn(aux1->elem, aux2)) {
            if (multiplicity(aux1->elem, aux1) > multiplicity(aux1->elem, aux2)) {
                for (int n = 0; n < multiplicity(aux1->elem, aux1); ++n)
                    insertElemS(aux1->elem, s3);
            }
            else {
                for (int m = 0; m < multiplicity(aux1->elem, aux2); ++m)
                    insertElemS(aux1->elem, s3);
            }
        }
    }
    return s3;
}

Could anybody please point out what I am doing wrong? Did I forget something in the algorithm, or is this an implementation problem?

Edit: Here is how I have implemented the functions isIn(const Elem x, MultiSet& s) and multiplicity(const Elem e, const MultiSet& s):

bool isIn(const Elem x, MultiSet& s) {
    if (s->elem == x) return true;
    while (!isEmpty(s)) {
        if (s->elem!=x)
            s = s->next;
        else    return true;
    }
    return false;
}

int multiset::multiplicity(const Elem e, const MultiSet& s)
{
    if (isEmpty(s))    return 0;
    int count = 0;
    MultiSet aux = s;
    while (!isEmpty(aux)) {
        if (aux->elem==e) {
            ++count;
        }
        aux = aux->next;
    }
    return count;
}

Unfortunately, I cannot use the vector library (or any STL library, for that matter). The algorithm I proposed is intentionally only half of the solution (the part I'm having problems with). I am not getting any specific error message; the program simply stalls and returns this: "Process returned -1073741819" (I am currently debugging on Windows). It should instead print the first multiset, the second one and their union. The print function is correct and is called directly in main. For now, I only get the correct output when one or both of the multisets are empty.


Solution

  • Consider the following example:

    MultiSet s1({7, 7});
    MultiSet s2({5});
    

    If you now iterate over s1:

    1st iteration:        7    7
                          ^
                         aux1
    
    2nd iteration:        7    7
                               ^
                              aux1
    

    If you have multiple equal elements in s1, you will discover them more than once. An element that also occurs in s2 then ends up being added far more often than its multiplicity: on the order of the square of its multiplicity (or the product of both multiplicities, if the one in s2 is greater).

    On the other hand, 5 is not contained in s1, so the loop over s1 never encounters it and it is never added to the union, although it is in s2.

    To fix the first problem, you need to check whether the current element is already contained in s3 and, if so, just skip it.

    To fix the second problem, you need to iterate over s2, too, adding all those elements that are not yet contained in s3.
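
The two fixes above can be sketched as follows. This is only a sketch under stated assumptions: the question does not show Elem, the node type, emptySet, isEmpty or insertElemS, so the definitions below are guesses, and addMissing is a hypothetical helper. Also note that the isIn shown in the question reads s->elem before testing for emptiness and advances its reference parameter, which moves the caller's aux2 to the end of the list. The next call then dereferences a null pointer, which would be consistent with the reported -1073741819 (0xC0000005, an access violation on Windows). The variant here takes a const reference instead.

```cpp
#include <cassert>

// Assumed definitions; the question does not show them.
typedef int Elem;
struct Node { Elem elem; Node* next; };
typedef Node* MultiSet;
const MultiSet emptySet = nullptr;

bool isEmpty(const MultiSet& s) { return s == emptySet; }

// Assumed behaviour: adds one occurrence of x at the front of s.
void insertElemS(const Elem x, MultiSet& s) { s = new Node{x, s}; }

// Counts occurrences of e; a local cursor walks the list, so s is untouched.
int multiplicity(const Elem e, const MultiSet& s) {
    int count = 0;
    for (MultiSet aux = s; !isEmpty(aux); aux = aux->next)
        if (aux->elem == e) ++count;
    return count;
}

// Const reference: never moves the caller's pointer, safe on an empty list.
bool isIn(const Elem x, const MultiSet& s) { return multiplicity(x, s) > 0; }

// Hypothetical helper: for every distinct element of src that s3 does not
// contain yet, adds it max(multiplicity in s1, multiplicity in s2) times.
static void addMissing(const MultiSet& src, const MultiSet& s1,
                       const MultiSet& s2, MultiSet& s3) {
    for (MultiSet aux = src; !isEmpty(aux); aux = aux->next) {
        if (isIn(aux->elem, s3)) continue;   // first fix: skip repeats
        const int m1 = multiplicity(aux->elem, s1);
        const int m2 = multiplicity(aux->elem, s2);
        const int m = m1 > m2 ? m1 : m2;
        assert(m > 0);                       // the element came from s1 or s2
        for (int n = 0; n < m; ++n) insertElemS(aux->elem, s3);
    }
}

// Returns a freshly allocated list; s1 and s2 are not modified.
MultiSet multiSetUnion(const MultiSet& s1, const MultiSet& s2) {
    MultiSet s3 = emptySet;
    addMissing(s1, s1, s2, s3);
    addMissing(s2, s1, s2, s3);              // second fix: also walk s2
    return s3;
}
```

With s1 holding 7, 7 and s2 holding 5, 7, the union contains 7 twice and 5 once. The empty-list special cases from the question are no longer needed, because both loops simply do nothing for an empty input.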

    As is, the result will perform rather poorly, though: somewhere between O(n²) and O(n³), probably closer to the latter. Unfortunately, you chose a data structure (a simple singly linked list, apparently unsorted) that offers poor support for the operations you intend, and especially for the algorithm you chose.

    If you kept your two lists sorted, you could create an algorithm with linear run-time. It would work similarly to the merging step in merge sort:

    while(elements available in both lists):
        if(left element < right element):
            append left element to s3
            advance left
        else
            append right element to s3
            if(left element == right element):
                advance left // (too! -> avoid inserting sum of multiplicities)
            advance right
    append all elements remaining in left
    append all elements remaining in right
    // actually, only in one of left and right, there can be elements left
    // but you don't know in which one...
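
In C++, and without the STL, the merge pseudocode above might look roughly like this. Again a sketch only: the node layout is assumed (the question does not show it), sortedUnion and append are hypothetical names, and both inputs must already be sorted in ascending order.

```cpp
#include <cassert>

// Assumed definitions; the question does not show them.
typedef int Elem;
struct Node { Elem elem; Node* next; };
typedef Node* MultiSet;

// Hypothetical helper: appends a new node holding e behind tail and
// moves tail forward to that new node.
static void append(Node*& tail, const Elem e) {
    assert(tail != nullptr);
    tail->next = new Node{e, nullptr};
    tail = tail->next;
}

// Precondition: s1 and s2 are sorted in ascending order.
// Returns a new sorted list in which every element occurs
// max(multiplicity in s1, multiplicity in s2) times.
MultiSet sortedUnion(const MultiSet& s1, const MultiSet& s2) {
    Node head{Elem(), nullptr};              // dummy head, not part of result
    Node* tail = &head;
    const Node* left = s1;
    const Node* right = s2;
    while (left != nullptr && right != nullptr) {
        if (left->elem < right->elem) {
            append(tail, left->elem);
            left = left->next;
        } else {
            append(tail, right->elem);
            if (!(right->elem < left->elem)) // equal: consume both sides
                left = left->next;
            right = right->next;
        }
    }
    // At most one of these two loops actually runs.
    for (; left != nullptr; left = left->next) append(tail, left->elem);
    for (; right != nullptr; right = right->next) append(tail, right->elem);
    return head.next;
}
```

Equal elements are consumed in pairs, so an element occurring three times on the left and twice on the right is appended twice inside the loop and once more from the remainder, giving three occurrences in total.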
    

    Keeping your list sorted during insertions is pretty simple:

    while(current element < new element):
        advance
    insert before current element // (or at end, if no current left any more)
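
A possible C++ version of that sorted insertion, assuming the same node layout as before (not shown in the question). insertSorted and isSorted are hypothetical names. The pointer-to-pointer cursor is just one way to avoid special-casing insertion at the head.

```cpp
#include <cassert>

// Assumed definitions; the question does not show them.
typedef int Elem;
struct Node { Elem elem; Node* next; };
typedef Node* MultiSet;

// Hypothetical helper: true if the list is in ascending order.
bool isSorted(const MultiSet& s) {
    for (const Node* n = s; n != nullptr && n->next != nullptr; n = n->next)
        if (n->next->elem < n->elem) return false;
    return true;
}

// Inserts x so that an already sorted list stays sorted.
// link always points at the pointer that leads to the current node:
// first at head itself, later at some node's next member.
void insertSorted(const Elem x, MultiSet& head) {
    Node** link = &head;
    while (*link != nullptr && (*link)->elem < x)   // advance
        link = &(*link)->next;
    *link = new Node{x, *link};   // insert before current (or at the end)
    assert(isSorted(head));       // debug check of the ordering invariant
}
```

Inserting 7, 5, 7, 6 into an empty list in that order yields 5, 6, 7, 7.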
    

    However, as you expose the nodes of the list directly, there is always the danger that an insertion does not start at the head element, and then your sort order might get broken.

    You should encapsulate appropriately:

    • Rename your current MultiSet to e.g. 'Node' and create a new class MultiSet.
    • Make the node class a nested class of the new set class.
    • All modifiers of the list should be members of the set class only.
    • You might expose the node class, but the user must not be able to modify it. It would then just serve as a kind of iterator.
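
One way the encapsulation outlined in this list could look is sketched below. All member names (insert, first, value, following) are made up for illustration, Elem is assumed to be int, and copying is simply disabled to keep the sketch short.

```cpp
#include <cassert>

typedef int Elem;   // assumption; the question does not show Elem

class MultiSet {
public:
    // Exposed read-only: users can walk the nodes but cannot modify them.
    class Node {
    public:
        Elem value() const { return elem; }
        const Node* following() const { return next; }
    private:
        friend class MultiSet;            // only the set may create or relink
        Node(Elem e, Node* n) : elem(e), next(n) {}
        Elem elem;
        Node* next;
    };

    MultiSet() : head(nullptr) {}
    ~MultiSet() {
        while (head != nullptr) { Node* n = head; head = head->next; delete n; }
    }
    MultiSet(const MultiSet&) = delete;   // copying omitted in this sketch
    MultiSet& operator=(const MultiSet&) = delete;

    // The only modifier: always starts at the head, so order cannot break.
    void insert(Elem x) {
        Node** link = &head;
        while (*link != nullptr && (*link)->elem < x)
            link = &(*link)->next;
        Node* n = new Node(x, *link);
        *link = n;
        assert(n->next == nullptr || !(n->next->elem < n->elem));
    }

    int multiplicity(Elem e) const {
        int count = 0;
        for (const Node* n = head; n != nullptr; n = n->next)
            if (n->elem == e) ++count;
        return count;
    }

    const Node* first() const { return head; }

private:
    Node* head;
};
```

Because first() hands out a pointer to const Node and the node's members are private, calling code can iterate with following() and value() but has no way to relink or overwrite nodes behind the set's back.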