java arraylist grouping clustered-index

Creating "ordinal" clusters from List<Integer> elements in Java based on the unique set


I am trying to create ordered clusters starting from 1 using the numbers in a List<Integer>.

For example, suppose I have a List<Integer> like [-1, 7, 99, 4, 5, 33, 6, 4, 77, 3, 7, 99, 2, 7]. These numbers are cluster labels returned by an algorithm. The algorithm does not number the clusters consecutively (1, 2, 3, ...); instead the labels "jump" arbitrarily.

So what I want is, more or less, a cleaned version of the cluster labels. The only exception is that any -1 in the list above must remain -1 in the new list of ordered cluster numbers.

To illustrate, from the list above I create a set of unique elements: {-1, 2, 3, 4, 5, 6, 7, 33, 77, 99}. For these unique clusters I would like to create a new numbering, for instance {-1, 1, 2, 3, 4, 5, 6, 7, 8, 9}, which replaces the previous set while keeping -1 intact. Each position in the previous set corresponds to the same position in the new set.

With that new set, I then want to go through the List<Integer> and update it accordingly. For the example above I would get: [-1, 6, 9, 3, 4, 7, 5, 3, 8, 2, 6, 9, 1, 6].

What I have done so far:

import java.util.*;

public class testing {
    public static void main(String[] args) {

    int[] myIntArray = new int[]{-1, 1, 2, 3, 4, 5, 5, -1, 7, 5, 9, 5, 5, 10,
            4, 14, -1, 5, 5, 5, 5, 5, 14, 5, 22, 5, 5, 25, 5, 22, 22, 5, 5, 5, 4, 5, 4, 7, 5, 5, 14, 14, 5,
            5, 22, 9, 2, 5, 22, -1, 47, 5, 5, 5, 5, 5, 4, -1, -1, 5, 5, 22, 5, 5, 5, 9, 5, 5, 5, 5, 65, 5,
            5, 5, 5, 14, 5, 10, 5, -1, 5, 22, 5, 14, 14, 5, 5, 5, 5, 5, 22, 5, 5, 5, 5, 5, -1, -1, 90, 22,
            -1, 92, 47, -1, 65, -1, 47, -1, 5, 1, -1, 7, 47, 92, -1, 9, -1, 9, -1, 103, 47, 3, 14, 107, 1,
            92, -1, 4, -1, 4, 14, -1, 9, -1, -1, 22, -1, 9, 22, 92, 25, 92, 9, 14, -1, 92, 103, 47, 4, -1,
            22, 9, 92, 47, -1, 47, 9, 7, 107, -1, -1, 47, 9, 14, 4, 47, -1, 22, 4, 22, 9, 9, 90, -1, -1, 4,
            4, 22, 22, 103, 47, 47, -1, -1, 9, 14, 9, 4, 4, 22, 22, 159, 9, 103, 4, 22, 4, 159, 90, 4};

    List<Integer> myListInteger = new ArrayList<Integer>(myIntArray.length);

    // passing values to myListInteger from myIntArray
    for (int i : myIntArray) {
        myListInteger.add(i);
    }

    // get distinct numbers in myListInteger: Set
    Set<Integer> distinctNumbersSet = new HashSet<Integer>(myListInteger);

    // convert to List
    List<Integer> distinctIntegerList = new ArrayList<>();
    for (Integer i: distinctNumbersSet) {
        distinctIntegerList.add(i);
    }

    // index to start numbering unique values
    int index = 1;
    boolean increaseIndex = false;


    for (int i = 0; i < distinctIntegerList.size(); i++) {
        for (int j = 0; j < myListInteger.size(); j++ ) {
            if (myListInteger.get(j) == -1) {
                continue;
            }

            if (distinctIntegerList.get(i) == myListInteger.get(j)) {
                myListInteger.set(j, index);
                increaseIndex = true;
                continue;
            }
        }
        if (increaseIndex == true) {
            index++;
            increaseIndex = false;
        }

    }

    // after update the myListInteger, I can get distinct sets again
    Set<Integer> distinctSetAfterUpdate = new HashSet<Integer>(myListInteger);

    System.out.println(myListInteger); // there is a 159 almost at the end, while it is expected that it should be 18

    for (Integer ind: distinctSetAfterUpdate) {
        System.out.println(ind + ": " +  Collections.frequency(myListInteger, ind));
    }



    }
}

Problem I get

The highest cluster label in the list, 159, appears twice, but it is not fully mapped to the new cluster 18. When I print the distribution under the new mapping, 159 still shows up as a cluster with 1 value and 18 shows up with 1 value as well, whereas by the logic of my code the new cluster numbers should never exceed the size of the set.

So my current output for visualizing the distribution is:

-1: 33
1: 3
2: 2
3: 2
4: 17
5: 56
6: 4
7: 16
8: 2
9: 12
10: 19
11: 2
12: 12
13: 2
14: 3
15: 7
16: 4
17: 2
18: 1
159: 1

while I want to get

-1: 33
1: 3
2: 2
3: 2
4: 17
5: 56
6: 4
7: 16
8: 2
9: 12
10: 19
11: 2
12: 12
13: 2
14: 3
15: 7
16: 4
17: 2
18: 2

Can anyone help me understand why my code maps only one of the two 159 values to 18 instead of both?


Solution

  • The problem is in this line:

    if (distinctIntegerList.get(i) == myListInteger.get(j))
    

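    Your lists hold Integer objects, not primitive ints. Applied to two Integer references, == compares object identity, not numeric value. It appears to work for the smaller labels because autoboxing reuses cached Integer objects for values from -128 to 127 (by default). 159 lies outside that range, so each occurrence is boxed into a separate object, and only the one instance that was also stored in the set compares as identical. That is why exactly one 159 became 18. You should always use the equals method when comparing reference types (Integer, Double, Long).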

    Change that line to

    if (distinctIntegerList.get(i).equals(myListInteger.get(j)))
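
As an alternative to patching the nested loops, the relabeling described in the question can be sketched with a sorted set and a lookup map. This is only a minimal sketch, not the asker's code: the class name ClusterRelabel and the method relabel are hypothetical names introduced here, and it assumes that new labels should follow the ascending order of the old ones, as in the question's example. It builds a new list instead of updating in place, so a newly assigned label can never be mistaken for an old label that has not been processed yet, and it does not depend on HashSet iteration order, which is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class ClusterRelabel {

    // Hypothetical helper (not from the question): returns a new list in which
    // every label other than -1 is replaced by its 1-based rank among the
    // distinct labels in ascending order. -1 is passed through unchanged.
    public static List<Integer> relabel(List<Integer> labels) {
        // TreeSet iterates in ascending order and compares by value.
        TreeSet<Integer> distinct = new TreeSet<>(labels);
        distinct.remove(-1); // -1 keeps its meaning, so it gets no new number

        // Old label -> new consecutive label, starting at 1.
        Map<Integer, Integer> newLabel = new HashMap<>();
        int next = 1;
        for (Integer old : distinct) {
            newLabel.put(old, next++);
        }

        List<Integer> result = new ArrayList<>(labels.size());
        for (int old : labels) { // unboxed to int, so == below compares values
            if (old == -1) {
                result.add(-1);
            } else {
                // Map lookups use equals()/hashCode(), so labels above 127 match too.
                result.add(newLabel.get(old));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(-1, 7, 99, 4, 5, 33, 6, 4, 77, 3, 7, 99, 2, 7);
        // prints [-1, 6, 9, 3, 4, 7, 5, 3, 8, 2, 6, 9, 1, 6]
        System.out.println(relabel(input));
    }
}
```

With this approach the distinct labels are visited once and the list once, rather than scanning the whole list for every distinct label.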