I made some C# code that outputs a SortedDictionary <int index, list<int>values>>
Where the index always starts with lowest int of the list<> that follows. Here is a short example of input (in reality the input is larger):
index- Values<>
2 - 2,3,6,7
3 - 3,5
5 - 5,7,9
11 - 11,12,12
Now I want to do some re-orderning in here. The values are linked indexes. I want to sort them so that connected indexes are grouped together, with no double indexes. That should result in output
2 - 2,3,5,6,7,9
11 - 11,12
Initially i had problems working with a sortedDictionary using a foreach while also decreasing the dictionary set size. I solved that, and now give an update of this problem description with my latest code. It doesnt use a foreach anymore, and some sorting problems seam to be fixed now, but as a side effect it became pretty complex and large. And i doubt if it should be so complex, or might be written shorter, more simple.
Each List i call a tree, where is thoe dictionary are trees And Cursor or c i use like reading out text digit by digit from screen.
For the moment I put it in a small concept function code in a console app. Just to check if it all works. Testing is quite complex, since number sets can be complexly linked. So its not directly visible if you got lots of sets and a lots of number how they should be sorted. Therefore manually checking code validity and results isn't easy either.
While I am not yet sure if it now indeed does work for 100%. It seams to work better then before. But i think this code is far from perfect, as i walk the set of trees twice. With a pre-sort and a final-sort.
static SortedDictionary<int, List<int>> NewSort(SortedDictionary<int, List<int>> trees)
{
bool debugmode = false;
//pre sort
List<int> indexTree = new List<int>();
foreach (KeyValuePair<int, List<int>> tree in trees)
{
indexTree.Add(tree.Key);
}
for (int i = 0; i < indexTree.Count; i++)
{
int cursor = 1;
List<int> tree = new List<int>();
int index = indexTree[i];
tree = trees[index];
while ((tree !=null)&& (cursor<tree.Count) )
{
int c = tree[cursor ];
if (trees.ContainsKey(c))
{
if (trees[c] != null)
{
List<int> u = new List<int>();
u = trees[c];
tree.AddRange(u);
tree.Sort();
trees[index] = tree;
trees[c] = null;
}
}
cursor++;
}
}
for (int i = trees.Count; i > 0; i--)
{
int c = indexTree[i - 1];
if (trees[c] == null)
{ trees.Remove(indexTree[i - 1]); }
else
{
trees[c] = trees[c].Distinct().ToList(); //removing all duplicates
}
}
indexTree = new List<int>();
//resort.
foreach (KeyValuePair<int, List<int>> tree in trees)
{
indexTree.Add(tree.Key);
if(debugmode) Console.WriteLine("* " +DumpIntList(trees[tree.Key]));
}
for (int i = (indexTree.Count); i > 0; i--)
{
if (debugmode) Console.WriteLine(i.ToString());
List<int> tree = new List<int>();
tree = trees[indexTree[i-1]];
for (int j = 0; j < tree.Count; j++)
{
int c = tree[j];
for (int k = (i - 2); k > 0; k--)
{
List<int> compareTree = new List<int>();
compareTree = trees[indexTree[k]];
if (compareTree.IndexOf(c) != -1) // found !
{
if (debugmode) Console.Write("found " + c.ToString() + " from ");
if (debugmode) Console.WriteLine(DumpIntList(tree) + " in (" + DumpIntList(compareTree)+ ")");
tree.Remove(c); // or we would create a duplicate
compareTree.AddRange(tree);
compareTree = compareTree.Distinct().ToList(); //removing doubles again, doubles as side effect of merging
compareTree.Sort();
trees.Remove(indexTree[i - 1]);
trees[indexTree[k]] = compareTree;
}
}
}
}
return trees;
}
Maybe what i try to do inst that clear to some, what i try to do here is that i try to look if series have overlapping numbers, and if so merge them. Where each series is always sorted and starts with the lowest number of that series. As I found recently this might be a version of the UnionFind problem. The problem also appears in Blob detection, and finding what web pages are link eachother in a set of webpages. (but my data is a strange set lab measurements).
The difficulty is that there are lots of series, and it might be not directly clear if they are connected
1-3-4
4-7-9
11-12
would result in 2 series :
1) 1-3-4-7-9
2) 11-12
But after you add series 12-3 then it would all become one series.
Some more test data :
2 - 2,3,5,6,7 // note my data is always ordered like this
5 - 5,7,9 // dictionary starts with lowest key
11 - 11,12,12,27,30,31 // each list inside a tree key
22 - 22,27 // is ordered low to high
23 - 23,25 // lowest int, equals the dict key.
28 - 28,30
34 - 34
Output using above function
2 - 2,3,5,6,7,9
11 - 11,12,22,27,28,30,31
23 - 23,25
34 - 34
So while the code seams to work now, I highly doubt its ideal code, i irritate the trees set twice. And so i wonder if better solutions are possible. It might also be that the code doesnt do what i hope it todo; as i'm still testing it.
Well i decreased the size of the function and improved it. It should now be a single irritation over all trees. Unless someone knows a better answer, i think its "the" answer. The code has been tested to work with larger sets, and i couldnt spot errors.
static SortedDictionary<int, List<int>> NewSort(SortedDictionary<int, List<int>> trees)
{
bool debugmode = false;
//pre sort
List<int> indexTree = new List<int>();
foreach (KeyValuePair<int, List<int>> tree in trees)
{
indexTree.Add(tree.Key);
if(debugmode) Console.WriteLine("* " +DumpIntList(trees[tree.Key]));
}
for (int i = (indexTree.Count); i > 0; i--)
{
if (debugmode) Console.WriteLine(i.ToString());
List<int> tree = new List<int>();
tree = trees[indexTree[i-1]];
for (int j = 0; j < tree.Count; j++)
{
int c = tree[j];
for (int k = (i - 2); k > -1; k--) // k can be 0 but i can minimally be 1
{
List<int> compareTree = new List<int>();
compareTree = trees[indexTree[k]]; // for loop > checking all trees
if (compareTree.IndexOf(c) != -1) // found !
{
if (debugmode) Console.Write("found " + c.ToString() + " from ");
if (debugmode) Console.WriteLine(DumpIntList(tree) + " in (" + DumpIntList(compareTree)+ ")");
// tree.Remove(c); // or we would create a duplicate
compareTree.AddRange(tree);
compareTree = compareTree.Distinct().ToList();
compareTree.Sort();
trees.Remove(indexTree[i - 1]);
trees[indexTree[k]] = compareTree;
j =tree.Count; //break from more checks. maybe dirty code but it increases speed
break; //break checking loop on all trees for current tree
}
}
}
}
return trees;
}