Search code examples
c#optimizationmultidimensional-arraydata-processingquadtree

Getting N x N dimension data from quad tree is very slow in c#


I am using quad-tree structure for my data processing application in c#, it is similar to hashlife algorithm. Getting data N x N (eg. 2000 x 2000) dimension data from quad-tree is very very slow.
how can i optimize it for extracting large data from quad tree.

Edit: Here is the code i used to extract the data in recursive manner

public int Getvalue(long x, long y)
{
    if (level == 0)
    {
        return value;
    }
    long offset = 1 << (level - 2);
    if (x < 0)
    {
        if (y < 0)
        {
            return NW.Getvalue(x + offset, y + offset);
        }
        else
        {
            return SW.Getvalue(x + offset, y - offset);
        }
    }
    else
    {
        if (y < 0)
        {
            return NE.Getvalue(x - offset, y + offset);
        }
        else
        {
            return SE.Getvalue(x - offset, y - offset);
        }
    }
}

outer code

    int limit = 500;
    List<int> ExData = new List<int>();
    for (int row = -limit; row < limit; row++)
    {
        for (int col = -limit; col < limit; col++)
        {
            ExData.Add(Root.Getvalue(row, col));
            //sometimes two dimension array
        }
    }

Solution

  • A quadtree or any other structure isn't going to help if you're going to visit every element (i.e. level 0 leaf node). Whatever code gets the value in a given cell, an exhaustive tour will visit 4,000,000 points. Your way does arithmetic over and over again as it goes down the tree at each visit.

    So for element (-limit,-limit) the code visits every tier and then returns. For the next element it visits every tier and then returns and so on. That is very labourious.

    It will speed up if you make the process of adding to the list itself recursively visiting each quadrant once.

    NB: I'm not a C# programmer so please correct any errors here:

    public void AppendValues(List<int> ExData) {
        if(level==0){ 
            ExData.Add(value);
        } else{
            NW.AppendValues(ExData);
            NE.AppendValues(ExData);
            SW.AppendValues(ExData);
            SE.AppendValues(ExData);
        }
    }  
    

    That will append all the values though not in the raster-scan (row-by-row) order of the original code!

    A further speed up can be achieved if you are dealing with sparse data. So if in many cases nodes are empty or even 'solid' (all zero or one value) you could set the nodes to null and then use zero or the solid value.

    That trick works well in Hashlife for Conway Life but depends on your application. Interesting patterns have large areas of 'dead' cells that will always propagate to dead and rarely need considering in detail.

    I'm not sure what 25-40% means as 'duplicates'. If they aren't some fixed value or are scattered across the tree large 'solid' regions are likely to be rare and that trick may not help here.

    Also, if you actually need to only get the values in some region (e.g. rectangle) you need to be a bit cleverer about how you work out which sub-region of each quadrant you need using offset but it will still be far more efficient than 'brute' force tour of every element. Make sure the code realises when the region of interest is entirely outside the node in hand and return quickly.

    All this said if creating a list of all the values in the quad-tree is a common activity in your application, a quad-tree may not be the answer you need. A map simply mapping (row,col) to value is pre-made and again very efficient if there is some common default value (e.g. zero).

    It may help to create an iterator object rather than add millions of items to a list; particularly if the list is transient and destroyed soon after.

    More information about the actual application is required to understand if a quadtree is the answer here. The information provided so far suggests it isn't.