nested-loops intersystems-cache

Nested loops for large datasets


As part of the project I'm working on, I have to build various reports. To build these reports I gather data from very large data sets, so I'm trying to optimize my loop structures to minimize the checks inside the inner loops.

One of the globals (key-value type of data set) I'm often using has roughly the following structure:

dataStatus->inputDate->state->region->street->data
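
As an illustration only (the global name ^GLOBAL, the subscript values, and the stored value are all made up), one entry in such a global could look like this, with every level before the data being a subscript:

    set ^GLOBAL("Processed",63000,"StateA","RegionA","StreetA",1)="some data"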

I have to loop through each of the preceding sections to get to the data that I'm after. The problem is that while I always have the state (it is a required field in the input screen), the region and street may be omitted. If the street is left blank I must loop through all the streets in the region. Likewise, if the region is left blank I must loop through all the regions within the state and again through all the streets in each of the regions.

The easy solution would be something like this:

loop through dataStatuses {
    loop through dates {
        if street is set {
            get data
        } else if region is set {
            loop through streets {
                get data
            }
        } else {
            loop through regions {
                loop through streets {
                    get data
                }
            }
        }
    }
}

However, I would really like to avoid running those checks inside the loops. The best solution I have come up with so far is something like this:

if street is set {
    loop through dataStatuses {
        loop through dates {
            get data
        }
    }
} else if region is set {
    loop through dataStatuses {
        loop through dates {
            loop through streets {
                get data
            }
        }
    }
} else {
    loop through dataStatuses {
        loop through dates {
            loop through regions {
                loop through streets {
                    get data
                }
            }
        }
    }
}

Is there a more elegant solution than this, maybe something that would cater for n levels before reaching the data?

Any help will be much appreciated.


Solution

  • I have come up with a solution that works well for me. It might not be the perfect solution I was initially looking for, but it suits me for these reasons:

    1. When I'm done with this work, other programmers will periodically work on the reports, and I want to make their lives as easy as possible in this regard.
    2. A report generator would be nice, but building one is not the most productive thing to do at the moment.

    I believe my solution is fairly easy to read and maintain without sacrificing too much performance:

    GatherData
        // Choose the entry point once, based on which filters were supplied
        set command="do LoopThroughRegions(status,date,state,.dataCollection)"
        if ($length(street)>0){
            set command="do LoopThroughData(status,date,state,region,street,.dataCollection)"
        } elseif ($length(region)>0){
            set command="do LoopThroughStreets(status,date,state,region,.dataCollection)"
        }
    
        set status=""
        for{
            set status=$order(^GLOBAL(status)) quit:status=""
            // Restart the date range for every status; otherwise the second
            // status would continue from where the previous date loop stopped
            set date=fromDate-1
            for{
                set date=$order(^GLOBAL(status,date)) quit:((date>toDate)||(date=""))
                xecute (command)
            }
        }
     quit
    
    LoopThroughRegions(status,date,state,dataCollection)
        set currentRegion="" 
        for{
            // Regions are the subscripts directly under the state
            set currentRegion=$order(^GLOBAL(status,date,state,currentRegion)) quit:currentRegion=""
            do LoopThroughStreets(status,date,state,currentRegion,.dataCollection)
        }
     quit
    
    LoopThroughStreets(status,date,state,region,dataCollection)
        set currentStreet=""
        for{
            set currentStreet=$order(^GLOBAL(status,date,state,region,currentStreet)) quit:currentStreet=""
            do LoopThroughData(status,date,state,region,currentStreet,.dataCollection)
        }
     quit
    
    LoopThroughData(status,date,state,region,street,dataCollection)
        set dataItem="" 
        for{
            set dataItem=$order(^GLOBAL(status,date,state,region,street,dataItem)) quit:dataItem=""
            // Do stuff
        }
     quit
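
    For the n-level case raised in the question, one possible generalisation is a single recursive walker instead of one routine per level. This is an untested sketch rather than the code used above: Walk and the filters array are hypothetical names, and it relies on subscript indirection (@ref@(...)) and $name to extend the global reference one level at a time. filters holds the number of optional levels, and filters(n) holds the value supplied for level n, or "" to loop over everything at that level.

    Walk(ref,filters,level,dataCollection) {
        if level>filters {
            // All optional levels are resolved: ref now names the node above the data
            set dataItem=""
            for {
                set dataItem=$order(@ref@(dataItem)) quit:dataItem=""
                // Do stuff
            }
            quit
        }
        set fixed=$get(filters(level))
        if fixed'="" {
            // A value was supplied for this level: descend directly, no loop
            do Walk($name(@ref@(fixed)),.filters,level+1,.dataCollection)
            quit
        }
        // No value supplied: visit every subscript at this level
        set sub=""
        for {
            set sub=$order(@ref@(sub)) quit:sub=""
            do Walk($name(@ref@(sub)),.filters,level+1,.dataCollection)
        }
    }

    With region and street as the two optional levels, the filters would be set up once before the loops, and the call would take the place of the xecute line:

        set filters=2,filters(1)=region,filters(2)=street
        do Walk($name(^GLOBAL(status,date,state)),.filters,1,.dataCollection)

    The trade-off is one check per level for every node visited, so this may give up some speed in exchange for handling any depth. Note also that if a street is supplied without a region, this sketch looks for that street in every region, which differs from the if/elseif logic in GatherData.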
    

    Unless a better solution is provided I will select my own answer for future reference. Hopefully it might even help someone else as well.