repast-simphony

Repast: slow initialization reason


My model is very slow to initialize: it takes 40 seconds to finish.

My code has two major parts: 1) a CSV data reader runs first to load the data, which takes less than 1 second to read and process 35,000+ lines (see the first code listing below); 2) the agents and edges are then initialized. In particular, the edge initialization uses the data loaded by the CSV reader (see the second code listing below).

First part: CSVReader code

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DataReader {

    // One inner list per CSV row, one String per field.
    private final List<List<String>> master = new ArrayList<>();

    public void ReadFromCSV(String csvFile) {

        String line;
        String csvSplitBy = ",";

        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            // The first line is the header: print it and skip it.
            System.out.println("Header " + br.readLine());
            while ((line = br.readLine()) != null) {
                // Use comma as separator and store the fields in a fresh list per row.
                master.add(new ArrayList<>(Arrays.asList(line.split(csvSplitBy))));
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println(master);
    }

    public List<List<String>> getMaster() {
        return master;
    }

}

[Image: the input file used by CSVReader]

Second part: edge (route) initialization code. I suspect that the query loop consumes most of the initialization time:

//      add route network
        Network<Object> net = (Network<Object>)context.getProjection("IntraCity Network");
        IndexedIterable<Object> local_hubs = context.getObjects(LocalHub.class);
        for (int i = 0; i <= CSV_reader_route.getMaster().size() - 1; i++) {
            String source = (String) CSV_reader_route.getMaster().get(i).get(0);
            String target = (String) CSV_reader_route.getMaster().get(i).get(3);
            double dist = Double.parseDouble((String) CSV_reader_route.getMaster().get(i).get(6));
            double time = Double.parseDouble((String) CSV_reader_route.getMaster().get(i).get(7));

            Object source_hub = null;
            Object target_hub = null;
            Query<Object> source_query = new PropertyEquals<Object>(context, "hub_code", source);
            for (Object o : source_query.query()) {
                if (o instanceof LocalHub) {
                    source_hub = (LocalHub) o;
                }
                if (o instanceof GatewayHub) {
                    source_hub = (GatewayHub) o;
                }
            }

            Query<Object> target_query = new PropertyEquals<Object>(context, "hub_code", target);
            for (Object o : target_query.query()) {
                if (o instanceof LocalHub) {
                    target_hub = (LocalHub) o;
                }
                if (o instanceof GatewayHub) {
                    target_hub = (GatewayHub) o;
                }
            }

            if (net.getEdge(source_hub, target_hub) == null) {
                Route this_route = (Route) net.addEdge(source_hub, target_hub);
                context.add(this_route);
                this_route.setDist(dist);
                this_route.setTime(time);
            }
        }
    }

UPDATE: in my tests, I found that this line dramatically slows down the initialization:

context.add(this_route);

Without this line, initialization took only 3 seconds. With it, the model took 20 seconds! What is the underlying mechanism of context.add()? How can I fix this?


Solution

  • Once you add the edges to the context, every query becomes more expensive, because the search space in the context grows with each edge. So it may help not to add the edges to the context inside the CSV reader loop. Create each edge as you do now, but add it to a list rather than to the context. Then, when the reader loop has finished, iterate through that list and add the edges to the context.

    If this doesn't help, then at least we know that adding to the context has some additional side effect that we can try to track down.
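
The deferred-add pattern suggested above can be sketched in plain Java. This is only an illustration, not Repast code: `ToyContext`, `Hub`, `Route`, `build` and `objectsScanned` are hypothetical stand-ins, and the sketch assumes that a property query scans every object currently in the context, which is the behaviour the reasoning above relies on.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: Hub, Route and ToyContext are hypothetical stand-ins, not Repast classes. */
class DeferredEdgeAdd {

    /** Hypothetical stand-in for LocalHub / GatewayHub. */
    static class Hub {
        final String hubCode;
        Hub(String hubCode) { this.hubCode = hubCode; }
    }

    /** Hypothetical stand-in for the Route edge class. */
    static class Route {
        final Hub source;
        final Hub target;
        Route(Hub source, Hub target) { this.source = source; this.target = target; }
    }

    /** Toy context. ASSUMPTION: a query visits every stored object, so its cost grows with size. */
    static class ToyContext {
        private final List<Object> objects = new ArrayList<>();
        long objectsScanned = 0; // counts how many objects all queries have visited

        void add(Object o) { objects.add(o); }

        int size() { return objects.size(); }

        Hub findHub(String code) {
            Hub found = null;
            for (Object o : objects) {
                objectsScanned++;
                if (o instanceof Hub && ((Hub) o).hubCode.equals(code)) {
                    found = (Hub) o;
                }
            }
            return found;
        }
    }

    /** Creates one route per row, adding it to the context immediately or only after the loop. */
    static ToyContext build(List<String> hubCodes, List<String[]> rows, boolean deferAdds) {
        ToyContext context = new ToyContext();
        for (String code : hubCodes) {
            context.add(new Hub(code));
        }

        List<Route> pending = new ArrayList<>();
        for (String[] row : rows) {
            Hub source = context.findHub(row[0]);
            Hub target = context.findHub(row[1]);
            Route route = new Route(source, target);
            if (deferAdds) {
                pending.add(route);  // keep the context small while querying
            } else {
                context.add(route);  // every later query now scans this route too
            }
        }

        // Add the deferred routes once no more queries are needed.
        for (Route route : pending) {
            context.add(route);
        }
        return context;
    }
}
```

Calling `build(hubCodes, rows, true)` and `build(hubCodes, rows, false)` on the same input leaves the same objects in the context, but the deferred version scans fewer objects in total, and the gap widens with the number of rows. Whether this accounts for the full slowdown in the real model is exactly what the experiment above is meant to find out.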