Tags: r-tree, dbscan, elki

ELKI DBSCAN R* tree index


In the MiniGUI, I can see the db.index parameter. How do I set it to tree.spatial.rstarvariants.rstar.RStarTreeFactory from Java code?

I have implemented:

params.addParameter(AbstractDatabase.Parameterizer.INDEX_ID,tree.spatial.rstarvariants.rstar.RStarTreeFactory);

For the second argument of addParameter(), the class tree.spatial...RStarTreeFactory is not found.

// Setup parameters:
ListParameterization params = new ListParameterization();
params.addParameter(FileBasedDatabaseConnection.Parameterizer.INPUT_ID,
    fileLocation);
params.addParameter(AbstractDatabase.Parameterizer.INDEX_ID,
    RStarTreeFactory.class);

I am getting a NullPointerException. Did I use RStarTreeFactory.class correctly?


Solution

  • The ELKI command line (and the MiniGUI, which is a command-line builder) allows you to specify shorthand class names, leaving out the package prefix of the implemented interface.

    The full command-line documentation shows:

    -db.index <object_1|class_1,...,object_n|class_n>
        Database indexes to add.
        Implementing de.lmu.ifi.dbs.elki.index.IndexFactory
        Known classes (default package de.lmu.ifi.dbs.elki.index.):
        -> tree.spatial.rstarvariants.rstar.RStarTreeFactory
        -> ...
    

    That is, for this parameter, the package prefix de.lmu.ifi.dbs.elki.index. may be omitted.
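To make the shorthand idea concrete, here is a minimal sketch of resolving a class name against a default package prefix. It is not ELKI's actual implementation: `ShorthandResolver` and `resolve` are hypothetical names, and JDK classes stand in for ELKI classes so the example runs without ELKI on the classpath.

```java
// Hypothetical helper, not part of ELKI: shows one way a shorthand class
// name could be resolved against a default package prefix.
class ShorthandResolver {
    /** Try the name as given first, then with the default prefix prepended. */
    static Class<?> resolve(String name, String defaultPrefix) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException first) {
            try {
                return Class.forName(defaultPrefix + name);
            } catch (ClassNotFoundException second) {
                throw new IllegalArgumentException("Class not found: " + name, second);
            }
        }
    }

    public static void main(String[] args) {
        // Shorthand: resolved only after the prefix "java.util." is prepended.
        System.out.println(resolve("concurrent.ConcurrentHashMap", "java.util.").getName());
        // Fully qualified: resolved directly, the prefix is never used.
        System.out.println(resolve("java.util.ArrayList", "java.util.").getName());
    }
}
```

In plain Java code there is no such lookup: the compiler needs either the fully qualified name or an import, which is why the shorthand from the command line does not compile as-is.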

    The full class name thus is:

    de.lmu.ifi.dbs.elki.index.tree.spatial.rstarvariants.rstar.RStarTreeFactory
    

    Alternatively, just type RStarTreeFactory in your Java code and let Eclipse add the import automatically:

    params.addParameter(AbstractDatabase.Parameterizer.INDEX_ID,
        RStarTreeFactory.class);
    // Bulk loading static data yields much better trees and is much faster, too.
    params.addParameter(RStarTreeFactory.Parameterizer.BULK_SPLIT_ID, 
        SortTileRecursiveBulkSplit.class);
    // Page size should fit your dimensionality.
    // For 2-dimensional data, use page sizes less than 1000.
    // Rule of thumb: 15...20 * (dim * 8 + 4) is usually reasonable
    // (for in-memory bulk-loaded trees)
    params.addParameter(AbstractPageFileFactory.Parameterizer.PAGE_SIZE_ID, 300);
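The rule of thumb in the comments above can be worked through with a few lines of arithmetic. This is only a sketch: `PageSizeRule` and its methods are hypothetical names, not ELKI API, and reading the `8` as bytes per double coordinate is an assumption.

```java
// Hypothetical helper, not part of ELKI: evaluates the rule of thumb
// pageSize = entries * (dim * 8 + 4) for 15 to 20 entries per page.
class PageSizeRule {
    /** Per-entry term of the rule of thumb (assumed: 8 bytes per coordinate plus 4). */
    static int entryBytes(int dim) {
        return dim * 8 + 4;
    }

    /** Page size in bytes for the given dimensionality and entries per page. */
    static int pageSize(int dim, int entriesPerPage) {
        return entriesPerPage * entryBytes(dim);
    }

    public static void main(String[] args) {
        for (int dim : new int[] {2, 3, 10}) {
            System.out.printf("dim=%d: %d to %d bytes%n",
                dim, pageSize(dim, 15), pageSize(dim, 20));
        }
        // prints:
        // dim=2: 300 to 400 bytes
        // dim=3: 420 to 560 bytes
        // dim=10: 1260 to 1680 bytes
    }
}
```

For 2-dimensional data the lower bound is 300, which matches the page size used in the snippet above.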
    

    See also: Geo Indexing example in the tutorial folder.