data-mining outliers elki

Top n outliers in ResultWriter


I am working with a large, high-dimensional dataset, so I need only the top N outliers from the output of ResultWriter. Is there an option in ELKI to get just the top N outliers from this output?


Solution

  • The ResultWriter is some of the oldest code in ELKI, and needs to be rewritten. It's rather generic - it tries to figure out how to best serialize output as text.

    If you want some specific format, or a specific subset, the proper way is to write your own ResultHandler. There is a tutorial for writing a ResultHandler.

    If you want to find the input coordinates in the result,

    Database db = ResultUtil.findDatabase(baseResult);
    Relation<NumberVector> rel = db.getRelation(TypeUtil.NUMBER_VECTOR_VARIABLE_LENGTH);
    

    will return the first relation containing numeric vectors.

    To iterate over the objects sorted by their outlier score, use the loop below. To keep only the top N, stop the loop after N iterations:

    OrderingResult order = outlierResult.getOrdering();
    DBIDs ids = order.order(order.getDBIDs());
    for (DBIDIter it = ids.iter(); it.valid(); it.advance()) {
      // Output as desired.
    }
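
If sorting every object is too expensive for a large dataset, the top N can also be selected with a bounded min-heap. The following is a minimal sketch in plain Java and deliberately uses no ELKI classes: the class `TopNOutliers`, the method `topN`, and the score array are hypothetical names made up for illustration, and it assumes that a larger score means a stronger outlier.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

/** Library-independent sketch: pick the n highest-scoring objects without a full sort. */
final class TopNOutliers {
  /**
   * Returns the indices of the n largest scores, strongest outlier first.
   * Assumption: a larger score means "more outlying"; negate the scores otherwise.
   */
  static int[] topN(double[] scores, int n) {
    if (n <= 0 || scores.length == 0) {
      return new int[0];
    }
    // Min-heap on score: the weakest of the current candidates sits at the head.
    PriorityQueue<Integer> heap =
        new PriorityQueue<>(Comparator.comparingDouble((Integer i) -> scores[i]));
    for (int i = 0; i < scores.length; i++) {
      if (heap.size() < n) {
        heap.add(i);
      } else if (scores[i] > scores[heap.peek()]) {
        heap.poll(); // Evict the weakest candidate.
        heap.add(i);
      }
    }
    // Drain the heap from weakest to strongest, filling the result back to front.
    int[] result = new int[heap.size()];
    for (int pos = result.length - 1; pos >= 0; pos--) {
      result[pos] = heap.poll();
    }
    return result;
  }

  public static void main(String[] args) {
    double[] scores = {0.1, 0.9, 0.5, 0.7, 0.3}; // made-up example scores
    System.out.println(Arrays.toString(topN(scores, 2))); // prints [1, 3]
  }
}
```

The heap never holds more than N entries, so memory stays small and the cost is roughly O(M log N) for M objects. In a custom ResultHandler, the same idea would apply to the scores read from the outlier result instead of a plain array.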