cluster-analysis data-mining dbscan elki

Incremental Clustering with ELKI


I'm using the ELKI library and have implemented clustering using DBSCAN, but since the dataset I'm working with grows over time, I want to use an incremental clustering algorithm. I found this paper about an incremental DBSCAN algorithm. The paper says that the algorithm was implemented with ELKI and that this implementation was contributed to ELKI. Unfortunately, I can't figure out how to use DBSCAN incrementally.


Solution

  • I don't think we have received this contribution to ELKI yet.

    Try contacting the authors. We'd appreciate such a contribution.

    The GriDBSCAN and ParallelDBSCAN implementations in ELKI can be modified to perform an incremental DBSCAN clustering as long as you only have insertions, not removals.

    However, building a nice incremental DBSCAN API is much harder: when and how should "results" be reported? Regular DBSCAN has a clearly defined result, but what is the result of an incremental DBSCAN? How is data stored in between?

    If your data set keeps on growing over time, you may need to change parameters, too. For example, decrease epsilon or increase minpts. Depending on your rate of updates, rerunning DBSCAN may be just as effective.
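To make the insertion-only case mentioned above concrete, here is a minimal sketch in plain Java. It does not use the ELKI API and is not the implementation from the paper; the class `IncrementalDbscanSketch` and its methods `insert` and `clusterOf` are hypothetical names chosen for illustration. It assumes Euclidean distance, a linear-scan range query (a real implementation would use an index), and that `minPts` counts the point itself. Core points are merged with a union-find structure, which works only because insertions can create or merge clusters but never split them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of insertion-only incremental DBSCAN (not ELKI code).
class IncrementalDbscanSketch {
    private final double eps;
    private final int minPts;
    private final List<double[]> points = new ArrayList<>();
    private final List<Integer> neighborCount = new ArrayList<>();
    private final List<Boolean> core = new ArrayList<>();
    private final List<Integer> parent = new ArrayList<>(); // union-find forest

    IncrementalDbscanSketch(double eps, int minPts) {
        this.eps = eps;
        this.minPts = minPts;
    }

    // Adds a point, updates neighbor counts, and merges clusters as needed.
    int insert(double... p) {
        int id = points.size();
        points.add(p.clone());
        neighborCount.add(0);
        core.add(false);
        parent.add(id);
        List<Integer> nbrs = rangeQuery(id); // includes the new point itself
        neighborCount.set(id, nbrs.size());
        List<Integer> newCore = new ArrayList<>();
        for (int q : nbrs) {
            if (q != id) {
                neighborCount.set(q, neighborCount.get(q) + 1);
            }
            if (!core.get(q) && neighborCount.get(q) >= minPts) {
                core.set(q, true);
                newCore.add(q);
            }
        }
        // Every point that just became core is linked to all core neighbors.
        for (int c : newCore) {
            for (int q : (c == id ? nbrs : rangeQuery(c))) {
                if (core.get(q)) {
                    union(c, q);
                }
            }
        }
        return id;
    }

    // Returns a cluster representative id, or -1 for noise.
    // Border points are resolved lazily via the first core neighbor found.
    int clusterOf(int id) {
        if (core.get(id)) {
            return find(id);
        }
        for (int q : rangeQuery(id)) {
            if (core.get(q)) {
                return find(q);
            }
        }
        return -1;
    }

    // Linear scan over all points; returns ids within eps of the query point.
    private List<Integer> rangeQuery(int id) {
        double[] a = points.get(id);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < points.size(); i++) {
            double[] b = points.get(i);
            double sum = 0;
            for (int d = 0; d < a.length; d++) {
                sum += (a[d] - b[d]) * (a[d] - b[d]);
            }
            if (sum <= eps * eps) {
                result.add(i);
            }
        }
        return result;
    }

    private int find(int x) {
        int p = parent.get(x);
        while (p != x) {
            int gp = parent.get(p);
            parent.set(x, gp); // path halving
            x = p;
            p = gp;
        }
        return x;
    }

    private void union(int a, int b) {
        parent.set(find(a), find(b));
    }
}
```

Because labels are computed on demand in `clusterOf`, this sketch sidesteps the question of when results are reported: the caller asks whenever it wants a snapshot. Removals are not supported, since deleting a point could split a cluster and union-find cannot undo merges.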