Search code examples
neo4jtime-complexitygraph-databaseslevenshtein-distanceedit-distance

Does the Levenshtein (Edit Distance) algorithm perform faster than O(n*m) in a native graph database?


Would the Levenshtein (Edit Distance) have better time complexity in a native graph database such as Neo4j than the current limit of O(n*m)? If so, why?


Solution

  • Since the implementations of apoc.text.levenshteinDistance and apoc.text.levenshteinSimilarity simply rely on org.apache.commons.text.similarity.LevenshteinDistance to do the calculation, the APOC library does not introduce any complexity improvements.

    In any case, such a calculation should just compare 2 strings of text and should not in any way rely on the graphical nature of the DB.

    And finally, it has been proven that the complexity cannot be improved (unless the Strong Exponential Time Hypothesis is wrong).