r string fuzzy-comparison stringdist

Using stringsim in stringdist


I'm using the stringdist package to compare some vectors of strings, but I don't get the result I expect when I try it on my own data.

I want to do this:

stringsim('PANDIAN', 'PANIAN', method="lv")
[1] 0.8571429

but applied to two columns in a data frame:

stringsim(testdf.lv$Last[1], testdf.lv$matchedname[1], method="lv")

But I get this error:

Error in UseMethod("lengths") : 
  no applicable method for 'lengths' applied to an object of class "factor"

I need to be able to do this because ideally, I would replace the row numbers with an i and run it in a loop. Is this even possible? I tried looking for similar errors but the other questions were not very helpful.


Solution

  • So thanks to @MrFlick. It turns out the data I was using in the column:

    testdf.lv$Last
    

    was mistakenly stored as a factor instead of a character vector. Converting that column to character with the following:

    testdf.lv$Last <- as.character(testdf.lv$Last)
    

    Fixed the error and I was able to rewrite the code into a for loop to go through the entire dataframe.
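
For anyone hitting the same error, here is a minimal sketch of the fix without the loop. The data frame below is a made-up stand-in for testdf.lv (the real data is not shown in the question), and it assumes matchedname may be a factor too, so both columns are converted. Since stringsim() is vectorized over its two arguments, a single call should compare the columns row by row.

```r
library(stringdist)

# Hypothetical stand-in for testdf.lv; the values are invented for illustration.
testdf.lv <- data.frame(
  Last        = c("PANDIAN", "SMITH"),
  matchedname = c("PANIAN", "SMYTH"),
  stringsAsFactors = TRUE  # mimic the factor columns described in the question
)

# Convert both compared columns from factor to character.
testdf.lv$Last        <- as.character(testdf.lv$Last)
testdf.lv$matchedname <- as.character(testdf.lv$matchedname)

# One vectorized call: element i of Last is compared with element i of matchedname.
testdf.lv$similarity <- stringsim(testdf.lv$Last, testdf.lv$matchedname, method = "lv")

print(testdf.lv)
```

The similarity column name is only an illustration. If the data comes from read.csv() or data.frame() on an R version before 4.0.0, passing stringsAsFactors = FALSE when loading avoids the factor conversion in the first place.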