r, dplyr, stringdist, record-linkage

Displaying corresponding values in data frame in R


Please check the code below. I have created a data frame from three variables. The variable "y123" computes the similarity between each value of a1 and each value of a2, which gives 16 values in total. When a particular "a1" value is compared with a particular "a2" value, I want the "a3" value that belongs to that "a2" value displayed next to the result. So the result should be a data frame with the y123 column and a second column in which the "a3" values repeat four times, i.e. 16 values. Thanks, and please help.

library(stringdist)
library(RecordLinkage)
a1 = c(103,120,142,153)
a2 = c(113,453,142,102)
a3 = c("a1","b1","c1","d1")
a1 = as.character(a1)
a2 = as.character(a2)
a3 = as.character(a3)
a123 = data.frame(a1,a2,a3)
y123 = sapply(a1, function(i) RecordLinkage::levenshteinSim(i,a2))
b1 = c(y123)
b1

I need something like this:

new_data = data.frame(b1,new_column)

Solution

  • I think this is what you want. Flatten y123 with c() and repeat a3 so that it lines up with each block of four values:

    data.frame(y123 = c(y123), a3 = rep(a3, times = length(a1)))
    #        y123 a3
    #1  0.6666667 a1
    #2  0.3333333 b1
    #3  0.3333333 c1
    #4  0.6666667 d1
    #5  0.3333333 a1
    #6  0.0000000 b1
    #7  0.3333333 c1
    #8  0.3333333 d1
    #9  0.3333333 a1
    #10 0.0000000 b1
    #11 1.0000000 c1
    #12 0.6666667 d1
    #13 0.6666667 a1
    #14 0.6666667 b1
    #15 0.3333333 c1
    #16 0.3333333 d1
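
If it also helps to see which a1 and a2 values each similarity belongs to, the same idea can be sketched with expand.grid(). This is only a sketch under stated assumptions: to keep it runnable without extra packages it swaps in base R's utils::adist() for RecordLinkage::levenshteinSim(), assuming the similarity is 1 minus the edit distance divided by the longer string's length (that gives the same numbers as above for this data). The names pairs, d, sim and result are made up for illustration.

```r
# Same inputs as in the question
a1 <- c("103", "120", "142", "153")
a2 <- c("113", "453", "142", "102")
a3 <- c("a1", "b1", "c1", "d1")

# Edit distance for every pair: rows follow a2, columns follow a1
d <- utils::adist(a2, a1)

# Assumed normalization: 1 - distance / length of the longer string
sim <- 1 - d / outer(nchar(a2), nchar(a1), pmax)

# All (a2, a1) combinations; a2 varies fastest, the same order as c(sim)
pairs <- expand.grid(a2 = a2, a1 = a1, stringsAsFactors = FALSE)

result <- data.frame(
  a1   = pairs$a1,
  a2   = pairs$a2,
  a3   = rep(a3, times = length(a1)),  # a3 follows a2, repeated per a1 value
  y123 = c(sim),
  stringsAsFactors = FALSE
)
```

With the labels in place, something like result[result$a1 == "142", ] pulls out the four comparisons for a single a1 value.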