I want to build index pairs
on the condition that the values of two columns
are equal in both of the DataFrames being compared. Can this be done with the Index
class of recordlinkage?
# dfg and dfm are DataFrames that both contain the columns 'N_name' and 'N_cp'
import recordlinkage as rl
indexer_try = rl.Index()
indexer_try.block('N_name','N_name','N_cp','N_cp')
candidate_links = indexer_try.index(dfg, dfm)
I expected the class to create a MultiIndex containing the index pairs that match these criteria.
Instead I got the error: __init__() takes from 1 to 3 positional arguments but 5 were given
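The column names have to be passed as lists: block() takes the columns of the first DataFrame as its first argument and the columns of the second DataFrame as its second argument, so four separate strings are too many positional arguments.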
# Indexation step
import recordlinkage as rl
indexer = rl.Index()
indexer.block(['N_name'],['N_name']) # 25k
indexer.block(['N_address', 'N_cp'],['N_address','N_cp']) #211k
indexer.block('latlng', 'latlng') # 320k
candidate_links = indexer.index(dfg, dfm)