I am building an inverted index. For this purpose I am reading values from a file. Each line of the file has the form:
document_Id'\t'term_Id'\t'pos_1'\t'pos_2...'\t'pos_n
This is a forward index representation. I want to convert it into an inverted index, which should look like:
term_Id'\t'"doc_Id:pos1,pos2...posn""doc_Id:pos1,pos2...posn"
For that purpose I am using a nested defaultdict whose innermost values are lists. This is my function:
nestedDict = defaultdict(lambda:defaultdict(list))
def getInfo(line):
    global nestedDict
    tokens = re.split(r'\t+',line)
    docInfo = int(tokens[0]) #Set document Id
    termId = int(tokens[1]) #Set Term Id
    currentPosition = int(tokens[2])
    nestedDict[str(termId)][str(docInfo)] = str(currentPosition)
    if len(tokens) > 3 :
        for i in range(3,len(tokens)):
            position = int(tokens[i])-currentPosition
            currentPosition = currentPosition + position
            nestedDict[str(termId)][str(docInfo)].append(currentPosition)
It gives me an error saying that str has no method append. I am new to Python. Any help would be highly appreciated.
Your nested defaultdict makes nestedDict[...][...] be a list, but then you assign a string to it. I don't think you need that assignment anyway: why not just let the loop handle all of the positions?
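A minimal sketch of that suggestion, keeping the names nestedDict and getInfo from the question. It assumes each line is tab-separated with at least a document id, a term id, and one position. The line.strip() call and the sample lines at the bottom are my own additions for illustration, not part of the original code.

```python
import re
from collections import defaultdict

# term id -> document id -> list of positions
nestedDict = defaultdict(lambda: defaultdict(list))

def getInfo(line):
    # Strip surrounding whitespace (assumption: lines may end with a newline)
    tokens = re.split(r'\t+', line.strip())
    docInfo = str(int(tokens[0]))  # document id
    termId = str(int(tokens[1]))   # term id
    # Start at index 2 so the first position is appended like all the others;
    # the inner defaultdict creates the empty list on first access.
    for token in tokens[2:]:
        nestedDict[termId][docInfo].append(int(token))

# Hypothetical sample lines: document 1 and document 2 both contain term 7
getInfo("1\t7\t3\t9\t15")
getInfo("2\t7\t4")
print(dict(nestedDict["7"]))  # {'1': [3, 9, 15], '2': [4]}
```

Since the function only mutates nestedDict and never rebinds it, the global statement is not needed. The subtract-then-add step in the original loop cancels out and leaves currentPosition equal to int(tokens[i]), so appending the parsed token directly stores the same values.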