I have a human dictionary file, eng.dic, that looks like this
(imagine that the list contains close to a billion words). I have to run word queries against it quite often.
apple
pear
foo
bar
foo bar
dictionary
sentence
I have a string, let's say "foo-bar". Is there a better (more efficient) way of searching through that file to see whether it exists? If it exists, return a match; if it doesn't, append it to the dictionary file.
def check_or_add(query, path='en_dic'):
    # 'ra' is not a valid file mode: read first, append later if needed
    with open(path, 'r', encoding='utf8') as dic_file:
        # build a set for O(1) membership tests; normalise spaces to hyphens
        en_dic = {line.strip().replace(" ", "-") for line in dic_file}
    if query in en_dic:
        return 1
    # word is missing, so append it to the dictionary file
    with open(path, 'a', encoding='utf8') as dic_file:
        print(query, file=dic_file)
    return 0

check_or_add("foo-bar")
Are there any built-in search functions in Python, or any libraries I can import to run such searches without much overhead?
As already mentioned, scanning the whole file on every query is not a good idea once its size is significant. Instead, you should use an established solution:
Storing the data in a database is far more efficient than trying to reinvent the wheel. If you use SQLite, the database is itself just a single file, so the setup is minimal.
So again: store the words in an SQLite database, query it when you want to check whether a word exists, and insert the word when you add one.
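A minimal sketch of that approach using the standard-library `sqlite3` module (the database filename and table name here are illustrative, not from your setup):

```python
import sqlite3

# Hypothetical database file; SQLite creates it on first connect.
conn = sqlite3.connect("en_dic.sqlite")
# PRIMARY KEY gives an index, so lookups don't scan the whole table.
conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")

def exists_or_add(word):
    """Return True if `word` is already stored; otherwise insert it and return False."""
    cur = conn.execute("SELECT 1 FROM words WHERE word = ?", (word,))
    if cur.fetchone():
        return True
    conn.execute("INSERT INTO words (word) VALUES (?)", (word,))
    conn.commit()
    return False

print(exists_or_add("foo-bar"))  # False the first time: the word gets added
print(exists_or_add("foo-bar"))  # True afterwards: the indexed lookup hits
```

Because `word` is the primary key, each check is an index lookup rather than a linear scan, which is what makes this viable at the scale you describe.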
To read more on the solution see answers to this question: