I'm trying to retrieve the IDs for a specific set of keywords from PubMed using the following standard code:
    import os
    from Bio import Entrez
    from Bio import Medline

    # Defining keyword file
    keywords_file = "D:\keywords.txt"
    # Splitting keyword file into a list
    keyword_list = []
    keyword_list = open(keywords_file).read().split(',')
    #print keyword_list

    # Creating folders by keywords and creating a text file of the same keyword in each folder
    for item in keyword_list:
        create_file = item +'.txt.'
        path = r"D:\Thesis"+'\\'+item
        #print path
        if not os.path.exists(path):
            os.makedirs(path)
        #print os.getcwd()
        os.chdir(path)
        f = open(item+'.txt','a')
        f.close()

    # Using biopython to fetch ids of the keyword searches
    limit = 10
    def fetch_ids(keyword,limit):
        for item in keyword:
            print item
            print "Fetching search for "+item+"\n"
            #os.environ['http_proxy'] = '127.0.0.1:13828'
            Entrez.email = 'A.N.Other@example.com'
            search = Entrez.esearch(db='pubmed',retmax=limit,term = '"'+item+'"')
            print term
            result = Entrez.read(search)
            ids = result['IdList']
            #print len(ids)
        return ids

    print fetch_ids(keyword_list,limit)
    id_res = fetch_ids(keyword_list,limit)
    print id_res

    def write_ids_in_file(id_res):
        with open(item+'.txt','w') as temp_file:
            temp_file.write('\n'.join(ids))
            temp_file.close()

    write_ids_in_file(id_res)
In a nutshell, I'm trying to create a folder named after each keyword, create a text file inside that folder, fetch the IDs from PubMed through the code, and save the IDs in the text file. My program worked fine when I initially tested it; however, after a couple of tries it started throwing a "target machine actively refused connection" error. Some more details that could be useful:
header information
URL
host = '127.0.0.1:13828'
I know this question has been asked many times, usually with the answer that nothing is listening on the port. If that is my issue as well, how do I get the application to work on this specific port? I've already gone to my firewall settings and opened port 13828, but I'm not sure what to do beyond this. If that is not the issue, what could be a workaround?
Thanks!
You need to call search.close() after result = Entrez.read(search). See the official documentation here: http://biopython.org/DIST/docs/api/Bio.Entrez-module.html
Closing a port or dropping TCP connections when a client holds too many of them open is normal behavior for a public website.
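As a rough sketch of that advice, the search handle can be wrapped in contextlib.closing so it is closed even when parsing fails. This is written in Python 3 syntax, unlike the code in the question. The esearch and read parameters are hypothetical injection points that I added so the function can be tried without network access; by default they fall back to Entrez.esearch and Entrez.read. The search term in the usage note is just an example.

```python
from contextlib import closing


def fetch_ids(term, limit=10, esearch=None, read=None):
    """Return the PubMed IDs for one search term, always closing the handle.

    esearch and read are hypothetical hooks for testing; when omitted they
    resolve to Bio.Entrez.esearch and Bio.Entrez.read.
    """
    if esearch is None or read is None:
        from Bio import Entrez  # imported lazily; requires Biopython

        Entrez.email = "A.N.Other@example.com"  # placeholder address from the question
        esearch = esearch or Entrez.esearch
        read = read or Entrez.read

    # closing() calls handle.close() on exit, even if read() raises
    with closing(esearch(db="pubmed", retmax=limit, term='"' + term + '"')) as handle:
        result = read(handle)
    return list(result["IdList"])
```

Calling fetch_ids("asthma") once per keyword, instead of looping inside the function, also makes it easier to write each keyword's IDs to its own file.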