Best solution for partial string search with Pandas

I work with very large data sets (1.5 GB+) and run partial string searches on them.

I was able to write a script for this, but it takes too long:

fhand = open('C:/Users/promotor/Documents/tce-sagres/TCE-PB-SAGRES-Empenhos_Esfera_Municipal.txt','r')
pergunta = raw_input('Pesquisa: ')
fresult = open('resultado.csv','w')
for line in fhand :
    #linha = linha + 0.001 
    #update_progress(int(linha)*1000)
    if pergunta in line : 
        print line
        fresult.write(line)  
print "terminado."

Is there a faster way to do this with Pandas? I tried str.contains, but I could only search one column with it.

Best regards.

Solution

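You are iterating over the file line by line in a for loop, and that is probably what takes most of the time. I recommend reading the whole file into a single string and then using a regex to pull out the matching lines in one pass. Note that this needs enough free memory to hold the entire file.

Try the following code: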

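import re

# your_file_name is a placeholder for the path to your file
with open(your_file_name, 'r') as f:
    lines = f.read()  # the whole file as one string
name = input('pattern: ')
# With re.MULTILINE, ^ and $ match at every line, so the first and last lines are covered too.
# re.escape makes the input match literally (drop it to allow regex syntax in the input).
pattern_to_match = r'^.*%s.*$' % re.escape(name)
matched_pattern = re.findall(pattern_to_match, lines, re.IGNORECASE | re.MULTILINE)
print(matched_pattern)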
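
Since the question asks about Pandas and str.contains only acts on one column at a time, here is a minimal sketch of one way to apply it to every column and keep rows where any column matches. It is not a benchmarked solution and may well be slower than the regex above. The function name search_all_columns, the comma separator and the chunk size are assumptions for illustration; the actual delimiter of the file in the question is not known. Reading in chunks keeps memory use bounded instead of loading all 1.5 GB at once.

```python
import pandas as pd


def search_all_columns(path, term, sep=",", chunksize=100_000):
    """Return the rows in which any column contains `term` (case-insensitive).

    `sep` and `chunksize` are illustrative defaults, not values taken from
    the original question.
    """
    hits = []
    # dtype=str keeps every column as text so .str.contains works on all of them;
    # chunksize makes read_csv yield the file as a series of smaller DataFrames.
    for chunk in pd.read_csv(path, sep=sep, dtype=str, chunksize=chunksize):
        # Apply str.contains column by column, then flag rows with at least one hit.
        # regex=False does a plain substring test; na=False treats empty cells as no match.
        mask = chunk.apply(
            lambda col: col.str.contains(term, case=False, regex=False, na=False)
        ).any(axis=1)
        hits.append(chunk[mask])
    # Guard against a file that produced no chunks at all.
    return pd.concat(hits, ignore_index=True) if hits else pd.DataFrame()
```

A call would look like search_all_columns('my_data.txt', 'some text', sep='|'), where the file name and separator are hypothetical, and the result can be saved with .to_csv('resultado.csv', index=False).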