Input:
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
ID information2
Aa information2-1
Ba information2-2
Ca information2-3
Da information2-4
//
Expected output:
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
Result:
ID information1
ID information1
Aa information1-1
ID information1
Aa information1-1
Ba information1-2
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
Code:
word = 'Homo sapiens'
with open(input_file, 'r') as input, open(output_file, 'w') as output:
    list_block = []
    str_block = ""
    for line in input:
        if not ("//" in line):
            str_block += line
        elif "//" in line:
            if word in str_block:
                list_block.append(str_block)
                str_block = ""
        output.write(str_block)
I have an input file made up of blocks of information, each terminated by a double slash ('//'). I want to extract only the blocks that contain 'Homo sapiens'. When I parsed the data with the code above, I got the output shown under 'Result' instead of the expected output. Is there a way to do this with my code?
Because your blocks are delimited by '//', it is much easier to read the whole file and then split it on that delimiter. That gives you the list of blocks directly, and the rest is straightforward: write out each block that contains the search word, followed by the '//' that split() removed. Here is an example that produces the desired output.
word = 'Homo sapiens'
with open(input_file, 'r') as fi, open(output_file, 'w') as fo:
    for block in fi.read().split('//'):  # read file, split in blocks and iterate over them
        if word in block:
            fo.write(block)
            fo.write('//')
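If you would rather keep the line-by-line approach from the question (for example, because the file is too large to read in one go), something along these lines may work. This is only a sketch: the function name filter_blocks and its delimiter parameter are made up for illustration, and it assumes every block ends with a line containing nothing but '//'. The differences from the question's code are that a block is written only once, when its '//' line is reached, and that the buffer is reset after every block whether or not it matched.

```python
import io


def filter_blocks(lines, word, delimiter="//"):
    """Yield each delimiter-terminated block whose text contains `word`.

    `lines` is any iterable of lines (an open file works).
    Hypothetical helper, not part of any library.
    """
    block = []
    for line in lines:
        block.append(line)
        if line.strip() == delimiter:  # end of the current block
            text = "".join(block)
            if word in text:
                yield text
            block = []  # always reset, whether the block matched or not
    # Lines after the last delimiter (an unterminated block) are dropped.


# Small in-memory stand-in for the input file from the question.
sample = io.StringIO(
    "ID information1\n"
    "Aa information1-1\n"
    "Ba information1-2\n"
    "Ca Homo sapiens\n"
    "Da information1-4\n"
    "//\n"
    "ID information2\n"
    "Aa information2-1\n"
    "Ba information2-2\n"
    "Ca information2-3\n"
    "Da information2-4\n"
    "//\n"
)

result = "".join(filter_blocks(sample, "Homo sapiens"))
print(result, end="")
```

With real files, you could pass the open input file as lines and call fo.writelines(filter_blocks(fi, word)) on the output file, so only one block is held in memory at a time.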