python list readfile splice dna-sequence

Read a text file into Python, splitting it into list items on a set of characters

I have a plain text file with the following contents:

@M00964: XXXXX
YYY
+
ZZZZ 
@M00964: XXXXX
YYY
+
ZZZZ
@M00964: XXXXX
YYY
+
ZZZZ

and I would like to read this into a list, split into items at the ID code @M00964, i.e.:

['@M00964: XXXXX
YYY
+
ZZZZ' 
'@M00964: XXXXX
YYY
+
ZZZZ'
'@M00964: XXXXX
YYY
+
ZZZZ']

I have tried using

in_file = open(fileName,"r")
sequences = in_file.read().split('@M00964')[1:]
in_file.close()

but this removes the ID sequence @M00964. Is there any way to keep this ID sequence in?

As an additional question, is there any way of maintaining whitespace in a list (rather than having \n symbols)?

My overall aim is to read in this set of items, take the first 2, for example, and write them back to a text file maintaining all of the original formatting.

Solution

For your specific example, where every record is exactly four lines long, you could group the lines in fours:

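with open(fileName, 'r') as in_file:  # the with block closes the file automatically
    file = in_file.readlines()

# Join every group of four lines into one string; each string keeps its newline characters
new_list = [''.join(file[i*4:(i+1)*4]) for i in range(len(file) // 4)]
# The same records with the newline characters stripped out
list_no_n = [item.replace('\n', '') for item in new_list]

print(new_list)
print(list_no_n)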

[EXPANDED FORM]

new_list = []
# The lines are handled in groups of four, so there are len(file) // 4 groups
for i in range(len(file) // 4):
    # Join four lines together into one string and add it to new_list
    new_list.append(''.join(file[i*4:(i+1)*4]))
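If the records are not always four lines long, the original idea of splitting on the ID can still work: str.split drops the separator, but it can be put back onto each piece afterwards. The following is only a sketch; the helper name split_keep_id and the sample text are made up for illustration, and it assumes the ID string never appears anywhere except at the start of a record.

```python
def split_keep_id(text, record_id):
    """Split text into records that each start with record_id, keeping the ID."""
    # str.split removes the separator, so prepend it to every piece again.
    # The piece before the first ID (normally empty) is skipped with [1:].
    return [record_id + chunk for chunk in text.split(record_id)[1:]]


# Hypothetical sample data mirroring the layout in the question
sample = "@M00964: XXXXX\nYYY\n+\nZZZZ\n@M00964: XXXXX\nYYY\n+\nZZZZ\n"
records = split_keep_id(sample, "@M00964")
print(records)
```

The \n shown when the list is printed is only how Python displays a newline inside a string in a list. The strings themselves contain real line breaks, so ''.join(records) gives back the original text unchanged.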

[Writing to new file]

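# Each item in new_list still contains its '\n' characters, so write the items unchanged
# (use new_list[:2] to write only the first two records).
# filename should be a different path from the input file, or the input is overwritten.
output_file = open(filename, 'w')
output_file.writelines(new_list)
output_file.close()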
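
For the overall aim (take the first few records and write them back with the formatting untouched), the read and write steps could be combined as in the sketch below. The function name copy_first_records and its parameters are hypothetical, and it assumes, as above, that every record is exactly four lines long.

```python
from itertools import islice


def copy_first_records(in_path, out_path, n_records, lines_per_record=4):
    """Copy the first n_records records from in_path to out_path unchanged."""
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        # islice stops after the requested number of lines, so the rest
        # of the input file is never read into memory.
        dst.writelines(islice(src, n_records * lines_per_record))
```

Calling copy_first_records(fileName, 'first_two.txt', 2) would copy the first eight lines, where 'first_two.txt' is just an example output name. Because the lines are never split or joined, spacing and line breaks are carried over exactly as they appear in the input.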