python regex enumerate

How to use enumerate with regex (findall) in Python?


I have a text file as follows:

#onetwothree.txt
>one 
QWERTYUIOP
>two
ASDFGHJKL
>three
ZXCVBNM
...

I want to split that file into several files, as follows:

#one.txt
>one
QWERTYUIOP

and

#two.txt
>two
ASDFGHJKL

and

#three.txt
>three
ZXCVBNM

Here is the code I wrote:

import re
with open("onetwothree.txt") as file:
 name=re.findall(r'\>[^\n]+',file.read())
 sequence=re.findall(r'name[ind][^/n]+' for ind in enumerate(name), file.read())
          .
          .
          .

I know that there is something wrong in the following part:

sequence=re.findall(r'name[ind][^/n]+' for ind in enumerate(name), file.read())

I want to build a list using re.findall and enumerate. The following list is what I want to get:

>>>print (seq)
["QWERTYUIOP","ASDFGHJKL","ZXCVBNM"]

How can I fix the line sequence=re.findall(r'name[ind][^/n]+' for ind in enumerate(name), file.read()) so that it works?


Solution

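  • First of all, you can't read a file twice using read(). The first call consumes the whole file, so the second call returns an empty string.

    Also, I think you have misunderstood re.findall. It takes a pattern and a string (plus optional flags), not a generator expression. Note too that r'name[ind]' is a literal piece of regex text, not a lookup into your name list, and that [^/n] excludes the characters "/" and "n" rather than the newline \n.

    You can accomplish the task in one go, without calling findall twice.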

    import re

    s = '''>one 
    QWERTYUIOP
    >two
    ASDFGHJKL
    >three
    ZXCVBNM
    ''' # replace this with file.read()

    # Each parenthesized part of the regex is a capture group,
    # so findall returns one (name, sequence) tuple per record.
    res = re.findall(r">([^\n]+)\n(\w+)", s)
    print(res)
    # [('one ', 'QWERTYUIOP'), ('two', 'ASDFGHJKL'), ('three', 'ZXCVBNM')]
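To get from those tuples to the seq list the question asks for, and to the one-file-per-record split, something along these lines might work. It is only a sketch: parse_records and split_records are hypothetical helper names, not part of re, and it assumes each sequence sits on a single line of word characters and that each header name is safe to use as a file name.

```python
import re
from pathlib import Path

# A ">name" header line followed by one line of sequence characters.
RECORD = re.compile(r">([^\n]+)\n(\w+)")


def parse_records(text):
    """Return (name, sequence) pairs, trimming stray whitespace from names."""
    return [(name.strip(), seq) for name, seq in RECORD.findall(text)]


def split_records(text, out_dir="."):
    """Write each record to <out_dir>/<name>.txt and return the paths written.

    Hypothetical helper: assumes every name is a valid file name.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, seq in parse_records(text):
        path = out / f"{name}.txt"
        path.write_text(f">{name}\n{seq}\n")
        paths.append(path)
    return paths


# Sample data from the question; for the real file, read it exactly once,
# e.g. text = Path("onetwothree.txt").read_text()
text = ">one \nQWERTYUIOP\n>two\nASDFGHJKL\n>three\nZXCVBNM\n"

records = parse_records(text)
seq = [sequence for _, sequence in records]
print(seq)  # ['QWERTYUIOP', 'ASDFGHJKL', 'ZXCVBNM']

# enumerate is only needed if you also want a running index per record.
for index, (name, sequence) in enumerate(records, start=1):
    print(index, name, sequence)
```

Calling split_records(text) would then write one.txt, two.txt and three.txt into the current directory. Reading the file once and reusing the resulting string avoids the empty second read() described above.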