Tags: python, split, nlp, tokenize, readlines

How to split the lines from readlines() and save the parts in different lists?


This is my code:

with open('file.txt', 'r') as source:
    # Indentation
    polTerm = [line.strip().split()[0] for line in source.readlines()]
    polFreq = [int(line.strip().split()[1]) for line in source.readlines()]

This is the content of file.txt:

anak 1
aset 3
atas 1
bangun 1
bank 9
benar 1
bentuk 1

I got polTerm just the way I wanted:

['anak', 'aset', 'atas', 'bangun', 'bank', 'benar', 'bentuk']

But for polFreq, instead of this:

['1', '3', '1', '1', '9', '1', '1']

what I got is an empty list like this:

[ ]

Does anyone know why this happens, and how to fix it so I get the result I want?


Solution

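  • As Carcigenicate said, the first .readlines() call uses up the file. .readlines() is a method of the file object: it reads from the current position to the end of the file and returns a list of lines. After the first call the file position is at the end, so the second call has nothing left to read and returns an empty list. Save the list in a variable and reuse it instead of calling .readlines() twice. What you want is this: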

    with open("file.txt","r") as inf:
        # readlines() reads to the end of the file and returns a list.
        # Keeping that list in a variable lets you reuse it as often
        # as you like, even after the with block has closed the file.
        raw = inf.readlines()
    
    polTerm = [line.strip().split()[0] for line in raw]
    polFreq = [int(line.strip().split()[1]) for line in raw]
    

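    Pro tip: Learn to use pandas, specifically pd.read_csv(). It can load whitespace-separated columns like these into a table, for example with sep=r"\s+" and header=None.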
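
The behaviour described above can be reproduced without a file on disk. The following is only a minimal sketch: io.StringIO stands in for open('file.txt'), and the names sample, pol_term and pol_freq are made up for illustration. The first part shows the second readlines() call coming back empty (and seek(0) rewinding the file); the second part fills both lists in a single pass over the lines.

```python
import io

# Assumption: io.StringIO stands in for open('file.txt', 'r');
# it behaves like a text file object for this purpose.
sample = "anak 1\naset 3\natas 1\nbangun 1\nbank 9\nbenar 1\nbentuk 1\n"

with io.StringIO(sample) as source:
    first = source.readlines()   # reads everything; position is now at the end
    second = source.readlines()  # nothing left to read, so this is []
    source.seek(0)               # rewind to the start of the file
    third = source.readlines()   # the lines can be read again

print(len(first), second, len(third))  # 7 [] 7

# Single pass: split each line once and fill both lists together.
pol_term, pol_freq = [], []
with io.StringIO(sample) as source:
    for line in source:
        parts = line.split()
        if len(parts) != 2:      # skip blank or malformed lines
            continue
        pol_term.append(parts[0])
        pol_freq.append(int(parts[1]))

print(pol_term)  # ['anak', 'aset', 'atas', 'bangun', 'bank', 'benar', 'bentuk']
print(pol_freq)  # [1, 3, 1, 1, 9, 1, 1]
```

Note that int() produces integers such as 1 and 3. If strings like '1' and '3' are what you want, as in the question, append parts[1] without the int() call.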