How to extract data from a list with additional spaces in between them in Python


The code tries to extract the group and team from each line of a file (format: group, team, val1, val2). The results are correct for lines with no additional spaces, but wrong for lines whose names contain additional spaces.

data = {}
with open('source.txt') as f:
    for line in f:
        print ("this is the line data: ", line)
        
        needed = line.split()[0:2]
        print ("this is what i need: ", needed)

source.txt #-- format: group, team, val1, val2

alpha diehard group 1 54,00.01
bravo nevermindteam 3 500,000.00
charlie team ultimatum 1 27,722.29 ($250.45)
charlie team ultimatum 10 252,336,733.383 ($492.06)
delta beyond-imagination 2 11 ($10)
echo double doubt 5 143,299.00 ($101)
echo double doubt 8 145,300 ($125.01)
falcon revengers 3 0.1234
falcon revengers 5 9.19
lima almost done 6 45.00181 ($38.9)
romeo ontheway home 12 980

I want to extract only the values before val1. #-- group, team

alpha diehard group
bravo nevermindteam
charlie team ultimatum
delta beyond-imagination
echo double doubt
falcon revengers
lima almost done
romeo ontheway home

Solution

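  • Use a regular expression to capture everything before the first digit (the start of val1), then split that text on whitespace. split() with no arguments ignores runs of extra spaces.

    import re  # standard-library module

    with open('source.txt') as f:
        for line in f:
            # Lazily capture everything up to the first digit on the line
            found = re.search(r"(.*?)\d", line)
            if found is None:
                continue  # skip blank lines or lines with no number
            needed = found.group(1).split()
            print(needed)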
    

    Output:

    ['alpha', 'diehard', 'group']
    ['bravo', 'nevermindteam']
    ['charlie', 'team', 'ultimatum']
    ['charlie', 'team', 'ultimatum']
    ['delta', 'beyond-imagination']
    ['echo', 'double', 'doubt']
    ['echo', 'double', 'doubt']
    ['falcon', 'revengers']
    ['falcon', 'revengers']
    ['lima', 'almost', 'done']
    ['romeo', 'ontheway', 'home']
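
The output above still repeats groups and returns lists, while the question asks for each "group team" string once. A minimal sketch of one way to get there, assuming val1 is always the first whitespace-separated token that starts with a digit; the helper name extract_names and the sample lines (with extra spaces added) are illustrative and not part of the original answer:

```python
import re


# Hypothetical helper (not from the original answer): returns each distinct
# "group team" prefix once, in first-seen order.
def extract_names(lines):
    names = []
    for line in lines:
        # Lazily capture everything before the first whitespace + digit.
        match = re.match(r"\s*(.*?)\s+\d", line)
        if match is None:
            continue  # blank line, or no numeric column found
        # split() + join collapses any run of extra spaces or tabs.
        names.append(" ".join(match.group(1).split()))
    # dict.fromkeys keeps insertion order (Python 3.7+) and drops repeats.
    return list(dict.fromkeys(names))


# Sample lines adapted from source.txt, with extra spaces inserted.
sample = [
    "alpha diehard group 1 54,00.01",
    "bravo   nevermindteam 3 500,000.00",
    "charlie team  ultimatum 1 27,722.29 ($250.45)",
    "charlie team ultimatum 10 252,336,733.383 ($492.06)",
    "delta beyond-imagination 2 11 ($10)",
]
for name in extract_names(sample):
    print(name)
```

To run it on the real file, pass the open file object instead of the list: extract_names(f). Note the assumption cuts both ways: a team name that itself contains a standalone number (for example "team 7") would be truncated at that number.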