Trying to get rid of trailing spaces in a Python regex capture


I have a CSV file that I am processing with a Python regex. I need to extract values from the final field of each row, but I am having trouble getting it right on regex101.com (fantastic page, by the way).

A couple of example rows:

,11/12/2017,00-87-67 34849444,-27.00,ITEMRECEIVED,H2G2                   929613292012071217 REF
,02/01/2018,00-87-68 58493922,-1110.79,ITEMSENT,MIL P01  WOOLLIES     9221234545         DEG

I need to capture the parts of the final "MEMO" field with this regex:

(?:[^\,]*\,){5}(?P<CompanyName>[^\s].*)\s{4,19}(?P<Reference>\S{1,18})\s{1,11}(?P<Type>\w{3})

What I am getting is:

CompanyName           Reference             Type
-----------           -----------           ----
'H2G2               ' '929613292012071217'  'REF'
'MIL P01  WOOLLIES  ' '9221234545'          'DEG'

It doesn't seem much, but how can I get the regex to trim the trailing spaces in the CompanyName, so that I get the following instead, please?

CompanyName         Reference             Type
-----------         -----------           ----
'H2G2'              '929613292012071217'  'REF'
'MIL P01  WOOLLIES' '9221234545'          'DEG'

Thanks in advance,

QuietLeni


Solution

  • Change your regex to:

    (?:[^\,]*\,){5}(?P<CompanyName>[^\s].*\S)\s{4,19}(?P<Reference>\S{1,18})\s{1,11}(?P<Type>\w{3})
    

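    Adding \S at the end of the CompanyName group forces the captured text to end on a non-whitespace character. The greedy .* can then no longer swallow the padding, so all of the trailing spaces are left for the following \s{4,19} to consume. Note that [^\s].*\S needs at least two characters, so a one-character company name would not match.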
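
As a quick check outside regex101, here is a minimal sketch that runs both the pattern from the question and the changed one against the two sample rows with Python's `re` module. The helper name `parse_memo` is made up for illustration, and the padding widths (19, 5 and 9 spaces) are my reading of the sample rows, written as `" " * n` so they are explicit; adjust them if your data differs.

```python
import re

# Pattern from the question: the greedy .* keeps most of the padding.
ORIGINAL = re.compile(
    r"(?:[^\,]*\,){5}(?P<CompanyName>[^\s].*)\s{4,19}"
    r"(?P<Reference>\S{1,18})\s{1,11}(?P<Type>\w{3})"
)

# Changed pattern: CompanyName must now end on a non-whitespace character.
FIXED = re.compile(
    r"(?:[^\,]*\,){5}(?P<CompanyName>[^\s].*\S)\s{4,19}"
    r"(?P<Reference>\S{1,18})\s{1,11}(?P<Type>\w{3})"
)

# Sample rows from the question; gap widths are assumptions (see above).
rows = [
    ",11/12/2017,00-87-67 34849444,-27.00,ITEMRECEIVED,"
    + "H2G2" + " " * 19 + "929613292012071217" + " " + "REF",
    ",02/01/2018,00-87-68 58493922,-1110.79,ITEMSENT,"
    + "MIL P01  WOOLLIES" + " " * 5 + "9221234545" + " " * 9 + "DEG",
]


def parse_memo(row, pattern=FIXED):
    """Return the three named groups as a dict, or None if the row does not match."""
    m = pattern.match(row)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for row in rows:
        before = parse_memo(row, ORIGINAL)
        after = parse_memo(row, FIXED)
        # repr() makes any trailing spaces in the captured name visible.
        print(repr(before["CompanyName"]), "->", repr(after["CompanyName"]))
```

If changing the pattern is not an option, calling `.rstrip()` on the captured CompanyName after matching should give the same result.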