python regex regex-lookarounds

How to extract the first floating-point number appearing after a word?


I'm building a text-extraction application, but I have not been able to extract the exact price from the text.

I have text like this:

string1 = 'Friscos #8603\n8100 E. Orchard Road\nGreenwood Village, Colorado 80111\n2013-11-02\nTable 00\nGuest\n1 Oysters 1/2 Shell #1\n1 Crab Cake\n1 Filet 1602 Bone In\n1 Ribeye 22oz Bone In\n1 Asparagus\n1 Potato Au Gratin\n$17.00\n$19.00\n$66.00\n$53.00\n$12.00\n$11.50\nSub Total\nTax\n$178.50\n$12.94\nTotal\n$191.44\n'
string2 = 'Berghotel\nGrosse Scheidegg\n3818 Grindelwald\nFamilie R. Müller\nRech. Nr. 4572\nBar\n30.07.2007/13:29:17\nTisch 7/01\nNM\n#ರ\n2xLatte Macchiato à 4.50 CHF\n1xGloki\nà 5.00 CHF\n1xSchweinschnitzel à 22.00 CHF\n1xChässpätzli à 18.50 CHF\n#ರ #ರ #1ರ\n5.00\n22.00\n18.50\nTotal:\nCHF\n54.50\nIncl. 7.6% MwSt\n54.50 CHF:\n3.85\nEntspricht in Euro 36.33 EUR\nEs bediente Sie: Ursula\nMwSt Nr. : 430 234\nTel.: 033 853 67 16\nFax.: 033 853 67 19\nE-mail: [email protected]\n'

I want to extract the price that appears after the word Total using a regex, but I was only able to extract all floating-point numbers. Note that words such as Sub Total may also appear, but I only need the price that follows the word Total. Other words may also appear between Total and the price. So the regex should match the word Total and extract the first floating-point number that follows it.

Any help is appreciated.

This is what I've tried,

import re
re.findall(r"\d+\.\d+", string1)  # returns every floating-point number, not just the total

Solution

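  • You can try

    (?<=\\nTotal)\:?\D+([\d\.]+)

    The lookbehind (?<=\\nTotal) requires Total to come right after a newline, so Sub Total is skipped. \:?\D+ then skips an optional colon and any non-digit characters (newlines, currency symbols, words such as CHF), and the capture group returns the first number that follows. The doubled backslash in \\n suits a regular Python string literal, where it reaches the regex engine as \n; in a raw string, write \n with a single backslash.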
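To check the pattern against the question's data, here is a minimal sketch that applies it from Python. It uses the raw-string form of the pattern (single backslash in \n), and the names TOTAL_PATTERN, extract_total, receipt1 and receipt2 are hypothetical, introduced only for illustration. The two inputs are trimmed excerpts of string1 and string2, not the full strings.

```python
import re

# Raw-string form of the answer's pattern: \n is interpreted by the regex
# engine as a real newline, so "Total" must start a line.
TOTAL_PATTERN = re.compile(r"(?<=\nTotal):?\D+([\d.]+)")


def extract_total(text):
    """Return the first number after a line-initial 'Total', or None."""
    match = TOTAL_PATTERN.search(text)
    return match.group(1) if match else None


# Trimmed excerpts of string1 and string2 from the question (assumption:
# the shortened text keeps the same layout around "Total").
receipt1 = "1 Potato Au Gratin\n$17.00\n$11.50\nSub Total\nTax\n$178.50\n$12.94\nTotal\n$191.44\n"
receipt2 = "18.50\nTotal:\nCHF\n54.50\nIncl. 7.6% MwSt\n54.50 CHF:\n3.85\n"

print(extract_total(receipt1))  # 191.44
print(extract_total(receipt2))  # 54.50
```

The match is case-sensitive and returns the number as a string. If receipts may contain total or TOTAL, passing re.IGNORECASE to re.compile is one option, and float() can convert the result when arithmetic is needed.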