Search code examples
powerquerym

Extract decimal numbers from text column


I am looking to extract the numbers of hours in a column TXT however PQ is unable to find a pattern due to the inconsistency in the structure of text. I have used Columns by Example but hasnt helped.

Is there a M code or combination of M code I can use?

Sample data:

TXT
Contract Staff w/e 26.06.21- Carer 9.5hrs- MNL
Contract staff w/e02.07.21 - Physio- 19.34hrs- ARK
Contract Staff w/e 04.07.21 - RN 13.25- MNL
Contract Staff w/ e 04.07.21 - carer 6- MNL
Contract Staff w/e25.06.21 - carer 12.5 - KLTL
Contract Staff w/e04.07.21 - RN 34hrs- KLTL
Contract Staff w/e04.07.21 - AIN 25.5hrs- KLTL
Contract Staff w/e26.06.21- Carer - 6hrs- MNL
Contract Staff w/e11.07.21 - 6hrs- MNL
Contract Staff wie 24.06.21 - Carer 8hrs - ARK
Contract Staff w/e 16.06.21 - EN 5.50- GL
Contract Staff w/e 16.06.21 - RN 5.25- GL
Contract Staff w/e 11.07.21 - RN 22hrs- MNL
Contract Staff w/e 11.07.21 - carer 27.75- MNL
Contract Staff w/e04.07.21 - RN 22.25hrs- KLTL
Contract Staff w/e04.07.21 - AIN 69.67 - KLTL
Contract Staff w/e04.07.21 - RN 5.75- KLTL
Contract Staff w/e10.07.21 - RN 16hrs- KLTL
Contract Staff w/e10.07.21- Carer 6hrs- KLTL
Contract Staff w/e11.07.21 - AIN 38.50- KLTL
Contract Staff w/e18.07.21 - RN 46-KLTL
Contract Staff w/e18.07.21 - AIN 17 -KLTL
Contract Staff w/e18.07.21 - Cleaner 24.50 -KLTL
Contract Staff w/e18.07.21 - AIN 19.5- MNL
contract staff w/16.07.21 - RN23.25hrs - MNL
contract staff WIE 25.07.21 - carer - 42.25hrs- MNL
contract staff w/E 18.07.21 - AIN 24.5 - KLTL
contract staff WIE 18.07.21 - Domestic 6- KLTL
Contract Staff w/e 13.07.21 - RN 6.25hrs- KLTL
Contract Staff w/e25.07.21 - RN 19.5- KLTL
Contract Staff w/e25.07.21 - AIN 13.5- KLTL
Contract Staff w/e18.07.21 - AIN 6hrs- ARK

Solution

  • You don't need regex for this:

    1. Add column, extracting text after "-"
    2. Remove all letters and remaining "-"
    3. Convert text to number
    #"Inserted Text After Delimiter" = Table.AddColumn(
        Source, 
        "hours", 
        each Text.AfterDelimiter([TXT], "-"), type text
    ),
    #"Remove Letters" = Table.TransformColumns(
        #"Inserted Text After Delimiter", 
        {{"hours", each Text.Remove(_, {"A".."z", "-"})}}
    ),
    #"Changed Type to number" = Table.TransformColumnTypes(
        #"Remove Letters",
        {{"hours", type number}}
    )