Search code examples
regexgoogle-sheetssplitgoogle-sheets-formulare2

SUM multiple values after a substring within all cells in a column in Google Sheets


For an open source chat analyser in Google Sheets, I need to extract all numeric values after a substring (Example), then total them.

For example, if a cell contains Example1 another text 123 Example500 text, Example1 and Example500 should be extracted out, and their numeric values summed to 501.

This is complicated further by needing to obtain the total for a column of messages.


What I've tried already:

  • =REGEXEXTRACT(A1, "Example(\d+)"): This only extracts the first matching value, but works!
  • =SUM(SPLIT(A1, "Example")): This works for messages that only include my target string, but falls apart when other strings are included. The output could possibly be filtered to results that start with a number, but this is very messy and possibly a red herring.
  • CONCATENATEing all my cells together, then searching for numbers. This is error-prone due to additional numbers within messages.

Solution

  • try:

    =LAMBDA(x, REGEXEXTRACT(A1, "(\w+)\d+")&
     SUMPRODUCT(IF(IFERROR(REGEXMATCH(x, "\w+\d+")), 
     REGEXEXTRACT(x, "\w+(\d+)"), )))(SPLIT(A1, " "))
    

    enter image description here

    update 1:

    =LAMBDA(x, REGEXEXTRACT(A1, "(\D+)\d+")&
     SUMPRODUCT(IF(IFERROR(REGEXMATCH(x, "\D+\d+")), 
     REGEXEXTRACT(x, "\D+(\d+)"), )))(SPLIT(A1, " "))
    

    enter image description here


    update 2:

    =INDEX(LAMBDA(xx, REGEXEXTRACT(xx, "(\D+)\d+")&
     BYROW(LAMBDA(x, IF(IFERROR(REGEXMATCH(x, "\D+\d+")), 
     REGEXEXTRACT(x, "\D+(\d+)"), ))(SPLIT(xx, " ")), LAMBDA(x, SUMPRODUCT(x))))
     (A1:INDEX(A:A, MAX((A:A<>"")*ROW(A:A)))))
    

    if you start from A2 just change A1: to A2:

    enter image description here