I am trying to find a way to use an ARRAYFORMULA and REGEXEXMATCH to parse entries in a column which are downloaded from a Tennis League website into a Google Spreadsheet.
Typically, with Fred Budde (56) / Warren Girle (55)7-6, 7-5Dan McClure (54) / Brian Giesecke (68)
in cell A1, I am looking to copy:-
Fred Budde (56)
into column C row 1,Warren Girle (55)
into column D row 1,7-6, 7-5
into column E row 1,Dan McClure (54)
into column F row 1 andBrian Giesecke (68)
into column G row 1.Of course results could be 2-6,6-2,1-0[11-9]
or 6-2,7-6[10-8]
.
I eventually came up with:-
=ARRAYFORMULA(REGEXEXTRACT($A$2:$A$3,"^(.*?) \((\d+)\) \/ (.*?) \((\d+)\)(.*?)(.*?) \((\d+)\) \/ (.*?) \((\d+)\)"))
but this does not work as it should, separating the first players age from his name, and the others players and result incorrectly too. So I tried using a separate formula in each column with :
=ArrayFormula(IFERROR(REGEXEXTRACT(A1:A, "(\d+-\d+,\s\d+-\d+|\d+-\d+,\s\d+-\d+,\s\d+-\d+|\d+-\d+,\s\d+-\d+,\s\d+-\d+,\s\d+-\d+)")))
where |
represents the 'or' and the two sets results are abstracted - but no others!
If any player names are omitted then the formula should report an 'Error' in column H. If there is an incomplete result, an 'Error' should be reported also in column H.
Would love some help from someone who knows what they are doing!
Use let()
to build the regex in a piecemeal fashion so that the process is easier to manage:
=let(
range, A1:A,
player, "([\w ]+? \(\d+\))",
team, player & " \/ " & player,
result, "([-, \d[\]]+)",
regex, team & result & team,
extract_, lambda(s, regexextract(s, regex)),
iserror_, lambda(s, if(iserror(s), "Error", iferror(1/0))),
data, map(range, extract_),
errors, map(choosecols(data, 1), iserror_),
{ iferror(data), errors }
)
See let(), lambda() and map().
To do the same with an array formula, use this regex:
=arrayformula( regexextract(
A1:A,
"^([\w ]+? \(\d+\)) \/ ([\w ]+? \(\d+\))([-, \d[\]]+)([\w ]+? \(\d+\)) \/ ([\w ]+? \(\d+\))"
) )