Regex: extract the first all-caps word from lines containing dollar signs


I have printouts with hundreds of lines, some containing stock symbols in CAPS that I'd like to extract, e.g.

STOCKS OPTIONS SYMBOL GROUPS WORKING
$14,489.60
$14,489.60 Mark WMT D
72%
($24.00)
$45.00 ($153.00) T
2 opt
$500.00 MSFT
100 Sha

I'd like to extract: WMT, T, MSFT
using an online regex tester such as https://regexr.com/
I spent hours trying expressions such as the following, but have had no luck extracting just the symbols and none of the other text:
$.+[A-Z]\w\s


Solution

  • You didn't specify a programming language, so I'll assume PCRE:

    regex

    ^.*\d+.*?\K\b[A-Z]+\b
    

    data

    STOCKS OPTIONS SYMBOL GROUPS WORKING
    $14,489.60
    $14,489.60 Mark WMT D
    72%
    ($24.00)
    $45.00 ($153.00) T
    2 opt
    $500.00 MSFT
    100 Sha
    

    The extracted matches are WMT, T, and MSFT.

    https://regex101.com/r/N2shwC/1

    In English:

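    On every line that contains a digit, match a run of capital letters bounded by word boundaries that comes after the digits. Because the leading .* is greedy, the search starts after the last digit on the line. \K discards everything matched before it, so only the symbol itself is returned. The multiline flag must be on so that ^ matches at the start of each line.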
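
If you are not on PCRE, the same idea should carry over with a small change. As a rough sketch, assuming Python: its built-in re module does not support \K, so a capture group takes its place here. The names extract_symbols, SYMBOL_RE and SAMPLE are made up for illustration and are not part of the original answer.

```python
import re

# Same idea as the PCRE pattern above, with a capture group standing in for \K
# (Python's built-in re module does not support \K).
# re.MULTILINE makes ^ match at the start of every line.
SYMBOL_RE = re.compile(r"^.*\d+.*?\b([A-Z]+)\b", re.MULTILINE)

# Sample data copied from the question.
SAMPLE = """\
STOCKS OPTIONS SYMBOL GROUPS WORKING
$14,489.60
$14,489.60 Mark WMT D
72%
($24.00)
$45.00 ($153.00) T
2 opt
$500.00 MSFT
100 Sha
"""


def extract_symbols(text):
    """Return the captured all-caps word from each line that contains a digit."""
    return SYMBOL_RE.findall(text)


if __name__ == "__main__":
    print(extract_symbols(SAMPLE))  # ['WMT', 'T', 'MSFT']
```

Because the pattern has exactly one group, findall returns just the captured symbols rather than the whole matched line. Lines without a digit, or without a standalone all-caps word after the digits (such as "100 Sha"), produce no match.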