Tags: regex, keyboard-maestro

Regex Find and Replace: Deleting everything except words starting with "#"?


I'm a novice at regex and can't find an easy way to do this. I'd like to delete every word that doesn't start with # and put a comma between the remaining ones. For example, if I have:

Cookie Recipe for n00bs
#cookie #recipe #chocolate To do this you have to etc...
Bla bla bla mumbo jumbo

I'd like to get as a result:

cookie, recipe, chocolate

If you could help me it'd be great, thanks and have a good day!


Solution

  • You didn't say which programming language you are using. Here is an example in PHP, which uses Perl-compatible regular expressions:

    $text = <<<EOF
    Cookie Recipe for n00bs
    #cookie #recipe #chocolate To do this you have to etc...
    Bla bla bla mumbo jumbo
    EOF;
    
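    // (?<=#) requires a "#" right before the word without including it in the match.
    $pattern = '/((?<=#)\w+)/';
    preg_match_all($pattern, $text, $matches);
    
    // $matches[0] holds the full matches; join them with ", ".
    echo implode(', ', $matches[0]);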
    

    I'm using a so-called positive lookbehind assertion, (?<=#), which ensures that only words preceded by a # are matched but, and this is important, does not include the # itself in the match. After the lookbehind, \w+ matches as many word characters as possible.

    After that, implode() joins the resulting matches with ", ". The regex only finds the words; it can't join them, so that step is done in PHP.

    You can see how this regex works at Regex101.com.
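To make the role of the lookbehind more concrete, here is a minimal sketch that puts the answer's approach next to a common alternative: matching the # and capturing only the word after it. The function names hashtags_lookbehind() and hashtags_capture() are hypothetical and chosen for illustration; the capture-group variant is not part of the original answer, and the outer group from the answer's pattern is dropped here because only the full match is used.

```php
<?php
// Hypothetical helper names; only preg_match_all() and implode() come from the answer.

// Variant 1: the lookbehind from the answer. The "#" must precede the word
// but is not part of the match, so the full matches are already clean.
function hashtags_lookbehind(string $text): array
{
    preg_match_all('/(?<=#)\w+/', $text, $matches);
    return $matches[0]; // full matches
}

// Variant 2 (an alternative, not from the answer): consume the "#" and
// capture the word, then read the capture group instead of the full match.
function hashtags_capture(string $text): array
{
    preg_match_all('/#(\w+)/', $text, $matches);
    return $matches[1]; // contents of capture group 1
}

$text = "Cookie Recipe for n00bs\n"
      . "#cookie #recipe #chocolate To do this you have to etc...\n"
      . "Bla bla bla mumbo jumbo";

echo implode(', ', hashtags_lookbehind($text)), "\n"; // cookie, recipe, chocolate
echo implode(', ', hashtags_capture($text)), "\n";    // cookie, recipe, chocolate
```

For this input both variants give the same list. The difference is only where the clean word ends up: in $matches[0] with the lookbehind, in $matches[1] with the capture group.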