regex, conditional-statements, bbedit

Regex to find all words which contain both 'd' and 'n' where other letters are from a to n


To help learners of Braille, I want to filter a list of words to find only those that contain both of the letters 'd' and 'n'. I am using the regex engine in BBEdit 10.5.13. I have a file containing a list of words, one word per line.

Here is a regex which matches every line, which is of course not what I want.

\w*?(d)?(n)?(?(1)\w*?n|(?(2)\w*?d))\w*

The logic that I imagine is:

\w*?   Match all the letters before the first 'd' or 'n', if there are any
(d)?   If there is a 'd' before the first 'n', capture it
(n)?   If there is an 'n' before the first 'd', capture it
(?(1)  If a 'd' was captured...
\w*?n  ... then match all characters up to the first 'n'
|(?(2) Else if an 'n' was captured...
\w*?d  ... then match all characters up to the first 'd'
))\w*  Continue the match until the end of the word

Obviously, my logic and the logic of my regex are different, since this matches every word whether it contains a 'd' or an 'n' or not. Any help with correcting my logic will be greatly appreciated.
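
One way to see why the pattern matches every word is to inspect the capture groups after a match. Every piece of the pattern is optional: `\w*?` can match nothing, both `(d)?` and `(n)?` can be skipped, and with neither group set both conditionals match the empty string, so the final `\w*` is free to consume the whole word. The sketch below uses Python's `re` module rather than BBEdit, on the assumption that its conditional syntax behaves the same way for this pattern; the helper name `explain` is made up for illustration.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"\w*?(d)?(n)?(?(1)\w*?n|(?(2)\w*?d))\w*")

def explain(word):
    """Return (matched, group 1, group 2) for a whole-word match attempt."""
    m = ORIGINAL.fullmatch(word)
    if m is None:
        return (False, None, None)
    return (True, m.group(1), m.group(2))

for word in ["bale", "bam", "bald", "band"]:
    # Each word matches with both groups unset: the engine never had to
    # capture a 'd' or an 'n' to reach the trailing \w*.
    print(word, explain(word))
```

Both groups come back as `None` for each of these words, which shows that the conditionals never forced a 'd' or an 'n' to be present.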

Here's a short extract from the list, containing the 2 desired matches: "balding" and "band".

bald
balding
bale
baling
balk
balked
balking
balm
bam
ban
band
bane

Solution

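  • This matches only words made up entirely of the letters a to n that contain both a 'd' and an 'n', in either order. The first alternative handles a 'd' followed later by an 'n'; the second handles an 'n' followed later by a 'd'.

    ^([a-nA-N]*[Dd][a-nA-N]*[Nn][a-nA-N]*|[a-nA-N]*[Nn][a-nA-N]*[Dd][a-nA-N]*)$
    # Or, for lowercase only:
    ^([a-n]*d[a-n]*n[a-n]*|[a-n]*n[a-n]*d[a-n]*)$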
    

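
To check the lowercase pattern outside BBEdit, here is a minimal sketch in Python that runs it over the extract from the question. The names `PATTERN`, `SAMPLE`, and `filter_words` are illustrative, and Python's `re` module is assumed to treat this simple pattern the same way BBEdit's engine does.

```python
import re

# Lowercase-only form of the pattern: letters a-n only, with a 'd' and an
# 'n' in either order.
PATTERN = re.compile(r"(?:[a-n]*d[a-n]*n[a-n]*|[a-n]*n[a-n]*d[a-n]*)")

# The word list extract from the question.
SAMPLE = [
    "bald", "balding", "bale", "baling", "balk", "balked",
    "balking", "balm", "bam", "ban", "band", "bane",
]

def filter_words(words):
    """Keep only the words that the pattern matches in full."""
    # fullmatch anchors at both ends, so ^ and $ are not needed here.
    return [w for w in words if PATTERN.fullmatch(w)]

print(filter_words(SAMPLE))  # ['balding', 'band']
```

In BBEdit itself the `^` and `$` anchors from the answer are still needed, because the search runs against the whole file rather than one word at a time.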