
procmail regex: filter mails containing a list of specific word patterns


Is it possible to apply a regex in procmail that filters for specific word patterns? For example, I could do this with a normal regex:

/(?=.*dolor)(?=.*ipsum)(?=.*sit)/s

This would match the text below, whereas this one wouldn't:

/(?=.*money)(?=.*ipsum)(?=.*sit)/s

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

I would like to adapt this for procmail use, and even extend it so that instead of searching only for "money" it would also match "mOney", "möney", "móney" and so on.

Is it possible? If so, how?


Solution

  • Yes, it is possible. Let me show you how.

    Your regex checks whether the words dolor, ipsum and sit all appear, in any order, somewhere within the text. The following procmail recipe does the same:

              :0 B
              * -2^0
              * 1^0  \<dolor\>
              * 1^0  \<ipsum\>
              * 1^0  \<sit\>
              action_dolor_ipsum_sit
    

    The first condition contains an empty regular expression which, because it always matches, gives the score a negative offset of -2. Each of the following conditions adds 1 to the score if its word occurs at least once, regardless of how often it occurs (with an exponent of 0, only the first match counts). The final score is therefore positive, and triggers the action, only if the text contains all 3 words: -2 + 3 = 1. With only two of them the score is 0, which is not positive, so the recipe does not fire.

    To add more keywords, you could either add more conditions (and make the negative offset one larger in magnitude for each additional required word) or extend an existing condition with alternatives, e.g. like this:

              * 1^0   \<(mOney|möney|móney)\>
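
    As a rough sketch of the first option, the recipe below requires a fourth word in addition to the original three. It assumes the same scoring scheme as above, so the offset becomes -3 (one less than the number of required words). The action name is only a placeholder, and the spelling variants are just the ones from the question. procmail ignores case unless the D flag is set, so a plain "money" should already cover "mOney".

    ```procmail
    # Sketch only: four required words, so the offset is -(4 - 1) = -3.
    # The last line is a hypothetical placeholder for the real action.
    :0 B
    * -3^0
    * 1^0  \<dolor\>
    * 1^0  \<ipsum\>
    * 1^0  \<sit\>
    * 1^0  \<(money|möney|móney)\>
    action_dolor_ipsum_sit_money
    ```

    The accented variants are written out as alternatives rather than as a bracket expression, because procmail compares bytes and a multi-byte character inside [...] would not behave as a single character.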