
Yara Rule - Regex - Matching Wildcard


Regex has always been somewhat of a black box for me.

I believe I need regex to write some of the following Yara rules. Yara rules use regex to match the execution of particular binaries within malware. Knowing Yara is not necessary to answer the question; all that matters is that the rules use regex.

I've got some basic rules down such as detection of the following programs:

    C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe
    C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe

With the following rules

    cuckoo.filesystem.file_access(/C\:\\Program\ Files\ \(x86\)\\Windows\ Kits\\10\\Debuggers\\x64\\cdb.exe/) or
    cuckoo.filesystem.file_access(/C\:\\Program\ Files\ \(x86\)\\Windows\ Kits\\10\\Debuggers\\x86\\cdb.exe/) or

Now I'm trying to detect execution of other binaries: any file whose path begins with C:\Program Files\ or C:\Program Files\Microsoft Office and ends with excel.exe.

Something like the following?

    cuckoo.filesystem.file_access(/C\:\\*\\Excel.exe/) or

I also need to detect dnx.exe; perhaps something like this would work:

    cuckoo.filesystem.file_access(/C\:\\*\\dnx.exe/) or

I also need to detect paths like:

    C:\Program Files\Microsoft Office\root\client\appvlp.exe

Here, root may be any specific user, so it would ideally be replaced with a wildcard.


Solution

  • Reading the Yara source, it seems to roll its own flavor of regex. Only basic constructs are supported:

    • Alternation (|)
    • Concatenation
    • Repetition (*, *?, +, +?, ?, ??, {digit*,digit*}, {digit*,digit*}?, {digit+})
    • Boundaries (\b, \B, ^, $)
    • Grouping ((, ))
    • Character classes (., \w, \W, \s, \S, \d, \D, [...], [^...])
    • Hex escapes (\xHH)
    • Normal escapes (\ + any special character)
    • Anything else is a literal or illegal

    It also supports the regex flags i (case-insensitive matching) and s (the dot also matches newlines) after the closing slash of the expression (/.../is).

    Please see Regular Expressions Quick Reference for an explanation of the different constructs. Keep in mind that only the ones listed above are supported by Yara.
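
One consequence of the list above is that * is a repetition operator applied to the preceding element, not a glob-style wildcard, so the \\* in the question's attempt means "zero or more backslashes". A minimal sketch of the difference follows, using Python's re module as a stand-in for Yara's engine (an assumption for illustration only; the constructs used here behave the same way in common regex flavors):

```python
import re

# The question's attempt: `\\*` is "zero or more backslashes",
# not "any run of characters" as `*` would be in a shell glob.
glob_style = re.compile(r"C\:\\*\\Excel.exe")

# `.*` (any character, repeated) gives the intended wildcard behaviour.
regex_style = re.compile(r"C:\\.*\\Excel\.exe")

path = r"C:\Program Files\Microsoft Office\Excel.exe"

print(bool(glob_style.search(path)))              # False
print(bool(regex_style.search(path)))             # True
print(bool(glob_style.search(r"C:\Excel.exe")))   # True: only backslashes sit between C: and Excel
```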


    To answer the question, to match Excel.exe under C:\Program Files or C:\Program Files\Microsoft Office or any subdirectory, you could use this:

    cuckoo.filesystem.file_access(/^C:\\Program Files\\(Microsoft Office\\)?(.*\\)?Excel\.exe$/i)
    
    • The ^ and $ are there to anchor the pattern to the start and end of the target string. You could try removing them if the pattern does not match.
    • The (Microsoft Office\\)? is redundant, since (.*\\)? would match any subdirectory under C:\Program Files. I included it to match the question.
    • (.*\\)? matches anything ending in a backslash (\), including more backslashes. I made it optional, to allow for files directly under C:\Program Files to match.
    • The dot (.) needs to be escaped (\.) to match a literal dot, since it is considered a special character.
    • The /i at the end makes the pattern case insensitive, to align with how Windows compares filenames.
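
The points above can be checked outside of Yara with a quick sketch. It assumes Python's re module as a stand-in for Yara's regex engine, with re.IGNORECASE playing the role of the trailing /i; the sample paths and the helper name matches_excel are made up for illustration.

```python
import re

# Same pattern as in the rule above; re.IGNORECASE stands in for Yara's /i flag.
EXCEL = re.compile(
    r"^C:\\Program Files\\(Microsoft Office\\)?(.*\\)?Excel\.exe$",
    re.IGNORECASE,
)

def matches_excel(path: str) -> bool:
    """Return True if the whole path matches the Excel pattern."""
    return EXCEL.search(path) is not None

# Hypothetical sample paths, chosen to exercise each part of the pattern.
samples = [
    r"C:\Program Files\Excel.exe",                                # directly under Program Files
    r"C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE", # nested, different case
    r"C:\Program Files (x86)\Microsoft Office\Excel.exe",         # different root directory
    r"C:\Program Files\NotExcel.exe",                             # no backslash right before "Excel"
    r"C:\Program Files\Excel.exe.bak",                            # rejected by the $ anchor
]

for sample in samples:
    print(matches_excel(sample), sample)
```

Only the first two samples match. The last three show what the literal prefix, the backslash inside (.*\\)? and the $ anchor each rule out.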

    To match dnx.exe anywhere under C:\, you could use this:

    cuckoo.filesystem.file_access(/^C:\\(.*\\)?dnx\.exe$/i)
    

    To match all three binaries in any directory under C:\:

    cuckoo.filesystem.file_access(/^C:\\(.*\\)?(Excel\.exe|dnx\.exe|appvlp\.exe)$/i)
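
As a rough check of the combined pattern, here is a sketch that again assumes Python's re module in place of Yara's engine. It also includes a narrower variant for the appvlp.exe case from the question, which is not part of the answer above: it uses [^\\]+ to stand for exactly one path component where the question has root. Whether Yara accepts that exact character class should be confirmed against its documentation. All sample paths are hypothetical.

```python
import re

# Combined pattern from the rule above; re.IGNORECASE stands in for /i.
ANY_OF_THREE = re.compile(
    r"^C:\\(.*\\)?(Excel\.exe|dnx\.exe|appvlp\.exe)$",
    re.IGNORECASE,
)

# Hypothetical narrower variant (not from the answer): [^\\]+ matches one
# path component, so "root" can be any single directory name.
APPVLP = re.compile(
    r"^C:\\Program Files\\Microsoft Office\\[^\\]+\\client\\appvlp\.exe$",
    re.IGNORECASE,
)

print(bool(ANY_OF_THREE.search(r"C:\dnx.exe")))                     # True
print(bool(ANY_OF_THREE.search(r"C:\Users\bob\.dnx\bin\DNX.EXE")))  # True
print(bool(ANY_OF_THREE.search(r"D:\tools\dnx.exe")))               # False: wrong drive

print(bool(APPVLP.search(r"C:\Program Files\Microsoft Office\alice\client\AppVLP.exe")))  # True
print(bool(APPVLP.search(r"C:\Program Files\Microsoft Office\a\b\client\appvlp.exe")))    # False: two components
```

The [^\\]+ form is stricter than (.*\\)? because it cannot cross a backslash, which keeps the wildcard confined to a single directory level.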