Regex has always been somewhat of a black box for me.
I believe I need regex to write some of the following YARA rules. YARA rules can use regex to match the execution of particular binaries by malware. You don't need to know YARA to answer the question, only that the rules use regex.
I've got some basic rules down such as detection of the following programs:
C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe
C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe
using the following rules:
cuckoo.filesystem.file_access(/C\:\\Program\ Files\ \(x86\)\\Windows\ Kits\\10\\Debuggers\\x64\\cdb.exe/) or
cuckoo.filesystem.file_access(/C\:\\Program\ Files\ \(x86\)\\Windows\ Kits\\10\\Debuggers\\x86\\cdb.exe/) or
But now I'm trying to detect execution of any binary whose path begins with C:\Program Files\ or C:\Program Files\Microsoft Office and ends with excel.exe.
Something like the following?
cuckoo.filesystem.file_access(/C\:\\*\\Excel.exe/) or
I also need to detect dnx.exe; perhaps something like this would work:
cuckoo.filesystem.file_access(/C\:\\*\\dnx.exe/) or
I also need to detect paths like:
C:\Program Files\Microsoft Office\root\client\appvlp.exe
where root may be any specific user, so it would ideally be replaced with a wildcard.
Reading the Yara source, it seems to roll its own flavor of regex. Only basic constructs are supported:
Alternation: |
Repetition: *, *?, +, +?, ?, ??, {digit*,digit*}, {digit*,digit*}?, {digit+}
Boundaries and anchors: \b, \B, ^, $
Grouping: (, )
Character classes: ., \w, \W, \s, \S, \d, \D, [...], [^...]
Hex escapes: \xHH
Escaped special characters: \ + any special character

It also supports the regex flags i and s after the end of the expression (/.../is).
Please see Regular Expressions Quick Reference for an explanation of the different constructs. Keep in mind that only the ones listed above are supported by Yara.
To answer the question, to match Excel.exe under C:\Program Files or C:\Program Files\Microsoft Office or any subdirectory, you could use this:
cuckoo.filesystem.file_access(/^C:\\Program Files\\(Microsoft Office\\)?(.*\\)?Excel\.exe$/i)
^ and $ are there to anchor the pattern to the start and end of the target string. You could try removing them if the pattern does not match.
(Microsoft Office\\)? is redundant, since (.*\\)? would match any subdirectory under C:\Program Files. I included it to match the question.
(.*\\)? matches anything ending in a backslash (\), including more backslashes. I made it optional, to allow for files directly under C:\Program Files to match.
The dot (.) needs to be escaped (\.) to match a literal dot, since it is considered a special character.
/i at the end makes the pattern case insensitive, to align with how Windows compares filenames.

To match dnx.exe anywhere under C:\, you could use this:
cuckoo.filesystem.file_access(/^C:\\(.*\\)?dnx\.exe$/i)
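If you want to sanity-check the two patterns above before putting them in a rule, a rough sketch in Python could look like this. It assumes Python's re module behaves the same as Yara's engine for these constructs, which I have not verified against the Yara source, and the sample paths are made up for illustration.

```python
import re

# Python translations of the two Yara patterns; the trailing /i becomes re.IGNORECASE.
EXCEL = re.compile(
    r"^C:\\Program Files\\(Microsoft Office\\)?(.*\\)?Excel\.exe$",
    re.IGNORECASE,
)
DNX = re.compile(r"^C:\\(.*\\)?dnx\.exe$", re.IGNORECASE)

# Hypothetical sample paths, not taken from any real Cuckoo report.
SAMPLES = [
    r"C:\Program Files\Excel.exe",
    r"C:\Program Files\Microsoft Office\root\Office16\EXCEL.EXE",
    r"C:\Users\alice\Downloads\excel.exe",
    r"C:\Tools\dnx\dnx.exe",
    r"C:\dnx.exe",
    r"D:\dnx.exe",
]


def matching(pattern, paths):
    """Return the paths that the compiled pattern matches."""
    return [p for p in paths if pattern.search(p)]


excel_hits = matching(EXCEL, SAMPLES)
dnx_hits = matching(DNX, SAMPLES)

print("Excel pattern:", excel_hits)
print("dnx pattern:  ", dnx_hits)
```

The Excel pattern should only pick up the two paths under C:\Program Files, and the dnx pattern should skip the copy on D:\ because of the ^C: anchor.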
To match all three binaries in any directory under C:\:
cuckoo.filesystem.file_access(/^C:\\(.*\\)?(Excel\.exe|dnx\.exe|appvlp\.exe)$/i)
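As a rough check of the combined pattern, and of why the glob-style attempt from the question does not behave like a wildcard, here is a small Python sketch. Again, this assumes Python's re agrees with Yara's engine for these constructs. The APPVLP pattern is my own hypothetical variant, not something from the rules above: it uses [^\\]+ to stand in for exactly one directory name where "root" appears, and the sample paths are invented.

```python
import re

# Combined pattern from the answer; Yara's trailing /i becomes re.IGNORECASE.
COMBINED = re.compile(
    r"^C:\\(.*\\)?(Excel\.exe|dnx\.exe|appvlp\.exe)$",
    re.IGNORECASE,
)

# Glob-style attempt from the question. In a regex, \\* means "zero or more
# backslashes", not "any directory".
GLOB_STYLE = re.compile(r"C\:\\*\\Excel.exe")

# Hypothetical variant (an assumption, not part of the answer's rules):
# [^\\]+ matches exactly one directory name in place of "root".
APPVLP = re.compile(
    r"^C:\\Program Files\\Microsoft Office\\[^\\]+\\client\\appvlp\.exe$",
    re.IGNORECASE,
)

# Made-up sample paths.
EXCEL_PATH = r"C:\Program Files\Microsoft Office\root\Office16\Excel.exe"
APPVLP_PATH = r"C:\Program Files\Microsoft Office\root\client\appvlp.exe"
NESTED_PATH = r"C:\Program Files\Microsoft Office\a\b\client\appvlp.exe"

results = {
    "combined_excel": bool(COMBINED.search(EXCEL_PATH)),
    "combined_appvlp": bool(COMBINED.search(APPVLP_PATH)),
    "glob_excel": bool(GLOB_STYLE.search(EXCEL_PATH)),
    "glob_root": bool(GLOB_STYLE.search(r"C:\Excel.exe")),
    "appvlp_one_level": bool(APPVLP.search(APPVLP_PATH)),
    "appvlp_two_levels": bool(APPVLP.search(NESTED_PATH)),
}

for name, matched in results.items():
    print(name, matched)
```

The glob-style pattern only matches when Excel.exe sits directly after one or more backslashes following C:, which is why (.*\\)? is used instead. Whether you want [^\\]+ (one directory level) or (.*\\)? (any depth) depends on how strict the rule should be.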