What is YARA?
When it comes to malware detection, there are three main ways of determining whether a file is malicious: hash signatures, heuristics and string signatures.
The most widespread method in antivirus systems is hash signature detection: compute the hash of a file, check it against a signature database and see whether the file has previously been flagged as malware. This kind of signature is useless against unknown malware, and evading it only takes recompiling the code on a different system or changing a single bit.
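To see why a hash signature is so fragile, here is a minimal Python sketch. The sample bytes and the one-entry "database" are made up for illustration, and SHA-256 is an assumption, since real products may use other hash functions.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical "known malware" sample (not a real binary).
original = b"\x7fELF fake payload that phones home"

# Hypothetical signature database holding the hash of that sample.
signature_db = {sha256_of(original)}


def is_known(data: bytes) -> bool:
    """Hash-signature check: is this exact file already in the database?"""
    return sha256_of(data) in signature_db


# Flip a single bit in the first byte and leave everything else untouched.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

print(is_known(original))  # the untouched sample is found
print(is_known(tampered))  # the one-bit variant is not
```

The two files differ by one bit, yet their digests have nothing in common, so the database lookup misses the variant.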
To counter these evasion methods, heuristic detection is usually the method of choice. It relies on the behaviour of the executable file and decides whether the file is malicious according to the actions it performs inside the system. The main issue is that, since many legitimate programs also perform suspicious-looking actions, it can generate a large number of false positives.
Last but not least, there is the method this article is about: string signatures. Instead of hashes, it uses text or binary strings that uniquely identify a malware sample. That way, even if the file has been tampered with, as long as it still contains those strings, analysts will be able to detect and classify the sample.
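The idea can be sketched in a few lines of plain Python. This is only a toy substring search, not how YARA is implemented, and the signature strings and sample bytes are invented for the example.

```python
def matching_signatures(data: bytes, signatures: list) -> list:
    """Return the signatures (text or binary) that occur anywhere in data."""
    return [sig for sig in signatures if sig in data]


# Hypothetical signatures: one text string and one binary sequence.
signatures = [b"evil_beacon_v1", bytes.fromhex("deadbeef")]

# Hypothetical sample that embeds both signatures.
sample = b"header" + b"evil_beacon_v1" + b"\xde\xad\xbe\xef" + b"trailer"

# A tampered variant: the header was rewritten, the payload was not.
tampered = b"HEADER-changed" + sample[len(b"header"):]

clean = b"just an ordinary file"

print(matching_signatures(tampered, signatures))  # both signatures survive
print(matching_signatures(clean, signatures))     # nothing matches
```

As long as the attacker leaves the identifying strings in place, the sample is still caught no matter what else changes in the file.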
YARA is a tool that was designed by Víctor Manuel Álvarez mainly for string signature based detection and classification of malware. And I say “mainly” because it can be used in different ways – man shall not live by malware alone.
Using relatively simple rules, YARA goes over the files that it receives and looks for the strings defined in the rule and, if they match certain conditions, it tells you what rules match each file. This can also be applied to running processes.
It is very easy to use. To install it, you just need to download the package for your system from the project’s website and, if you are on Linux, follow the install instructions (the link is for Ubuntu, but I’m sure you can adapt it to your favourite distro). On Windows it’s even easier: you just have to unzip the YARA executable and, for the YARA library, double click, next, next, finish.
One of the things I like the most about YARA: it has a Python library that allows you to integrate it very easily with your projects!
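As a rough idea of what that integration looks like, here is a minimal sketch using the yara-python package. The rule, its name, its tag and its strings are all hypothetical and written only for this example. The import is guarded so the snippet still runs, returning None, on a machine without the library.

```python
try:
    import yara  # provided by the yara-python package
except ImportError:
    yara = None  # library not installed; scan_bytes() will return None

# Hypothetical rule written for this example only.
RULE_SOURCE = r'''
rule hypothetical_example : demo
{
    meta:
        description = "Illustrative rule, not taken from the article"
    strings:
        $text = "suspicious string"
        $hex = { DE AD BE EF }
    condition:
        $text or $hex
}
'''


def scan_bytes(rule_source: str, data: bytes):
    """Return (rule name, tags, metadata) for each matching rule.

    Returns None when yara-python is not available.
    """
    if yara is None:
        return None
    rules = yara.compile(source=rule_source)
    return [(m.rule, m.tags, m.meta) for m in rules.match(data=data)]


print(scan_bytes(RULE_SOURCE, b"this buffer holds a suspicious string"))
print(scan_bytes(RULE_SOURCE, b"nothing to see here"))
```

Compiled rules can also be matched against a file path instead of an in-memory buffer, which is handy when scanning whole directories.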
YARA rules (Yeah! It rulez!)
To learn about YARA rules, let’s start with a simple example:
Dissecting the rule:
We create three text files to test the rule:
Analysing YARA’s output. The file rules.yar contains the rule, and the -s option shows the strings that matched:
We can see a match in the file test2.txt:
Although the documentation is very good, there are some undocumented things, like the ones presented by Julia Wolf (@foxgrrl) in the BerlinSides talk entitled Classifying malware with YARA. Unfortunately, I didn’t manage to find it published, so I guess you will only find those problems while fighting against the (YARA) rules.
YARA with binaries
To test YARA with binary files, I will set up a scenario. Let’s say we work for an evil corp called The Corporation. Inside The Corporation, people handle very sensitive patents, and in the past it has been the target of some… APTs! We, as part of The Corporation’s CSIRT, have to be ready to defend our beloved evil corp from possible attacks. Analyses of the previous attacks reveal that the entry point of every one of them was email, and that in all of them the names of administrative accounts appeared somewhere in the middle of the code.
As in any “good” organization, The Corporation follows a strict naming scheme, and all administrative accounts take one of two forms: either admin.<user> or corp.<user>. Furthermore, in the area we want to defend, we only have Linux systems.
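Before turning these two facts into a YARA rule, the detection logic can be sketched in plain Python: only look at Linux executables (ELF files start with the bytes 0x7F 'E' 'L' 'F') and search them for account names that follow the scheme. The set of characters allowed in <user> is my assumption, as the scenario does not specify it, and the sample bytes are invented.

```python
import re

# Every ELF file starts with these four magic bytes.
ELF_MAGIC = b"\x7fELF"

# admin.<user> or corp.<user>; the character set for <user> is an assumption.
ADMIN_ACCOUNT = re.compile(rb"\b(?:admin|corp)\.[A-Za-z0-9_-]+")


def find_admin_accounts(data: bytes) -> list:
    """Return administrative account names found inside an ELF file.

    Non-ELF data is ignored, since the area to defend only runs Linux.
    """
    if not data.startswith(ELF_MAGIC):
        return []
    return [m.group(0) for m in ADMIN_ACCOUNT.finditer(data)]


# Hypothetical fake binary: ELF header bytes followed by embedded strings.
fake_elf = (b"\x7fELF\x02\x01\x01\x00"
            + b"login as admin.jsmith\x00"
            + b"corp.backup\x00")
not_elf = b"MZ login as admin.jsmith"

print(find_admin_accounts(fake_elf))
print(find_admin_accounts(not_elf))
```

A YARA rule can express the same two checks declaratively, with the account pattern as a regular expression string and the file type test as part of the condition.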
In order to defend The Corporation, we set up a sniffer that captures all the email traffic, extracts the attachments and sends them to YARA.
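The attachment extraction step of that pipeline could look roughly like the following sketch, which relies only on Python's standard email package. It assumes the sniffer has already reassembled a complete raw message. The capture itself is not shown, and the addresses and file name are made up.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw_email: bytes) -> list:
    """Return (filename, content bytes) for every attachment in a raw message."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    return [(part.get_filename(), part.get_payload(decode=True))
            for part in msg.iter_attachments()]


# Build a hypothetical captured message so the example is self-contained.
demo = EmailMessage()
demo["From"] = "attacker@example.com"
demo["To"] = "victim@example.com"
demo["Subject"] = "Invoice"
demo.set_content("Please see the attached invoice.")
demo.add_attachment(b"\x7fELF fake payload",
                    maintype="application", subtype="octet-stream",
                    filename="invoice.bin")

attachments = extract_attachments(demo.as_bytes())
for name, content in attachments:
    print(name, len(content))
```

Each extracted byte string can then be handed to YARA for scanning, either through the Python library or by writing it to disk and calling the command line tool.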
We prepare the following rule to defend The Corporation:
Some fields that did not appear before are explained in the comments. To test the rule, we launch it against a malware simulation, which is nothing more than the following code once compiled:
We run YARA with our rule against it, using the options -s to show the matching strings, -g to show the tags and -m to show the metadata:
The rule has been executed successfully and it has caught our “malware”.
There are many ways you can use YARA and what I have just described is simply an example of what can be achieved with this tool. I encourage you all to give it a try and find out what you can do with YARA.