YaraRET (I): Carving with Radare2 & Yara

When handling forensic cases, we sometimes reach a dead end: a critical indicator of compromise has been detected, yet we have to start the analysis with only weak evidence.

That is why I decided to develop a carving tool based on Yara rule detection. The tool also had to handle raw files and perform a wide variety of operations on that data in a flexible way, so I decided to use Radare2.

This combination gave rise to YaraRET, a file carving tool written in Go. Its stable version is available in the YaraRules repository: https://github.com/Yara-Rules/YaraRET

The development version can be found in the following repository: https://github.com/wolfvan/YaraRET

This article walks through the resolution of a fictitious forensic case with YaraRET, based on a combination of several cases I have come across over the past few months.

The case

Imagine that we are sent a computer that, according to our information, has made a request to an APT33 domain. The incident does not seem to have been handled in the most suitable way, so we cannot rule out that an attacker has covered their tracks.

The machine is an industrial system running a version of Windows XP for embedded devices. It handles very sensitive information, so the client asks us to extract as much information as possible about any malware present on the computer, in order to fully eradicate the threat.

After a first look, we find a piece of malware, but it is generic and does not seem to be related to the request that triggered the forensic analysis. The machine's logs have rotated and we have no solid clues to hold on to.

Since the Yara Rules repository contains a wide variety of signatures, in desperation I decide to run the APT33 ruleset against the disk.

We find a match.

Brief comment:

When this case was written and presented at r2con2018, the main hypothesis was that the actor behind TRISIS was APT33. Now, as I write this article, new sources suggest that it could have been APT28. For the purpose of showing YaraRET and how it works, it makes no difference which one it was. Besides, it was surely the US.

The tool

At this point I decide to create a very simple tool that detects and extracts files by combining the matches from Yara malware rules with a second set of magic number rules created ad hoc.

When the tool runs, it executes the indicated Yara ruleset. If there is any match, it then runs the magic numbers ruleset built into the tool, defines the corresponding data structures and finally dumps the file.

To keep execution fast, the magic number search is limited to an interval, which is why the tool takes a maximum size parameter.
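The idea of locating a file around a Yara match can be illustrated with a minimal Python sketch. This is not YaraRET's code (the tool is written in Go and relies on Yara and Radare2). The `MAGIC_NUMBERS` table, the function names, the backwards search and the choice to dump exactly `max_size` bytes are all assumptions made for illustration only.

```python
# Hypothetical magic-number table (illustration only; YaraRET ships its
# own magic number rules, which are not reproduced here).
MAGIC_NUMBERS = {
    "pyc27": b"\x03\xf3\r\n",  # CPython 2.7 .pyc header
    "pe": b"MZ",               # DOS/PE executable
    "zip": b"PK\x03\x04",      # ZIP local file header
}


def find_header(data, match_offset, max_size):
    """Return (file_type, start) for the magic number closest to, and
    before, match_offset, looking back at most max_size bytes.
    Returns None when no known header lies inside that interval."""
    window_start = max(0, match_offset - max_size)
    window = data[window_start:match_offset]
    best = None
    for file_type, magic in MAGIC_NUMBERS.items():
        pos = window.rfind(magic)
        if pos == -1:
            continue
        start = window_start + pos
        if best is None or start > best[1]:
            best = (file_type, start)
    return best


def carve(data, match_offset, max_size):
    """Dump up to max_size bytes starting at the detected header.
    Assumption: the real end of the file is unknown, so the dump is
    simply capped at max_size."""
    header = find_header(data, match_offset, max_size)
    if header is None:
        return None
    file_type, start = header
    return file_type, data[start:start + max_size]
```

In this sketch, `match_offset` stands for the offset reported by a Yara match on the raw image. A smaller `max_size` makes the search faster, but a header that lies further back than that is missed.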

The result is as follows:

Perfect: we have obtained a pyc file which is indeed malware associated with APT33.

However, further analysis only tells us that it is one module of the malware.

Other data we can use are the indicators of compromise published for TRISIS. For this, YaraRET can parse IP addresses and domains and turn them into Yara rules, which it then runs against the disk in the same way as in the previous case.
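As an illustration of turning indicators into a rule, here is a small Python sketch. The helper names, the rule layout and the `ascii wide nocase` modifiers are my own choices for the example, not necessarily what YaraRET generates, and the indicators shown are reserved documentation values rather than real TRISIS IOCs.

```python
import re

# Simplistic IPv4 check, good enough to tell IPs from domains in this sketch.
IPV4_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")


def classify(ioc):
    """Label an indicator as 'ip' or 'domain'."""
    return "ip" if IPV4_RE.match(ioc) else "domain"


def build_rule(rule_name, iocs):
    """Build the text of a Yara rule that matches any of the indicators."""
    if not iocs:
        raise ValueError("at least one indicator is required")
    lines = ["rule %s" % rule_name, "{", "    strings:"]
    for i, ioc in enumerate(iocs):
        # Escape backslashes and quotes so the Yara string stays valid.
        escaped = ioc.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('        $%s_%d = "%s" ascii wide nocase'
                     % (classify(ioc), i, escaped))
    lines += ["    condition:", "        any of them", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder indicators (documentation ranges), not real IOCs.
    print(build_rule("iocs_example", ["192.0.2.10", "c2.example.com"]))
```

The resulting text can be saved to a file and used like any other Yara ruleset.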

Again, we obtain only one result, and it is the same file detected previously.

With this clue, we could pull the thread using other, more commonly used forensic tools. However, the communication software of the industrial system is based on pyc libraries, so the number of pyc files runs into the hundreds.

We needed some way to discard the legitimate files, which meant going beyond a “silly” extraction of files and applying some correlation.

Up to this point, the priority of the tool had been speed. However, since the tool needed to improve, I decided to spend some time at the beginning of the analysis defining the structures of all the desired files, so that actions could then be taken on those structures.

That is why I decided to develop a shell mode.

 Which we will see in the next chapter ;)
