Wednesday, February 25, 2015

Why Hybrid Analysis is not a marketing joke, but a useful technology

In 5 minutes you will know why Hybrid Analysis is useful - and not a marketing joke.

The case

As usual, we were checking reports uploaded to our online malware analysis service. Yesterday, we came across a report for a sample* that is actually not that interesting: it is a typical dropper. The only significant aspect of the file at first sight is that it is relatively small (only ~14 KB) and tries to leave as few traces on the system as possible. Nevertheless, since everyone deserves a second chance, we decided to take a closer look and see if we could find something to turn into a generic signature for malicious behavior. Generic signatures are great, because they apply to a broad variety of malware, including new variants. We have seen a lot of uploaded samples that were previously unknown to e.g. VirusTotal, but exhibited a lot of malicious behavior. Anyway, let's dive into the sample.

The first thing I always do is take a look at the signatures that matched. Then, I usually take a look at the network connections and the process tree of the analyzed processes. An obligatory check of the Hybrid Analysis section sometimes reveals quite interesting annotated disassembly listings (so-called "Streams"). Since we can build signatures that fire on any kind of data found in the report, we come across some goodies from time to time.

Hybrid Analysis in action

The following screenshot is taken from the heuristically determined "most relevant" function found with the Hybrid Analysis engine:


We can see a typical pattern used by malware authors to "hide" strings from string-searching algorithms by building/concatenating a string character by character, often saving it in a local variable on the stack. This is quite an effective method, because the "final string" is only assembled at runtime and does not persist in memory (i.e. even a process memory scan would not reveal the string, unless the stack/heap is snapshotted at just the right moment). Anyhow, these types of strings are usually API names that are passed to a GetProcAddress call to look up the associated virtual address.

Turn it into something useful

The idea we had is the following: if we detect a lot (maybe more than 10) single characters being pushed onto the stack and a reference to GetProcAddress/LdrGetProcedureAddress in the same function/context, then we can assume someone is trying to hide a procedure name lookup from string scanning engines. So we whipped up a signature that does exactly that. Here it is after updating our online service and re-running the sample:
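To make the heuristic concrete, here is a minimal Python sketch of the idea. It is not the signature code used in VxStream Sandbox. The textual instruction format, the regular expression, and the names `CHAR_STORE`, `RESOLVER_NAMES` and `looks_like_hidden_api_lookup` are all assumptions made for illustration.

```python
import re

# Assumed input: one function's disassembly as a list of text lines, e.g.
# "push 0x47" or "mov byte ptr [ebp-0x10], 0x47". This format is hypothetical.
CHAR_STORE = re.compile(
    r"^\s*(?:push|mov\s+byte\s+ptr\s+\[[^\]]+\],)\s*0x([0-9a-f]{2})\s*$",
    re.IGNORECASE,
)

# Resolver functions named in the post.
RESOLVER_NAMES = ("GetProcAddress", "LdrGetProcedureAddress")


def looks_like_hidden_api_lookup(instructions, threshold=10):
    """Return True if the function stores more than `threshold` printable
    single characters and also references a procedure-address resolver."""
    char_count = 0
    references_resolver = False
    for line in instructions:
        match = CHAR_STORE.match(line)
        # Count only immediates in the printable ASCII range.
        if match and 0x20 <= int(match.group(1), 16) <= 0x7E:
            char_count += 1
        if any(name in line for name in RESOLVER_NAMES):
            references_resolver = True
    return references_resolver and char_count > threshold


# Synthetic example: "VirtualAlloc" pushed one character at a time.
sample = [f"push 0x{ord(c):02x}" for c in "VirtualAlloc"]
sample.append("call dword ptr [GetProcAddress]")
print(looks_like_hidden_api_lookup(sample))
```

Requiring both conditions in the same function keeps the check generic while avoiding matches on code that merely pushes a few small constants.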


As we can see, there are enough indicators to conclude that the observed behavior is malicious. This generic signature will fire on any sample uploaded to our service that contains the same or a similar trick. If you are interested in the signature code itself and how it was implemented, please get in touch through our contact form.

Final Notes

In this blogpost we learned that "Hybrid Analysis" (the combination of static analysis on memory dumps/binary files with dynamic runtime data/context information) can add valuable indicators that would have otherwise never been available. That is one of the reasons why VxStream Sandbox can extract more artifacts/indicators to trigger behavior signatures on than most other systems on the market. This does not mean we think our system is the perfect solution, but the underlying technology is solid and we believe that we are developing our software in the right direction.

The full report for this sample: https://www.hybrid-analysis.com/sample/342f9acdb9b89e963761fea283daccf0c7cacaf513a46fd09d9cc89223b9d978/

*SHA256: 342f9acdb9b89e963761fea283daccf0c7cacaf513a46fd09d9cc89223b9d978

Sunday, February 22, 2015

Benchmarking some popular public malware analysis services regarding their "Anti-VM" technology

While checking submissions on our webservice we discovered that someone uploaded a "new" version of Pafish (by a0rtega). Pafish is a demo tool that performs typical anti-VM tricks in use by common and sophisticated malware. The new version of Pafish adds a lot of new VM and system trace checks, especially for VirtualBox. As is known, VirtualBox happens to be the default analysis environment of most sandboxes (including Cuckoo Sandbox's Malwr service and our own Hybrid-Analysis.com's free malware analysis service).

To be honest, the new version of Pafish did detect our virtual machine environment using some of the new methods - and it is impossible to prevent all types of detections ahead of time. More importantly, it is necessary to stay on top of the game and offer a software product that is agile and can adapt quickly. That is one of the principles we try to live up to and thus we always try to improve quickly and update VxStream Sandbox when necessary. A benchmarking tool like Pafish is a perfect development tool, because it's a very straightforward, comparable and easy way to stay on par with typical anti-VM methods. On a side note: the "new" release of Pafish is actually not that new, it was released at the beginning of this year - i.e. it's about two months old. One would think that well established and well known sandbox systems like Malwr, ThreatTrack or Comodo would have adapted by now - but to our surprise this is not the case.

First of all, this is how it should look if you run Pafish v0.4 (current state of VxStream Sandbox):



The green "OK" indicates that the specific check was passed and Pafish was not able to detect that it is running on a virtual machine. Here is one full report of VxStream Sandbox on our webservice (we copied the different console outputs into one screenshot to save space).

This is how the instance of Cuckoo Sandbox running on Malwr performed:



This is how Comodo Instant Malware Analysis performed:



The created "hi_" files indicate detection.

This is how ThreatExpert performed (it seems to be using VMware as its environment - and those checks are very old):



Sadly, Anubis failed to even parse the file, but in their defense, the service does not seem to be actively maintained:



Finally, let me quote from ThreatTrack's main page before finishing up with this blogpost:

"Our solutions detect the world's most sophisticated malware – including Advance Persistent Threats (APTs) and targeted attacks – and empower you to completely eliminate those threats from your network."

I am assuming that "the world's most sophisticated malware" does not include simple VM checks *grin*. Nevertheless, I would like to underline that we are not claiming to include the "best anti-VM technology" possible (and that's a big difference from other malware analysis vendors), but at least we try to get the basic homework done. Big vendors that claim they include "high-end technology" should be doing the same (taking care of their homework) and spend less money on marketing bla-bla. Of course, it is a bit unfair to mention Cuckoo Sandbox with its malware service Malwr in this context, because it is based on an open-source tool, but I added it for completeness' sake, as it is the most popular free malware analysis service available.

Update (03/23/15): One of the Cuckoo Sandbox authors complained to us last week that they never claimed to have anti-VM technology included in their sandbox system. That might be true for the sandbox system itself, but this blogpost focuses on public online services and their output only, not on the theoretical capabilities of the tools behind the services (which is why we have now updated the blogpost title, as it might have been a bit inaccurate before). The assumption that an online service demonstrates the latest and greatest version and capabilities of a sandbox system is a valid one in our eyes, because a user probably expects an online service to try to analyze malware as well as possible, at least when it comes to important aspects of malware analysis.

Tuesday, February 3, 2015

VxStream Sandbox and Hybrid-Analysis.com - Free Malware Analysis - Evolution

What's been happening?


This blogpost will focus on the evolution of the online webservice and the cool features we added over the past weeks, and demonstrate them on a couple of real-world samples.

In the middle of November last year (so about 10 weeks ago) the automated malware behavior analysis service at www.hybrid-analysis.com was released to the public and since then the delicate flower has been starting to blossom a bit.



So far we've had a bit more than 2000 analyses with ~1900 unique files and more than 25k behavior signatures matched, along with 50k page views and 6k unique sessions from 107 different countries accessing our service. The overall bounce rate is only 54% with 38% returning visitors, so we seem to be reaching a targeted audience. This is how the world map looks if you colorize countries by their frequency of access to the webservice (taken from Google Analytics):



We've also been noticing that people have been using our service more and more frequently during the "work days", so it is a good sign that people are utilizing our service at a professional level:

Also, we received quite a lot of feedback and feature suggestions, that we would like to present to you in the following. Our conclusion so far: we must be doing something right.

New Major Features


Let me start out by saying: we added a lot. So many features that we decided to address only the most important ones.

Supported File Types

Right from the beginning, we had a lot of documents/PDF files being uploaded that weren't supported at first, so we focused on extending the list of supported file types. Right now you can upload any of the following file types:

Documents (new!): .doc, .docx, .rtf, .xls, .xlsx, .ppt, .pptx, .pdf
Executables: any kind of Windows PE file (.exe, .scr, .dll, .pif, .com, etc.).

All of the file types are detected automatically, so the file can have any suffix; it will be ignored anyway. As some users also asked to upload their files in different archive formats, we added support for some common archive types. Right now, you can upload an archive, with or without the standard password ('infected'), in any of the following archive formats:

zip, 7z, xz, bzip2, gzip, tar, wim

We also added support for uploading multiple files in a single archive. For more information on the special syntax required, please get in touch with us using the contact form on our company webpage.
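Suffix-independent type detection is commonly done by looking at the leading "magic" bytes of a file. The following Python sketch illustrates that general technique for the formats listed above. It is an assumption about how such a check could look, not the detection logic of the service, and the names `MAGIC_SIGNATURES` and `detect_type` are made up for this example.

```python
# Well-known leading byte sequences for the formats mentioned in this post.
MAGIC_SIGNATURES = [
    (b"MZ", "dos-mz (possible Windows PE)"),
    (b"%PDF-", "pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2 (.doc/.xls/.ppt)"),
    (b"PK\x03\x04", "zip (also .docx/.xlsx/.pptx)"),
    (b"{\\rtf", "rtf"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x1f\x8b", "gzip"),
    (b"MSWIM\x00\x00\x00", "wim"),
]


def detect_type(data: bytes) -> str:
    """Guess a file type from its content, ignoring any file name suffix."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return label
    # POSIX tar archives carry the "ustar" marker at offset 257.
    if data[257:262] == b"ustar":
        return "tar"
    return "unknown"


print(detect_type(b"%PDF-1.4\n..."))  # content decides, not the suffix
```

A real implementation would validate further (for example, following the PE header offset after "MZ", or inspecting ZIP members to tell a .docx from a plain archive); the sketch only covers the first step.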

Extended Document Parsing

Of course, an analysis system could just open e.g. a Word document and simply watch what happens (network traffic, dropped files or new processes being created). Often though, this approach requires a potentially embedded exploit to trigger, so we added parsers that extract VBA macros or JavaScript embedded in PDF files and pipe the extracted data to our signature interface. This comes in handy especially if the document exploit doesn't trigger, because the shellcode/macro itself often contains valuable indicators already, even if it is obfuscated.

Here is an example of VBA macro extraction:



Improved YARA integration

One of our users creates YARA signatures based on extracted process memory strings, so we extended our YARA integration to run specifically on these kinds of strings to demonstrate the functionality. In fact, we now have a small set of YARA rules in place to classify RATs:




Parsing Screenshots using Optical Character Recognition (OCR)

Although this idea isn't new, we decided to add some simple OCR parsing of the screenshots our analysis system takes in order to demonstrate how easy it is to add a complex process into the existing system and achieve a result that - due to its generic character - will apply also to unknown samples in the future.


Brushed up the "Extracted Strings" section

Another nice addition was the brush-up of the extracted strings section, which now includes multiple tabs that list a pre-selected subset of all strings, strings extracted from screenshot parsing, strings extracted from a dropped file/the input sample (binary scan) or strings from the various analyzed processes. With the ability to download all memory-extracted strings we think the new "Extracted Strings" section adds more depth and overview. If you click the "Details" button, more information on the origin of the string (what type of file or event was the cause) is displayed:


Behavior Signatures

Our "daily business" is to add behavior signatures, as they trigger on a variety of different events and offer a quick overview and valuable indicators at the same time. Signatures can trigger on registry accesses, file operations, strings, created mutants, a specific API call, AV test results, extracted instructions from our disassembly "streams", and so on. We have been adding a lot of signatures since the service started (older reports might not include all the latest and greatest) and nearly doubled the number of signatures in place, with 180+ signatures active right now. If you come across a sample that shows behavior you believe is not being reflected by a signature (i.e. you think a specific signature is missing), just let us know and we will add it if possible.
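Conceptually, a behavior signature can be thought of as a named predicate over the data collected in a report. The Python sketch below illustrates that idea only. The report layout (a dict with `mutants` and `registry_keys` entries), the `Signature` class and the two example signatures are hypothetical and do not reflect the actual signature interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Signature:
    name: str
    matches: Callable[[dict], bool]  # predicate over a (hypothetical) report


# Two illustrative signatures over assumed report fields.
SIGNATURES = [
    Signature("Creates a mutant", lambda r: bool(r.get("mutants"))),
    Signature(
        "Touches an autostart registry key",
        lambda r: any(
            "\\currentversion\\run" in key.lower()
            for key in r.get("registry_keys", [])
        ),
    ),
]


def run_signatures(report: dict, signatures: List[Signature] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose predicate fires on the report."""
    return [sig.name for sig in signatures if sig.matches(report)]


report = {
    "mutants": ["Global\\example"],
    "registry_keys": ["HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run"],
}
print(run_signatures(report))
```

Keeping each signature as a small independent predicate is what makes it cheap to add new ones and to re-run them against existing reports.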

Summary


In this blogpost we presented the major features that were added to the service and sandbox system over the past weeks. Of course, we also silently added a lot of smaller features/visual improvements to the reports (e.g. the display on tablets/mobile phones) and improved the runtime monitor (e.g. better .NET sample loading and monitoring of system processes), but it would be out of the scope of this blogpost to list every tiny addition/change. Overall, we believe that our system is moving in the right direction, also based on the feedback we have been getting. We have a very ambitious roadmap for 2015 and will let you know when we reach our next milestone.

We hope you enjoyed this brief summary, continue using our free service and keep the feedback coming.

One last advertising side-note: If you are interested in purchasing the full version for an on-premise installation (the entire system is available as a standalone) and/or want to run your own private cloud service, please get in touch using our contact form and we will get back to you with more details.