
Archive for the ‘Nepenthes’ Category

Random Malware Analysis

2009/05/22 1 comment

Having recently been left with several hours to kill and nothing but a laptop and my virtual lab, I thought I’d try my hand at some rudimentary malware analysis. For a random live sample I selected the most recent submission to my Nepenthes server.

$ tail -n1 /opt/nepenthes/var/log/logged_submissions
[2009-05-21T19:10:59] 90.130.169.175 -> 195.97.252.143 creceive://90.130.169.175:2526 93715cfc2fbb07c0482c51e02809b937

To start with I wanted to get an idea of what I was dealing with, so I passed the file’s hash to VirusTotal’s Hash Search utility and promptly found that VirusTotal had no knowledge of this particular hash. This means we could be dealing with a completely new malware strain or variant or, more likely, a polymorphic binary creating a unique file signature…
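
Before searching on a hash it can be worth recomputing it from the captured file, to confirm the sample on disk is the one named in the log. The following is a minimal sketch, assuming the hash in logged_submissions is an MD5 of the file's contents; the function name and the path passed to it are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large samples need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading the file as bytes means the sample is never executed or interpreted, only hashed.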

The question was promptly answered by AVG when transferring the binary to my analysis machine: ‘Threat detected: worm/Allaple.b’. Not wanting to take the word of a single AV vendor, I proceeded to upload the binary itself to VirusTotal (have I mentioned I like VirusTotal yet?). Sure enough, most AV engines agreed with AVG’s analysis, although there was some dissension over which version of Allaple the sample was. Most AV engines (37/40) flagged the file as malicious; Comodo, nProtect and PrevX gave the binary a clean bill of health.

Beginning with some static analysis, the ‘strings’ utility is always a safe place to start. As I’m using a Windows platform for this analysis I use the SysInternals strings binary. This revealed little, other than confirming the binary is a Windows executable (the usual ‘!This program cannot be run in DOS mode.’ string), a reference to Kernel32.dll and some function names (FindFirstVolumeW, GetShortPathNameA, GetConsoleAliasesLengthW, AddConsoleAliasA, GetModuleHandleW, CreateProcessA, GetUserDefaultUILanguage, LocalReAlloc, SetHandleInformation, SetConsoleCursorInfo).
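
For anyone without a strings binary to hand, the core idea is simple enough to sketch. This is a rough approximation only, not the SysInternals implementation: it extracts runs of printable ASCII and ignores the UTF-16 strings that the real tool also reports. The function name and the four-character minimum are my own choices.

```python
import re


def ascii_strings(data, min_length=4):
    """Return runs of printable ASCII at least `min_length` bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Typical use would be to read the sample with open(path, "rb").read() and pass the bytes in; as above, the file is only read, never run.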

As there was limited information available from a plaintext strings search, my next step was to see if the binary had been packed. For this I used the PEiD utility. PEiD initially stated that there was ‘Nothing Detected’, although the entropy found within the file (7.93) caused PEiD to suggest that the binary had indeed been packed.
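
The entropy figure is easy to reproduce independently. The sketch below computes plain Shannon entropy over byte values, on a scale of 0 to 8 bits per byte; values approaching 8 are typical of compressed or encrypted data, which is why a high figure hints at packing. I don't know exactly how PEiD calculates its number, so treat this as a generic calculation rather than a reimplementation.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A file made of a single repeated byte scores 0, while one in which every byte value is equally likely scores 8.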

With some basic static analysis undertaken (this could/should have been taken further but my RE/assembly-fu is a bit rusty, especially at 3am) I changed tack and went with some initial behavioural analysis. For an initial run I utilised iDefense’s SysAnalyzer tool, written by David Zimmer. SysAnalyzer is a great utility for automating behavioural analysis and capturing system changes. From its download page:

SysAnalyzer is an automated malcode run time analysis application that monitors various aspects of system and process states.
SysAnalyzer was designed to enable analysts to quickly build a comprehensive report as to the actions a binary takes on a system.

The tool snapshots (not to be confused with VM snapshots) the state of the system, runs a given binary, then snapshots the system again after execution before comparing the two snapshots. This can provide some detailed, succinct information to an analyst, but may miss any dynamic and temporary system changes. One weakness (or strength, depending on your perspective) of SysAnalyzer is that it does not sandbox the malicious binary from the analysis system, meaning that if the binary is destructive it *will* hose the system it is being analysed on. Obviously, if you’re utilising virtualisation and snapshot functionality this shouldn’t be an issue.
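
The before-and-after comparison boils down to a set difference. The sketch below is purely illustrative and is not how SysAnalyzer is implemented; a snapshot here is assumed to be any collection of names (processes, files, registry keys) gathered by some other means.

```python
def diff_snapshots(before, after):
    """Compare two snapshots (any iterables of hashable items).

    Returns (added, removed) as sorted lists.
    """
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)
```

This also shows why temporary changes are missed: anything created and cleaned up between the two snapshots appears in neither set, so it never shows up in the difference.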

On starting the analysis, the malicious executable promptly errored (the usual Windows ‘executable has failed, please send all information to Microsoft’ type pop-up) and SysAnalyzer stated that the system was unchanged by the binary. Well, that was disappointing; possibly some form of VM detection causing the malware to shut down?

Not to be denied, I re-ran the process. Again the executable crashed with Microsoft’s pop-up, but this time SysAnalyzer saw some system changes, from API and registry calls to the creation of new processes. However, on further analysis the new processes and files were all related only to the DWWIN.exe executable which, as explained here, is part of Windows itself and is the cause of the pop-ups discussed above.

One aspect that may be causing the binary to fail is that it is isolated from the network. From experience, some malware will perform an initial lookup to an external resource; if the code can’t access said resource, the malware assumes it is on a closed system and shuts down. To test this theory I re-ran the executable (this time manually, without the SysAnalyzer utility) with Wireshark sniffing all network interfaces. As expected the binary crashed with the same error pop-up, but on reviewing the Wireshark capture no outbound traffic had been generated from the infected host, so a failed external lookup does not appear to be the trigger.

Another possible reason for malware to refuse to run is newer VM detection techniques. However, no evidence of this is present in the API calls captured by SysAnalyzer, nor can I find any reference to VM detection capabilities within the Allaple family from a search of the web. Ideally, to test this theory the malware would be executed on a natively installed OS to bypass any potential VM detection. Unfortunately at this stage I do not have the resources available to sacrifice a physical machine in this manner, so analysis must stop here.

One final possibility is simply that the binary is defective; just because the malware is spreading does not necessarily mean that the payload delivered upon exploitation is fully functional. It is not uncommon to have one malware strain being propagated by an entirely different strain. This is rapidly becoming more prevalent as ‘cybercrime’ (I hate that phrase) matures with the recent emergence of crimeware-as-a-service.

Whatever the reason for the binary failing to have any perceivable impact on the system, the behaviour observed during this sample’s execution does not match that expected from other analyses of the Allaple.b malware strain. Sophos’ analysis, for example, states that upon infection Allaple.b will:

  • When first run W32/Allaple-B copies itself to [system]\urdvxc.exe.
  • The W32/Allaple-B is registered as a COM object.
  • W32/Allaple-B installs itself as a service with the name “MSWindows”.

No evidence of this behaviour has been seen during analysis, nor are any of the changes present on the system post infection. This is a good example of why there isn’t always a need to panic when AV picks up a malicious item. Until the infection has been analysed in more depth there is no way of knowing how scary the compromise and infection is.

Andrew Waite


May SuperMondays (on a Tuesday)

For those that don’t know, I’m scheduled to give a presentation at the upcoming Super Mondays meeting next week. The topic of the presentation is malware honeypots; it is a follow-up to my original Honeypotting with Nepenthes post, and I’m hoping to discuss some statistics generated from my honeypot logs by my submissions2stats.py script.

The session will begin with a demonstration of some new technologies, including ambient kitchens and surface computers. Following this will be a presentation on cultural technology and HCI by Patrick Oliver and a presentation of meaningful technology and her work on digital jewellery by Jayne Wallace, before ending the night with my presentation.

Tickets are free and going fast, so register now via the event registration page to reserve your place.

It is shaping up to be a good night, so look forward to seeing you all there.

Andrew Waite

Categories: Nepenthes, SuperMondays

submissions2stats.py

Several days of working with the raw data and a couple of intermediate scripts (csv & mysql) have paid off. I’m now ready to release the first version of Infosanity’s Nepenthes log parser.

This utility is substantially larger than my previous two releases (although still small) so I’ll not include source code here; head to Infosanity for the submissions2stats.py file. Usage is fairly simple: read the logged_submissions file into stdin and let the script do its job.

Statistics are quite general at this stage, mainly compiling overall statistics from the log file including:

  • Total number of submissions
  • Number of unique malware samples (based on MD5 hashes)
  • Number of unique source IPs
  • Run time
  • Average daily submissions
  • Five most recent submissions

By default the script outputs plaintext to standard out, but this can be changed to HTML via the --output=html command-line flag.
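
To give a feel for how statistics like these fall out of the log, here is a minimal sketch. It is not the submissions2stats.py source; the function names are hypothetical, and the field layout is assumed from the sample logged_submissions line shown earlier in this archive (timestamp, source IP, ‘->’, destination IP, download URL, MD5). Treating any run shorter than a day as one day for the daily average is my own assumption.

```python
from datetime import datetime


def parse_submission(line):
    """Split one logged_submissions line into a dict, or return None."""
    fields = line.split()
    if len(fields) != 6 or fields[2] != "->":
        return None  # blank or unexpected line
    try:
        when = datetime.strptime(fields[0].strip("[]"), "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None
    return {"when": when, "source_ip": fields[1], "url": fields[4], "md5": fields[5]}


def summarise(lines, recent=5):
    """Compute overall statistics from an iterable of log lines."""
    entries = [e for e in map(parse_submission, lines) if e is not None]
    if not entries:
        return None
    entries.sort(key=lambda e: e["when"])
    run_time = entries[-1]["when"] - entries[0]["when"]
    # Assumption: count anything under one day as a full day.
    days = max(run_time.total_seconds() / 86400.0, 1.0)
    return {
        "total": len(entries),
        "unique_md5": len({e["md5"] for e in entries}),
        "unique_ips": len({e["source_ip"] for e in entries}),
        "run_time": run_time,
        "daily_average": len(entries) / days,
        "most_recent": entries[-recent:],
    }
```

Passing sys.stdin as the lines argument would match the read-from-stdin usage described above.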

I’m going to hold back releasing any example output from my own servers as I wanted to generate the statistics for use in an upcoming presentation I’m giving for local group Super Mondays. If you’re free and in the area (Newcastle, UK) on May 26th please stop by for the event and to say hi.

If you’re running a Nepenthes server I’d appreciate any feedback or issues running the script. I’m still looking to flesh the system’s capabilities out, so any suggestions/requests for additional features or statistics would be appreciated (contact(no-spam)[at]infosanity[dot]co[dot]uk ).

Andrew Waite

N.B. The latest versions of all Infosanity tools related to statistic generation for Nepenthes can be found here.

Categories: Nepenthes, Python

submissions2mysql.py

Utility script in a similar vein to submissions2csv.py, the script reads Nepenthes’ logged_submissions file from stdin and dumps the information into a MySQL database table.

Initially this serves the same purpose as its CSV counterpart: importing the data into a system with powerful search and filter functionality. However, it may also be useful if you want to work with the data in more complex tools, as SQL databases form powerful backends and can be manipulated easily from almost any programming language.
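
As an illustration of the kind of question that becomes a one-liner once the data is in SQL, the sketch below counts submissions per sample. To keep it self-contained it uses Python's built-in sqlite3 module with an in-memory table as a stand-in for MySQL; that substitution and the helper names are my own assumptions, while the column names follow the submissions table this script fills.

```python
import sqlite3


def top_samples(connection, limit=5):
    """Return (md5, count) rows for the most frequently submitted samples."""
    cursor = connection.execute(
        "SELECT MD5, COUNT(*) AS hits FROM submissions "
        "GROUP BY MD5 ORDER BY hits DESC, MD5 LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


def demo_connection(rows):
    """Build an in-memory stand-in table from (date, time, ip, url, md5) rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE submissions "
        "(logdate TEXT, logtime TEXT, ip TEXT, url TEXT, MD5 TEXT)"
    )
    connection.executemany("INSERT INTO submissions VALUES (?, ?, ?, ?, ?)", rows)
    return connection
```

The same SELECT should work against MySQL; with the MySQLdb module the placeholder would be %s rather than ?.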

(again, apologies for formatting. I’m working on a resource repository for code and tools, hopefully available soon)

UPDATE: Code available from InfoSanity

#!/usr/bin/python
import sys
import MySQLdb

#
# Reads Nepenthes logged_submissions file and inserts data to mysql table
#

# connect to database
db = MySQLdb.connect(host="localhost", user="neplog", passwd="neplog123", db="nepenthes")

# create cursor
cursor = db.cursor()

# read from stdin, one log entry per line
for line in sys.stdin:
      # split on whitespace; this also drops the trailing newline from the MD5
      logData = line.split()
      if len(logData) < 6:
              continue  # skip blank or malformed lines

      timestamp = logData[0].strip('[]')
      date = timestamp.split('T')[0]
      time = timestamp.split('T')[1]
      sourceIP = logData[1]
      sourceMalware = logData[4]
      malwareMD5 = logData[5]

      # Insert row; a parameterised query leaves quoting to the driver
      cursor.execute("insert into submissions values (%s, %s, %s, %s, %s)", (date, time, sourceIP, sourceMalware, malwareMD5))

# commit so the rows persist on transactional storage engines
db.commit()
db.close()

Database creation (I’m sure this can be improved, but it works):

CREATE TABLE `submissions` (
`logdate` date default NULL,
`logtime` time default NULL,
`ip` char(15) default NULL,
`url` varchar(64) default NULL,
`MD5` char(32) default NULL
)

Andrew Waite

Categories: InfoSec, Nepenthes, Python

submissions2csv.py

2009/05/02 Comments off

Whenever I’m analysing large amounts of data I prefer to start the analysis within a spreadsheet, as I find the built-in capabilities invaluable for some quick and dirty data diving. This typically allows for a good overall understanding of the data set and available statistics without spending time coding before the required statistics are fully understood. A prime example of this was the data I analysed from wireless connections. In this scenario the existing tools are very helpful: airodump-ng’s standard output format is csv, making importing the captured data to a spreadsheet straightforward.

Unfortunately, the same ease of data transfer is not available when working with the logs generated by Nepenthes. To aid this I’ve coded a small python script to read the logged_submissions log file and output the interesting data in .csv format. Admittedly the script is nothing special and can likely be improved on as my coding skills are a bit rusty, but this may be useful to others, or provide a starting point in similar situations.

(n.b. apologies for the rendering, I’m working on it. In the meantime cut&paste is the quick and dirty way to view all code.)

UPDATE: Code downloadable from InfoSanity

#!/usr/bin/python
import sys

#
# Reads Nepenthes logged_submissions file and outputs data as comma-separated values
#
# Typical usage:
#   cat logged_submissions | submissions2csv.py > outputfile.csv
#
# Author: Andrew Waite (aka RoleReversal)
# http://www.infosanity.co.uk
#

# write headers to stdout
sys.stdout.write("Date,Time,Source IP Address,Malware Source,Malware MD5\n")

# read from stdin
while 1:
    line = sys.stdin.readline()
    if not line:
        break

    logData = line.split(' ')

    timestamp = logData[0].strip('[]')
    date = timestamp.split('T')[0]
    time = timestamp.split('T')[1]
    sourceIP = logData[1]
    sourceMalware = logData[4]
    malwareMD5 = logData[5]

    # malwareMD5 still carries the line's trailing newline, which ends the CSV row
    out = "%s,%s,%s,%s,%s" %(date, time, sourceIP, sourceMalware, malwareMD5)
    sys.stdout.write(out)

Hopefully some will find this useful. More nepenthes statistics to come.

Andrew Waite

P.S. thanks to Python.com and O’Reilly for providing good on-line references used in the coding of this tool.

Categories: Malware, Nepenthes, Python

Honeypotting with Nepenthes

If you’ve got an interest in information security, then there is a good chance that you’ve got a good handle on malware in all its (in)glorious forms. The books, articles and war stories are nice, interesting and can result in some improved knowledge, but to get a real feel for malware nothing beats live samples. Best way to get live samples? Get infected! To manage this without bringing your network and organisation to its knees, best practice is a honeypot, in one (or more) of its various forms.

For exactly this purpose I’ve been running the Nepenthes application for around 10 months. Nepenthes is a low interaction honeypot which emulates several known vulnerabilities across multiple services in an attempt to capture live malware samples as it is ‘exploited’. The Nepenthes services advertise known vulnerabilities, emulate service interaction to the point of exploit and finally store the shellcode/binary provided by the malicious system.

If my honeypot system is any indication, these systems will and do get pounded heavily by prospective intruders; over the lifetime of my honeypot system I have collected in excess of 850 unique malware samples. In fact, when the system was first installed it captured its first malicious binary (in this case an IRC bot) within 30 minutes of gaining a live network connection.

Nepenthes has the ability to automate a fair chunk of the analysis process by automatically submitting any collected binaries to one of several sandboxes (for example the Norman Sandbox). This can provide analysts with an immediate indication as to the type of malware being dealt with, and perhaps most significantly prevent analysts from utilising resources analysing essentially the same binary/malware. One word of caution however is that the submit process does not always work 100% (this hasn’t been investigated in too much detail, could be Nepenthes, could be the sandboxes not accepting/reviewing the file, could be the winds of fate. As with many things, your mileage may vary.)

As an example of the interactions and logging processed by Nepenthes, below is a log snippet of a malware sample that has just (literally) ‘exploited’ my honeypot. (N.B. IPs edited to protect the guilty):

[12042009 16:36:51 warn module] Unknown NETDDE exploit 76 bytes State 1
[12042009 16:36:51 warn module] Unknown SMBName exploit 0 bytes State 1
[12042009 16:36:51 info handler dia] Unknown DCOM request, dropping
[12042009 16:36:57 info sc handler] i = 1 map_items 2 , map = port
[12042009 16:36:57 info sc handler] bindfiletransfer::amberg -> 9988
[12042009 16:36:57 info sc handler] bindfiletransfer::amberg -> w.x.y.z:9988
[12042009 16:36:57 info down mgr] Handler creceive download handler will download creceive://w.x.y.z:9988/0
[12042009 16:37:12 info mgr submit] File 9604e9c99768c5cd2deb108935356196 has type MS-DOS executable PE for MS Windows (GUI) Intel 80386 32-bit

VirusTotal analysis of this file (MD5 hash: 9604e9c99768c5cd2deb108935356196) indicates it is a member of the Rbot family of malware. When working with and investigating the malware collected by Nepenthes I have found the VirusTotal Hash Search feature to be particularly useful, as it allows analysts to search VirusTotal’s extensive database and gain analysis of the file in question purely from the binary’s hash value. This means that you don’t need to transfer the binary itself between systems to upload to VirusTotal for actual analysis, removing the potential for an unintended double-click causing havoc on a network. And if VirusTotal hasn’t seen the file in question you may have something new and exciting to analyse yourself (or an old polymorphic binary….)
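
The hashes to search for can be pulled straight out of the Nepenthes log rather than copied by hand. The sketch below is a rough example that assumes every submission is logged in the same ‘File <md5> has type <description>’ form as the last line of the snippet above; the pattern and function names are hypothetical, and other Nepenthes versions or configurations may log differently.

```python
import re

# Matches e.g. "[12042009 16:37:12 info mgr submit] File <md5> has type <description>"
SUBMIT_LINE = re.compile(
    r"^\[(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}) [^\]]*\] "
    r"File (?P<md5>[0-9a-f]{32}) has type (?P<filetype>.+)$"
)


def captured_files(lines):
    """Yield (date, time, md5, filetype) for each 'File ... has type' line."""
    for line in lines:
        match = SUBMIT_LINE.match(line.strip())
        if match:
            yield match.group("date", "time", "md5", "filetype")
```

Lines that don't match, such as the ‘warn module’ and ‘info sc handler’ entries, are simply skipped.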

The downside of using a low interaction honeypot like Nepenthes is that you are not going to be collecting on the bleeding edge. Because Nepenthes emulates known vulnerabilities, the vulnerabilities in question need to be known and coded into Nepenthes before it will collect any malware exploiting them. For instance, despite all the recent hype and media attention this honeypot system has not captured any sample of Conficker/DownAdUp. However, as most new malware will still utilise old vulnerabilities to increase potential targets this isn’t a major limitation (Conficker was somewhat unique in that it originally limited itself to the ms08-067 vulnerability, before expanding its repertoire with subsequent variants.)

Honeypots (of any variety) also provide a good return on investment even in environments where the analysis of malware isn’t a primary (or even secondary) concern. As the honeypot server has no legitimate services, the only traffic targeted at the honeypot should be malicious. Placed externally, this can provide an early warning system for attacks that eventually target legitimate systems and can give system administrators a better indication of the types and frequency of attacks that will be directed at live services. Placed internally, honeypots can help identify any internal infections, as compromised systems sweep the internal networks for other vulnerable hosts and trigger the honeypot. These logs can also help identify the root cause of any infection and potentially the initial infection vector.

Ultimately honeypot systems of all varieties have a myriad of beneficial uses. There is an enormous wealth of high quality information available from the various honeypot organisations, for example Shadowserver, the Honeynet Project and Carnivore.IT (home of Nepenthes).

–Andrew Waite

‘If you know your enemy and know yourself, you need not fear the result of a hundred battles’ – Sun Tzu