If you’ve got an interest in information security, then there is a good chance that you’ve got a good handle on malware in all it’s (in)glorious forms. The books, articles and war stories are nice, interesting and can result in some improved knowledge but to get a real feel for malware nothing beats live samples. Best way to get live samples? Get infected! To manage this without bringing your network and organisation to it’s knees best practice is a honeypot, in one (or more) of it’s various forms.
For exactly this purpose I’ve been running the Nepenthes application for around 10 months. Nepenthes is a low interaction honeypot which emulates several known vulnerabilities across multiple services in an attempt to capture live malware samples as it is ‘exploited’. The Nepenthes services advertise known vulnerabilities, emulate service interaction to the point of exploit and final store the shellcode/binary provided by the malicious system.
If my honeypot system is any indication, these systems will and do get pounded heavily from prospective intruders, over the lifetime of my honeypot systm I have collected in excess of 850 unique malware samples. In fact when the system was first installed it captured it’s first malicious binary within 30 minutes of gaining a live network connection (in this case an IRC bot).
Nepenthes has the ability to automate a fair chunk of the analysis process by automatically submitting any collected binaries to one of several sandboxes (for example the Norman Sandbox). This can provide analysts with an immediate indication as to the type of malware being dealt with, and perhaps most significantly prevent analysts from utilising resources analysing essentially the same binary/malware. One word of caution however is that the submit process does not always work 100% (this hasn’t been investigated in too much detail, could be Nepenthes, could be the sandboxes not accepting/reviewing the file, could be the winds of fate. As with many things, your mileage may vary.)
As an example of the interactions and logging processed by Nepenthes, below is a log snippet of a malware sample that has just (literally) ‘exploited’ my honeypot. (N.B. IPs edited to protect the guilty):
[12042009 16:36:51 warn module] Unknown NETDDE exploit 76 bytes State 1
[12042009 16:36:51 warn module] Unknown SMBName exploit 0 bytes State 1
[12042009 16:36:51 info handler dia] Unknown DCOM request, dropping
[12042009 16:36:57 info sc handler] i = 1 map_items 2 , map = port
[12042009 16:36:57 info sc handler] bindfiletransfer::amberg -> 9988
[12042009 16:36:57 info sc handler] bindfiletransfer::amberg -> w.x.y.z:9988
[12042009 16:36:57 info down mgr] Handler creceive download handler will download creceive://w.x.y.z:9988/0
[12042009 16:37:12 info mgr submit] File 9604e9c99768c5cd2deb108935356196 has type MS-DOS executable PE for MS Windows (GUI) Intel 80386 32-bit
VirusTotal analysis of this file (MD5 hash: 9604e9c99768c5cd2deb108935356196) indicates it is a member of the Rbot family of malware. When working with and investigating the malware collected by Nepenthes I have found the VirusTotal Hash Search feature to be particularly useful as it allows analysts the ability to search VirusTotal’s extensive database to gain analysis of the file in question purely from the binary’s hash value. This means that you don’t need to transfer the binary itself between systems to upload to the VirusTotal for actual analysis, removing the potential for an unintended double-click causing havok on a network. And if VirusTotal hasn’t seen the file in question you may have something new and exciting to analyse yourself (or an old polymorphic binary….)
The downside of using a low interaction honeypot like Nepenthes is that you are not going to be collecting on the bleeding edge. As the process suggests, as Nepenthes emulates known vulnerabilities, the vulnerabilities in question need to be known and coded into Nepenthes before it will collect any malware exploit the vulnerability. For instance, dispite all the recent hype and media attention this honeypot system as not captured any sample of Conficker/DownAdUp. However, as most new malware will still utilise old vulnerabilities to increase potential targets this isn’t a major limiration (Conficker was somewhat unique in that it originally limited itself to the ms08-067 vulnerability, before expanding it’s repertoire with subsequent variants.)
Honeypots (of any variety) also provide a good return on investment even in environments where the analysis of malware isn’t a primary (or even secondary) concern. As the honeypot server has no legitimate services then the only traffic targetted at the honeypot should be malicious. Placed externally, this can provide an early warning system for attacks that eventually target legitimate systems and can give system administrations a better indication of the types and frequency of attacks that will be directed at live services. Placed internally they can help identify any internal infections, as compromised systems sweep the internal networks for other vulnerable hosts and trigger the honeypot. These logs can also help identify the root cause of any infectiona and potentially the initial infection vector.
Ultimately honeyput systems of all varieties have a myriad of beneficial uses. There is an enormous wealth of high quality information available from the various honey pot organisations, for example Shadowserver, the Honeynet Project and Carnivore.IT (home of Nepenthes).
– <a href=”http://infosanity.wordpress.com/about/bio-andrew-waite/”>Andrew Waite</a>
‘If you know your enemy and know yourself, you need not fear the result of a hundred battles’ – Sun Tzu