Archive for the ‘malware’ Category

Book Review: Virtual Honeypots

It took longer than I had wanted, but I have just finished reading through Virtual Honeypots: From Botnet Tracking to Intrusion Detection. The book is written by Niels Provos, creator of HoneyD (among other things) and Thorsten Holz.

Given the authors I had high expectations when the delivery came through; thankfully it didn’t disappoint. Unsurprisingly, the first chapter provides an overview of honeypotting in general, covering high and low interaction systems on both physical and virtual platforms. The chapter also introduces some core tools for your toolkit.

The next two chapters cover high and low interaction honeypots respectively. I really liked the coverage of hi-int honeypots. It was this idea that drew me towards honeypots in the first place; the idea of watching an attacker carefully exploit and utilise a dummy system has always appealed. The material provided gives a great foundation for starting with a high interaction honeypot, along with some best practice advice for how to do so securely and safely. While I have read many reports and case studies that involved honeypots, I have had difficulty finding in-depth setup information and advice, leaving high interaction honeypots feeling a bit like black magic. The authors’ information cuts through all the mystery, allowing the reader to get a firm understanding of the topic. The discussion of low-interaction honeypots was equally well covered, although as I’ve spent some time with low-int systems in the past this chapter was more of a refresher than the new information I had found in the hi-int section.

Given that Niels is one of the book’s authors, it shouldn’t be too much of a surprise that HoneyD is covered in depth. For me, this was the most useful section of the book. As HoneyD is one of the older publicly available low-int systems I had mistakenly assumed that one of the newer systems would provide more functionality. After reading through the material and regularly going ‘ooh’ out loud, HoneyD is now firmly at the top of my ‘need to implement’ list.

The book also covers honeypot systems that are designed for specialised purposes. For malware collection, the authors mainly focus on Nepenthes, but also touch on Honeytrap among others. This was the only section that I found to be slightly dated, as Nepenthes’ newly released spiritual successor Dionaea was not covered. But the fundamental material is very well explained, Nepenthes is still a very functional system, and the inherent similarities between Nepenthes and Dionaea mean the material is still useful. The chapter therefore still provides an excellent foundation if you’re wanting to start collecting malware.

An interesting chapter covers hybrid honeypots: the idea of using low-int systems to monitor and handle the bulk of traffic, while forwarding anything unknown or unusual to a high-int system for more in-depth analysis of the attack traffic. Unfortunately, at this point openly available hybrid systems are limited, with the more functional systems being kept closed by the researchers and companies that build them. (That said, while looking for a good link for hybrid systems I have just found Honeybrid, which I wasn’t aware of. Looks promising…)

The last chapter covering honeypot systems looks at client-side honeypots, designed to look for client-side attacks. As client-side attacks have become more prominent over the last few years this is an evolving area of research, but as the attack vector is newer than traditional attacks, the honeypot systems aren’t as mature as more traditional systems. This isn’t an area that I’m experienced with so I can’t comment too much on the systems detailed by the authors, but they cover several honeyclient systems in great detail, and I’m intending to use the chapter as a foundation for implementing the systems and techniques proposed.

As well as detailing the use of honeypot systems, the authors also provide a brilliant discussion of ways that attackers (or users) can determine that they are interacting with a honeypot system. While the detailed descriptions of ways to identify a honeypot system are interesting and important from a theoretical standpoint, my previous experience running honeypot systems suggests there are more than enough attackers and automated threats that blindly assume the system is legitimate for honeypots to still provide plenty of benefit to their administrators.

The book finishes up with a fairly detailed discussion of both tracking botnets using the information gathered from honeypot systems (this chapter is available as a sample PDF download thanks to InformIT, here) and analysing the malware sample reports provided by CWSandbox. While both chapters are useful in the context of honeypot systems, I didn’t think there was enough room to provide the reader with anything beyond a general overview of the topics. If you are interested enough in the topic to purchase the book, you will likely already have a level of understanding similar to the information provided.

There is also a chapter covering case studies of actual incidents that were captured by the book’s authors during their research. I’ve always been a fan of case studies, so I enjoyed this chapter; it definitely helps whet the appetite to implement the technologies covered by the book.

Overall I really enjoyed the book. If you’re interested in systems and network monitoring, honeypots or malware then this book should probably be on your bookshelf.

Andrew Waite

Categories: honeypot, infosec, malware

Fuzzy hashing, memory carving and malware identification

I’ve recently been involved in a couple of discussions about different ways of identifying malware. One of the possibilities that has been brought up a couple of times is fuzzy hashing, intended to locate files based on similarities to known files. I must admit that I don’t fully understand the maths and logic behind creating fuzzy hash signatures or comparing them. If you’re curious, Dustin Hurlbut has released a paper on the subject; Hurlbut’s abstract does a better job of explaining the general idea behind fuzzy hashing:

Fuzzy hashing allows the discovery of potentially incriminating documents that may not be located using traditional hashing methods. The use of the fuzzy hash is much like the fuzzy logic search; it is looking for documents that are similar but not exactly the same, called homologous files. Homologous files have identical strings of binary data; however they are not exact duplicates. An example would be two identical word processor documents, with a new paragraph added in the middle of one. To locate homologous files, they must be hashed traditionally in segments to identify the strings of identical data.

I have previously experimented with a tool called ssdeep, which implements the theory behind fuzzy hashing. To use ssdeep to find files similar to known malicious files you can run ssdeep against the known samples to generate signature hashes, then run ssdeep against the files you are searching, comparing them with the previously generated signatures.
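To make the "hashed in segments" idea from the abstract a little more concrete, here is a minimal Python sketch. It is not ssdeep's algorithm and does not use ssdeep at all; as a simplifying assumption it treats each line of a file as a segment, hashes the segments individually and reports how many segment hashes two inputs share. The function names and sample documents are made up for illustration.

```python
import hashlib


def segment_hashes(data: bytes) -> set:
    """Hash each non-empty line separately (lines stand in for 'segments')."""
    return {hashlib.md5(line).hexdigest() for line in data.splitlines() if line.strip()}


def similarity(a: bytes, b: bytes) -> float:
    """Share of segment hashes the two inputs have in common (0.0 to 1.0)."""
    hashes_a, hashes_b = segment_hashes(a), segment_hashes(b)
    if not hashes_a and not hashes_b:
        return 1.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


# Two 'homologous' documents: identical apart from one paragraph added in the middle.
original = b"para one\npara two\npara three\npara four\n"
modified = b"para one\npara two\nnew paragraph\npara three\npara four\n"

# A traditional whole-file hash can only say that the files differ...
whole_file_match = hashlib.md5(original).hexdigest() == hashlib.md5(modified).hexdigest()

# ...while the segment comparison still finds four of the five segments in common.
score = similarity(original, modified)
print(whole_file_match, score)
```

Real fuzzy hashing tools choose segment boundaries from the content itself rather than from line breaks, so they also cope with binary files; the sketch only shows why comparing segments survives a small edit when a single whole-file hash does not.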

One scenario I’ve used ssdeep for in the past is trying to group malware samples collected by malware honeypot systems based on functionality. In my attempts I haven’t found this to be a promising line of research: as different malware can have the same or similar functionality, most of the samples showed a high level of comparison whether actually related or not.

Another scenario I had tried was running ssdeep against a clean WinXP install containing a malicious binary. In the tests I ran I didn’t find this to be a useful process: given the disk capacity available to modern systems, running ssdeep against a large HDD can be time consuming. It can also generate a good number of false positives when run against the OS.

After recently reading Leon van der Eijk’s post on malware carving I have been mulling a method for combining techniques to improve fuzzy hashing’s ability to identify malicious files, while reducing the number of false positives and workload required for an investigator. The theory was that, while any unexpected files on a system are not desirable, if they aren’t running in memory then they are less threatening than those that are active.

To test the theory I infected an XP SP2 victim with a sample of Blaster that had been harvested by my Dionaea honeypot and dumped the RAM following Leon’s methodology. Once the image was dissected by Foremost I ran ssdeep against the extracted resources. ssdeep successfully identified the malicious files with a 100% comparison to the malicious sample. So far so good.

Given my previous experience with ssdeep I ran a control test, repeating the procedure against the dumped memory of a completely clean install. Unsurprisingly the comparison did not find a similar 100% match; however, it did falsely flag several files and artifacts with a 90%+ comparison, so there is still a significant risk of false positives.

From the process I have learnt a fair deal (reading and understanding Leon’s methodology was no comparison to putting it into practice) but don’t intend to utilise the methods and techniques attempted in real-world scenarios any time soon. Similar, and likely faster, results can be achieved by following Leon’s process completely and running the files carved by Foremost against an anti-virus scan.

Being able to test scenarios similar to this was the main reason for me to build up my test and development lab, which I have described previously. In particular, if I had run the investigation on physical hardware I would likely not have rebuilt the environment for the control test with a clean system, losing the additional data for comparison. Virtualisation snapshots made re-running the scenario trivial.

–Andrew Waite

P.S. Big thanks to Leon for writing up the memory capture and carving process used as a foundation for testing this scenario.

Analysis: Honeypot Datasets

Earlier this week Markus released two anonymised data sets from live Dionaea installations. The full write-up and data sets can be found on the newly migrated carnivore.it news feed here. Perhaps unsurprisingly I couldn’t help but run the data through my statistics scripts to get a quick idea of what was seen by the sensors.

This caused some immediate problems. Before the data was released Markus had contacted me to point out/complain that the performance of my script is less than ideal. Performance wasn’t an issue I had encountered, but the database from the sensor I run is ~1MB; the smaller of the released data sets is ~300MB, with the larger being 4.1GB. I immediately tried to rectify the problem and am proud to report…

I failed miserably. I had tried to move some of the counting and loops out of the Python code and into more complex SQL queries, working on the theory that working with large datasets should be more efficient within databases as they are designed for working with sets of data. The theory proved false, actually increasing run-time by about 20%, so I won’t be releasing the changes. Good job I’ve never claimed to be a developer. All this being said, the script still crunches through the raw data in 30 seconds and 3 minutes respectively.
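For anyone wondering what "moving the counting into SQL" looks like, below is a rough sketch of the two approaches side by side. It is not the actual statistics script: the single `submissions` table and its column names are hypothetical stand-ins (the real sensor database layout may well differ), and the rows are just the five most recent Berlin submissions loaded into an in-memory SQLite database.

```python
import sqlite3

# Hypothetical, simplified table for illustration only; not the real sensor schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (seen TEXT, source_ip TEXT, url TEXT, md5 TEXT)")
conn.executemany(
    "INSERT INTO submissions VALUES (?, ?, ?, ?)",
    [
        ("2009-12-07 11:13:55.930130", "10.48.60.253", "http://zonetech.info/61.exe", "ae8705a7b4bf8c13e5d8214d374e6c34"),
        ("2009-12-07 11:12:59.389940", "10.13.103.23", "ftp://1:1@10.101.229.251:61751/ssms.exe", "14a09a48ad23fe0ea5a180bee8cb750a"),
        ("2009-12-07 11:10:27.296370", "10.13.103.23", "tftp://10.13.103.23/ssms.exe", "df51e3310ef609e908a6b487a28ac068"),
        ("2009-12-07 10:55:24.607140", "10.183.36.128", "tftp://10.183.36.128/ssms.exe", "df51e3310ef609e908a6b487a28ac068"),
        ("2009-12-07 10:43:48.872170", "10.183.36.128", "ftp://1:1@10.20.216.112:53971/ssms.exe", "14a09a48ad23fe0ea5a180bee8cb750a"),
    ],
)

# Approach 1: let the database do the counting in a single aggregate query.
sql_total, sql_samples, sql_ips = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT md5), COUNT(DISTINCT source_ip) FROM submissions"
).fetchone()

# Approach 2: pull the rows back and count in Python with sets.
rows = conn.execute("SELECT source_ip, md5 FROM submissions").fetchall()
py_total = len(rows)
py_samples = len({md5 for _, md5 in rows})
py_ips = len({ip for ip, _ in rows})

print(sql_total, sql_samples, sql_ips)
print(py_total, py_samples, py_ips)
```

Both approaches give the same counts; which one is faster depends on the data, indexes and queries involved, which would fit with the SQL version not being the automatic win I had expected.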

Without further ado, the Berlin data-set:

Statistics engine written by Andrew Waite – www.infosanity.co.uk

Number of submissions: 2726
Number of unique samples: 133
Number of unique source IPs: 639

First sample seen: 2009-11-05 12:02:48.104760
Last sample seen: 2009-12-07 11:13:55.930130
SystemrRunning: 31 days, 23:11:07.825370
Average daily submissions: 87.935483871

Most recent submissions:
2009-12-07 11:13:55.930130, 10.48.60.253, http://zonetech.info/61.exe, ae8705a7b4bf8c13e5d8214d374e6c34
2009-12-07 11:12:59.389940, 10.13.103.23, ftp://1:1@10.101.229.251:61751/ssms.exe, 14a09a48ad23fe0ea5a180bee8cb750a
2009-12-07 11:10:27.296370, 10.13.103.23, tftp://10.13.103.23/ssms.exe, df51e3310ef609e908a6b487a28ac068
2009-12-07 10:55:24.607140, 10.183.36.128, tftp://10.183.36.128/ssms.exe, df51e3310ef609e908a6b487a28ac068
2009-12-07 10:43:48.872170, 10.183.36.128, ftp://1:1@10.20.216.112:53971/ssms.exe, 14a09a48ad23fe0ea5a180bee8cb750a
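As a sanity check on the summary figures, the run time and daily average above can be reproduced from the first and last timestamps. The sketch below is not the statistics script itself; it assumes the average is total submissions divided by the number of whole days running, which is inferred from the numbers (2726 / 31) and not from the script's source.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Values copied from the Berlin output above.
first_seen = datetime.strptime("2009-11-05 12:02:48.104760", FMT)
last_seen = datetime.strptime("2009-12-07 11:13:55.930130", FMT)
submissions = 2726

# Subtracting two datetimes gives a timedelta; .days is the whole-day part only.
running = last_seen - first_seen
average_daily = submissions / running.days

print("Running:", running)
print("Average daily submissions:", average_daily)
```

Because only whole days are counted, the partial final day is ignored, so the reported average is slightly higher than it would be if the full elapsed time were used.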

And Paris:

Statistics engine written by Andrew Waite – www.infosanity.co.uk

Number of submissions: 749518
Number of unique samples: 2064
Number of unique source IPs: 30808

First sample seen: 2009-11-30 03:10:24.591650
Last sample seen: 2009-12-07 08:46:23.657530
SystemrRunning: 7 days, 5:35:59.065880
Average daily submissions: 107074.0

Most recent submissions:
2009-12-07 08:46:23.657530, 10.46.210.146, http://10.9.0.30:3682/udqk, d45895e3980c96b077cb4ed8dc163db8
2009-12-07 08:46:20.985190, 10.98.174.44, http://10.200.78.235:2708/lzhffhai, 94e689d7d6bc7c769d09a59066727497
2009-12-07 08:46:21.000540, 10.204.219.219, http://10.38.56.49:6968/tyhxqm, 908f7f11efb709acac525c03839dc9e5
2009-12-07 08:46:18.398500, 10.174.62.175, http://10.108.210.203:3058/pghux, ed12bcac6439a640056b4795d22608da
2009-12-07 08:46:15.753080, 10.39.96.46, http://10.132.244.66:3255/dhti, 94e689d7d6bc7c769d09a59066727497

Still need to dig further into the data; there’ll be another post in the making if I uncover anything interesting…

– Andrew Waite

Categories: Dionaea, honeypot, malware

Expert speaker session at Northumbria University

Last week I had the pleasure of being asked to speak at Northumbria University, presenting to students of the Computer Forensics and Ethical Hacking for Computer Security programmes. As I graduated from Northumbria a few years ago it was interesting to come back to see some familiar faces and have a look at how the facilities had developed.

Despite the nerves of having to speak in front of a crowd I really enjoyed the event, especially as the other speakers were excellent and I enjoyed their sessions. The event kicked off with Dave Kennedy, a soon-to-retire member of Durham Police’s computer crime unit. Dave talked about his personal experience with a couple of high profile cases, explaining some of the groundwork and behind the scenes activity that isn’t known to the general public. I found the information interesting but also disturbing; given the nature of the material that is handled by Dave and his department I can safely state that I wouldn’t want to have much experience in the area.

Next up was Phil Byrne, an internal auditor for HM Revenue and Customs (HMRC). For those that don’t know, HMRC were/are at the centre of one of the UK’s largest data loss stories in 2007 after CDs containing approximately 25 million child benefit records were sent, unencrypted, by standard post and did not reach their intended destination (some backstory here). Phil talked openly about the incident, discussing both the incident itself and the changes made in response. One of Phil’s comments has stayed with me (if I’m mis-quoting someone let me know):

If you put people into the process, something will go wrong at some time

Third to the stand was Gary Witts, owner of a managed services company specialising in on-line backups. The talk was very in-depth and had some interesting content, but from my perspective it felt more like a sales pitch than a technical discussion of secure backup’s place within a security posture.

I took the fourth and final slot of the day, which left me with the unenviable position of being between around 100 students and the pub, which didn’t help my usual rapid-fire presentation style. My presentation took a different focus from the previous sessions, discussing some of the real-world security incidents that can regularly be encountered, and some advice on handling the incidents in question. I also discussed my findings from honeypot systems, introducing a less common method for monitoring an environment for malicious activity. Assuming the feedback I’ve received is genuine the presentation seems to have been well received.

From a student’s perspective: Tom was in the audience and has been writing up his take on the event in a series of blog postings. Tom also recorded the talks; for anyone interested, a direct link to my session is available here.

Andrew Waite

Article Review: Carving malware from memory

I’ve recently had the pleasure of talking with Leon van der Eijk, which resulted in me getting the opportunity to review an article he had been working on. The focus of the article is to identify and collect malware samples from running processes within volatile memory. Given my predilection for malware collection and analysis Leon correctly guessed that I would enjoy the article, which does a great job of describing a method for collecting and analysing malware (and other files and processes) from RAM on a live Windows system.

Leon’s method utilises Meterpreter’s memdump.rb script to collect a snapshot of an infected system’s memory, then utilises Foremost to carve up the collected memory image into individual files which can then be analysed as normal. As the article has just been published today I won’t try to improve on the work already done, but I would suggest giving it a read here.

My own forensics skills aren’t yet up to the level that I would like, but I was able to replicate Leon’s process relatively easily within my own lab environment, and without too many problems. This, along with my experience at Northumbria University last week (more later), has re-ignited my interest in improving my forensic skills, and has proved to me that some of the basic skills and techniques involved with the forensic process aren’t all black magic.

The article is definitely worth a read if you have an interest in computer forensics and/or malware analysis. In case you missed it above, link to article: Carving malware from live memory. Keep up the good work Leon.

Andrew Waite

Starting with Dionaea

As my previous post states, my Nepenthes system has been retired. In its place I’m building up a Dionaea system. The new features proposed by Dionaea should go a long way to improving on a couple of Nepenthes’ shortcomings; a good comparison of the two systems can be found on the Nepenthes blog (post October 27th). But what really caught my attention was the recent post on November 6th detailing the improved logging capabilities that are going to be built into Dionaea. I intend to cover these features at a later date once I’ve had more time to get used to the new system.

I must admit that I was shocked by the ease of installation and compilation. The instructions on Dionaea’s home page look a bit long-winded to me, especially as I’m used to the ease of ‘apt-get’, and past experience with manual compilation of source code always leaves me expecting a headache. This was doubled when I discovered my available hardware is starting to show signs of its age and was unable to successfully complete a fresh install of the latest Ubuntu, resulting in some of my components not quite meeting the written requirements. Somehow, though, I managed to muddle through the compilation instructions without issue, and now have a working Dionaea install.

Getting the system started was also a breeze: a one-line command as prescribed in the documentation and the system is live. Unsurprisingly it didn’t take long to get my first hits, retrieving my first binary within 40 minutes of first starting the system. As I restarted several times whilst playing with config settings it could be that I missed a compromise that would have shortened this time frame in the real world.

So far I have only made a couple of changes to the config: replacing the dev’s email with my own to receive sandbox reports for collected binary samples (thanks for pointing that out in the mailing lists, probably would have missed it) and enabling the ihandler for p0f to try and take advantage of the system’s included fingerprinting capabilities.

As I’ve always liked statistics from honeypot systems, here is what I’ve got so far:

  • Running approximately 4 hours
  • Logged 20 unique attacks
  • Retrieved 4 unique malware binaries (and received the third party sandbox reports)
  • Generated 10,000+ log entries

Finally, thanks to the dev team for continuing to build and improve systems that I love to use. Couldn’t do half of what I do without quality systems to work with.

Andrew Waite

Categories: Dionaea, honeypot, malware

Last Nepenthes Statistics

2009/11/09 Andrew Waite

Following on from the move from Nepenthes to Dionaea, I’m decommissioning my Nepenthes server to start afresh with Dionaea. As such I thought I’d share the final statistics using InfoSanity’s statistic script for Nepenthes.

Statistics engine written by Andrew Waite – www.InfoSanity.co.uk

Number of submissions: 4189
Number of unique samples: 1189
Number of unique source IPs: 2024

First sample seen on 2008-05-09
Last sample seen on 2009-10-31
Days running: 540
Average daily submissions: 7

Andrew Waite

Nepenthes is Dead, Long live Dionaea

As regular readers will know (do I have any of those?) I’ve been running a Nepenthes honeypot for a while. Current statistics show that the server ran for 540 days and was ‘exploited’ 4189 times, collecting 1189 unique samples (based on MD5 hash) from 2024 source IP addresses.
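"Unique based on MD5 hash" simply means that two submissions count as the same sample when their contents hash to the same value, regardless of where they came from or what they were called. A minimal Python sketch of the idea follows; the byte strings are made-up stand-ins for downloaded binaries, and this is not how my statistics script is written.

```python
import hashlib


def md5_of(sample: bytes) -> str:
    """Return the hex MD5 digest used to identify a sample."""
    return hashlib.md5(sample).hexdigest()


# Hypothetical stand-ins for five captured binaries, some of them repeats.
submissions = [b"sample-A", b"sample-B", b"sample-A", b"sample-C", b"sample-B"]

# A set keeps one entry per distinct digest, so its size is the unique sample count.
unique_hashes = {md5_of(sample) for sample in submissions}

print(len(submissions), "submissions,", len(unique_hashes), "unique samples")
```

The flip side is that a single changed byte produces a completely different MD5, so closely related binaries are counted as separate samples.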

The latest post (dated October 27th 2009) on the Nepenthes site indicates that development on Nepenthes is coming to a close, stating 7 reasons preventing newer features being implemented with Nepenthes. As a result I’m stopping development on my statistics scripts for parsing Nepenthes’ log files. The good news is that work on Nepenthes’ spiritual successor is well underway, in the form of Dionaea.

I’m hopefully going to get a Dionaea box up and running in the near future to continue where I’ve left off with Nepenthes. Watch this space…

Andrew Waite

Automated Malware & ESXi frustrations

I recently read Christian Wojner’s excellent paper on Mass Malware Analysis and it re-ignited my desire to build an automated environment to improve and speed up my current malware analysis capabilities. The paper details a step-by-step process for duplicating Wojner’s environment, but as I don’t have any spare equipment I’ve been looking for alternative routes.

Fortunately the paper also explains the theory, thought process and design of the system so that the reader can modify it to suit their own requirements. To achieve this I’ve been trying to replace the Xubuntu and VirtualBox host with my existing ESXi environment detailed in previous posts.

With a bit of Googling the vSphere CLI became the obvious choice to replace the control component for the infected machine in the automated malware environment. vmware-cmd.pl provides the functionality to both stop/start virtual guests and to revert the guest to previous snapshots, exactly what is needed for the malware analysis environment. The commands to be utilised would be:

vmware-cmd.pl --server <ESXi Host> --username <user> --password <pass> /path/to/guest.vmx getstate

vmware-cmd.pl --server <ESXi Host> --username <user> --password <pass> /path/to/guest.vmx start

vmware-cmd.pl --server <ESXi Host> --username <user> --password <pass> /path/to/guest.vmx stop

vmware-cmd.pl --server <ESXi Host> --username <user> --password <pass> /path/to/guest.vmx revertsnapshot
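If the control scripts were driven from Python, the four commands could be wrapped along the lines of the sketch below. This is only an illustration under assumptions: the function names are my own invention, the argument order simply mirrors the commands listed above, and nothing here has been run against a real ESXi host.

```python
import subprocess

# The four operations listed above.
VALID_OPERATIONS = {"getstate", "start", "stop", "revertsnapshot"}


def build_vmware_cmd(host, user, password, vmx_path, operation):
    """Build the vmware-cmd.pl argument list for one guest operation."""
    if operation not in VALID_OPERATIONS:
        raise ValueError("unsupported operation: %s" % operation)
    return [
        "vmware-cmd.pl",
        "--server", host,
        "--username", user,
        "--password", password,
        vmx_path,
        operation,
    ]


def run_vmware_cmd(host, user, password, vmx_path, operation):
    """Run the command and return its output (needs the vSphere CLI installed)."""
    cmd = build_vmware_cmd(host, user, password, vmx_path, operation)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Building the list does not touch the network, so it is safe to inspect.
cmd = build_vmware_cmd("esxi.example", "user", "pass", "/path/to/guest.vmx", "getstate")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths or passwords that contain spaces.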

This should have been enough to adapt Wojner’s control scripts to use ESXi instead of VirtualBox, but it appears that for the first time I’ve encountered a feature that is crippled in VMware’s free offering. Running the stop/start/revert commands results in the exception below:

Fault:
SOAP Fault:
-----------
Fault string: fault.RestrictedVersion.summary
Fault detail: RestrictedVersionFault

So that’s that. Unless I happen to win the lottery (which I don’t play) or someone is able and willing to provide a full ESX license to a struggling researcher (which I don’t expect to happen), I’m back to looking for a replacement for Wojner’s VirtualBox control process. On with the next…

Andrew Waite

AV killing with powershell

A colleague recently introduced me to scripting with Powershell. After seeing a couple of examples of its strength for handling legitimate administration tasks my devious side came into play and I started imagining havoc in my head.

As a starting project for getting to grips with Powershell basics I thought I’d try a proof of concept to replicate Meterpreter’s ability to disable AV and other defence mechanisms within the getcountermeasure function. I love Meterpreter, but sometimes you need to work with more primitive native tools; as Powershell is starting to be included by default within Windows systems it is now one of the ‘primitive’ tools. My theory was that this should give me a bit of a challenge, without jumping in at the deep end.

Well, I was wrong. I guess it shows the strength of Powershell that this proved not to be a challenge at all. The code below reads a list of unwanted processes from a text file, and kills the processes. All in four lines of code (I’m told this could be shortened at the expense of readability).

# read list of AV process names to kill (one name per line)
$avprocs = Get-Content AVprocs.txt

# kill all unwanted processes by name
foreach ($procname in $avprocs)
{
    Stop-Process -Name $procname
}
#simples…..

The next time you pop a Windows box don’t despair, there’s more power available than just batch scripts :D

Andrew Waite

P.S. Before anyone shouts about aiding skiddies, the above code could have some great legitimate uses as well; from automatically cleaning up infected systems to aiding productivity by adding doom.exe to the list of processes ;)

The possibilities are endless, both good and bad.