mimic-nepstats_v1-1.py

Around a month ago Miguel Jacq got in contact to let me know about a couple of errors he encountered when running InfoSanity’s mimic-nepstats.py with a small data set. Basically if your log file did not include any submissions, or was for a period shorter than 24hours the script would crash out, not the biggest problem as most will be working with larger data sets but annoying non the less.

Amun statistics

Amun has been running away quite happily in my lab since initial install. From a statistic perspective my wor has been made really easy as Miguel Cabrerizo has previously taken one of the InfoSanity statistic scripts written for Nepenthes and Dionaea and adapted it to parse Amun’s submission.log files. If you’re wanting to get an overview of submissions from another Amun sensor the script has been uploaded alongside the other InfoSanity resources and is available here.

Starting with Amun

No single technology can do or handle every situation; the same holds true with honeypot sensors which is why I’m always interested in finding new systems to add to my environment. I’d had Amun on my list of potentials for a while, but after reading a short blog post that suggested install and setup was relatively quick and painless, it got moved up the to-do list.

Determining connection source from honeyd.log – cymruwhois version

InfoSanity’s honeyd-geoip.py script has been useful for analysing the initial findings from a HoneyD installation, but one of weaknesses identified in the geolocation database used by the script was that a large proportion of the source IP addresses connecting to the honeypot environment weren’t none within the database. Markus pointed me in the direction of the cymruwhois (discussed previously)python module as an alternative. I’ve re-written the initial script.

24hrs of HoneyD logs

After an initial setup and configuration of HoneyD I took a snapshot of the honeyd.log file after running for a 24hr period. Running honeydsum against the log file generated some good overview information. There were over 12000 connections made to the emulated network, averaging one connection every 7 seconds. Despite the volume of connections, each source generally only initiated a handful of connections.

Determining connection source from honeyd.log

After getting a working HoneyD environment I wanted to better dig into the information provided by the system. First up was a quick script to get a feel for where the attacks/connections originate from. At first glance I really like the log format that is used by honeyd.log, it is nice an easy to parse. From this I quickly knocked up a python script to parse the honeyd.log file, collect a list of unique source addresses and finally use GeoIP to determine (and count) the county of origin.

Basic HoneyD configuration

After first getting HoneyD up and running previously for a proof of concept I’ve begun a wider implementation of HoneyD to function as the backbone for an upgraded research environment.
HoneyD’s key strength is it’s flexibility, HoneyD’s website contains some sample configuration files that show HoneyD emulating multiple systems running different OSes and applications, a large multi-site network and even a config file to create a honeypot environment for a wireless network. I’ve found these samples immensely useful references for developing custom templates for my own implementation.