Archive for the ‘Python’ Category

Python Whois class

2009/09/17 1 comment

After too long away from the project I have been trying to implement some additional functionality to my submissions2stats script for parsing Nepenthes log files. Something that I’ve had in mind for a while is utilising Whois data to better analyse the source of the malware submissions.

I had assumed that this would be relatively simple, after all the ability to port any required functionality is an integral part of geek humour. This wasn’t to be the case this time as I was unable to find anything this time around (although I didn’t discover giskismet until after I’d wrote my kistmet2gmapstatic scripts). To cover the functionality I have written a short python class that queries a 3rd party whois service for a provided IP address and provides metods to access the returned data.

The script can be accessed here. Hopefully others will find this of some use. Example output from the script’s .out() method targetting

Whois information for
Origin:           AS2818
Inetnum: –
Netname:      UK-BBC-991005
descr:              BBC
Country:        GB

N.B. Text is tab delimeted in actual usage

I’ve started adding the class’ functionality into my submissions2stats script. So far things are progressing well and hopefully I should be able to have an updated script available shortly.

Andrew Waite

Categories: Python

kismet2gmapstatic: Updated versions

I’ve spent the day adding some additional functionality to my GPS mapping proof of concept (original here).

The second release,, changes the scripts output to wrap the Google maps API call in a self contained HTML page, and contains multiple map images to mitigate the URL length limit.

The third release,, builds on the HTML framework and includes additional information on each mapped access point: SSID, channel and available encryption options.This will likely be the final release of kismet2gmapstatic in this form, the code has grown organically without any real planning and as a result is a hideous mess, but as a PoC I feel it has served it’s purpose. I still have several ideas and additional functionality that I would like to implement, so watch this space for similar tools in the future.

Andrew Waite

Categories: GPS, Python, Tool-Kit, Wireless

kismet2gmapstatic (PoC)

I’m still following my recent interest in wireless networks and devices. In the past month I gained a USB gps reciever (which I forgot to write about, may have a short review shortly). After adding gps capability to my wardrive setup I proceed to scan the local area, then hit a brick wall. There appears (could be my google-fu is failing me) a lack of available tools to meaningfully use the captured data.

After a lot of digging a stumbled upon this script, designed to parse the output from Kismet and generate a .kml file to import into Google Earth. Unfortunately, I’ve been unable to get this to work as Google Earth complains when opening the file. Could be a version issue so your mileage may vary, if anyone does get it working please let me know.

The PerryGeo code did however provide an excellent foundation to utilise the Kismet log file and generate different output. To this end I have released a basic proof of concept script that generates a static map via the Google Maps API. If you want to do anything similar, or want to extend or modify my image code I found the Google documentation to be invaluable.

To the tool itself, starting a disclaimer:

This tool should not be used for illegal or malicious purposes. It was created to visualise network locations and implemented encryption technologies, in an effort to enhance previous analysis of wireless network statistics.

For each discovered access point, the script places a marker on the map, colour coded to level of encryption: Open access points are green, WEP encrypted access points are yellow, whilst WPA encrypted APs are red.

The Google maps API appears to have a limit to the length of URL that it is able to support, as a result the script limits the plotted APs to the first 50 in a given Kismet xml log file. This should be sufficient for site surveys, but is less useful for mapping the results from a wardrive trip. I haven’t manage to locate any firm documentation on this limit, if anyone is able to shed any light or knows a workaround I’d appreciate a heads up.

Below is an example of the tools output (actually, it just outputs the URL, which in turn requests google create the image). The image is created from a subset of data collected during a drive around the Angel of the North.

This is still very early days for the tool (started coding 24hours ago) so any feedback, issues or feature requests would be appreciated. Download available here:

Andrew Waite

Categories: GPS, Python, Tool-Kit, Wireless

Several days of playing working with the raw data and a couple of intermediate scripts (csv & mysql) have paid off. I’m now ready to release the first version of Infosanity‘s Nepenthes log parser.

This utility is substantially larger than my previous two releases (although still small) so I’ll not include source code here, head to Infosanity for the file. Usage is fairly simple, read logged_submissions file into stdin and let the script do it’s job.

Statistics are quite general at this stage, mainly compiling overall statistics from the log file including:

  • Total number of submissions
  • Number of unique malware samples (based on MD5 hashes)
  • Number of unique source IPs
  • Run time
  • Average daily submissions
  • Five most recent submissions

By default the script outputs plaintext to standard out, but this can be changed to HTML via the –output=html commandline flags.

I’m going to hold back releasing any example output from my own servers as I wanted to generate the statistics for use in an upcoming presentation I’m giving for local group Super Mondays. If you’re free and in the area (Newcastle, UK) on May 26th please stop by for the event and to say hi.

If you’re running a Nepenthes server I’d appreciate any feedback or issues running the script. I’m still looking to flesh the system’s capabilities out, so any suggestions/requests for additional features or statistics would be appreciated (contact(no-spam)[at]infosanity[dot]co[dot]uk ).

Andrew Waite

N.B. The latest versions of all Infosanity tools related to statistic generation for Nepenthes can be found here.

Categories: Nepenthes, Python

Utility script in a similar vein to, the script reads Nepenthes’ logged_submissions file from stdin and dumps the information into a MySQL database table.

Initially this serves the same purpose as it’s CSV counterpart, importing the date into system with powerful search and filter functionality. However this may be useful if wanting to work with the data in more complex tools as SQL databases form powerful backends and can be manipulated easily with almost programming language.

(again, apologises for formatting. I’m working on a resource repository for code and tools, hopefully available soon)

UPDATE: Code available from InfoSanity

import sys
import MySQLdb

# Reads Nepenthes logged_submissions file and inserts data to mysql table

#connect to database
db = MySQLdb.connect( host="localhost", user="neplog", passwd="neplog123", db="nepenthes")

#create cursor
cursor = db.cursor()

#read from stdin
while 1:
      line = sys.stdin.readline()
      if not line:

      logData = line.split(' ');

      timestamp = logData[0].strip('[]')
      date = timestamp.split('T')[0]
      time = timestamp.split('T')[1]
      sourceIP = logData[1]
      sourceMalware = logData[4]
      malwareMD5 = logData[5]

      #Insert row
      cursor.execute("insert into submissions values (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")" %( date, time, sourceIP, sourceMalware, malwareMD5) )

Database creation (I’m sure this can be improved, but it works):

CREATE TABLE `submissions` (
`logdate` date default NULL,
`logtime` time default NULL,
`ip` char(15) default NULL,
`url` varchar(64) default NULL,
`MD5` char(32) default NULL

Andrew Waite

Categories: InfoSec, Nepenthes, Python

2009/05/02 Comments off

Whenever I’m analysing large amounts of data I prefer to start the analysis within a spreadsheet as I find the built in capabilities invaluable for some quick and dirty data diving. This typically allows for a good overall understanding of the data set and available statistics without spending time coding before the required statistics are fully understood. A prime example of this was the data I analysed from wireless connections. In this scenario the existing tools are very helpful, airodump-ng’s standard output format is csv, making importing the captured data to a spreadsheet straight forward.

Unfortunately, the same ease of data transfer is not available when working with the logs generated by Nepenthes. To aid this I’ve coded a small python script to read the logged_submissions log file and output the interesting data in .csv format. Admittedly the script is nothing special and can likely be improved on as my coding skills are a bit rusty, but this may be useful to others, or provide a starting point in similar situations.

(n.b. apologises for the rendering, I’m working on it. In meantime cut&paste is the quick and dirty way to view all code.)

UPDATE: Code downloadable from InfoSanity

#!/usr/bin/pythonimport sys

## Reads Nepenthes logged_submissions file and outputs data as comma-seperated value## Typical usage:#   cat logged_submissions | > outputfile.csv## Author: Andrew Waite (aka RoleReversal)#

#write 'headers to stdout'sys.stdout.write("Date,Time,Source IP Address,Malware Source,Malware MD5\n")

#read from stdinwhile 1: line = sys.stdin.readline() if not line:         break

 logData = line.split(' ');

 timestamp = logData[0].strip('[]') date = timestamp.split('T')[0] time = timestamp.split('T')[1] sourceIP = logData[1] sourceMalware = logData[4] malwareMD5 = logData[5]

 out = "%s,%s,%s,%s,%s" %(date, time, sourceIP, sourceMalware, malwareMD5) sys.stdout.write(out)

Hopefully some will find this useful. More nepenthes statistics to come.

Andrew Waite

P.S. thanks to and O’Reilly for providing good on-line references used in the coding of this tool.

Categories: Malware, Nepenthes, Python