Whenever I’m analysing large amounts of data I prefer to start within a spreadsheet, as I find the built-in capabilities invaluable for some quick and dirty data diving. This typically gives a good overall understanding of the data set and the statistics available, without spending time coding before the required statistics are fully understood. A prime example of this was the data I analysed from wireless connections. In that scenario the existing tools are very helpful: airodump-ng’s standard output format is CSV, making it straightforward to import the captured data into a spreadsheet.
Unfortunately, the same ease of data transfer is not available when working with the logs generated by Nepenthes. To help with this I’ve written a small Python script that reads the logged_submissions log file and outputs the interesting data in CSV format. Admittedly the script is nothing special and can likely be improved on, as my coding skills are a bit rusty, but it may be useful to others or provide a starting point in similar situations.
(N.B. Apologies for the rendering, I’m working on it. In the meantime, cut and paste is the quick and dirty way to view all of the code.)
UPDATE: Code downloadable from InfoSanity
    #!/usr/bin/python
    import sys

    #
    # Reads Nepenthes logged_submissions file and outputs data as comma-separated values
    #
    # Typical usage:
    #   cat logged_submissions | submissions2csv.py > outputfile.csv
    #
    # Author: Andrew Waite (aka RoleReversal)
    # http://www.infosanity.co.uk
    #

    # write headers to stdout
    sys.stdout.write("Date,Time,Source IP Address,Malware Source,Malware MD5\n")

    # read from stdin
    while 1:
        line = sys.stdin.readline()
        if not line:
            break
        logData = line.split(' ')
        # field positions are an assumption, based on a log line of the form:
        #   [dateTtime] sourceIP -> destIP malwareSource malwareMD5
        timestamp = logData[0].strip('[]')
        date = timestamp.split('T')[0]
        time = timestamp.split('T')[1]
        sourceIP = logData[1]
        sourceMalware = logData[4]
        # the last field keeps the line's trailing newline, so none is added to 'out'
        malwareMD5 = logData[5]
        out = "%s,%s,%s,%s,%s" % (date, time, sourceIP, sourceMalware, malwareMD5)
        sys.stdout.write(out)
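One possible improvement, offered as a rough sketch rather than a replacement for the script: Python's standard csv module handles quoting, so a malware source URL that happens to contain a comma would not break the columns. The function names parse_submission and submissions_to_csv are my own for illustration. The sketch assumes each log line has the form "[dateTtime] sourceIP -> destIP malwareSource malwareMD5", and the sample line is made up (documentation IP range, placeholder hash), not real captured data.

```python
import csv
import io


def parse_submission(line):
    """Split one log line into [date, time, source IP, malware source, MD5].

    Assumed layout: [dateTtime] sourceIP -> destIP malwareSource malwareMD5
    Returns None for lines with too few fields (e.g. blank lines).
    """
    fields = line.split()
    if len(fields) < 6:
        return None
    timestamp = fields[0].strip('[]')
    # partition() copes with a timestamp that has no 'T' (time is then empty)
    date, _, time = timestamp.partition('T')
    return [date, time, fields[1], fields[4], fields[5]]


def submissions_to_csv(lines, out):
    """Write a header row plus one CSV row per parsable log line to 'out'."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(["Date", "Time", "Source IP Address",
                     "Malware Source", "Malware MD5"])
    for line in lines:
        row = parse_submission(line)
        if row is not None:
            writer.writerow(row)


# Illustrative, made-up log line; not taken from a real Nepenthes log
sample = ("[2009-01-20T12:34:56] 192.0.2.10 -> 192.0.2.20 "
          "tftp://192.0.2.10/a,b.exe d41d8cd98f00b204e9800998ecf8427e\n")
buffer = io.StringIO()
submissions_to_csv([sample], buffer)
print(buffer.getvalue(), end="")
```

Calling submissions_to_csv(sys.stdin, sys.stdout) (after importing sys) should keep the same pipe-based usage as the original script.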