GSoC 2010 Accepted Projects

Please note that GSoC 2010 has now successfully completed. This content is being retained for reference only.

Cuckoo: A uniform sandbox/sandnet with data collection capabilities
Malware is the raw material of many cybercrime-related activities. Cuckoo is a lightweight solution that performs automated dynamic analysis of provided Windows binaries. It is able to return comprehensive reports on key API calls and network activity. Cuckoo is still under development; future goals are to extend the set of monitored APIs and to implement database reporting.

Student: Claudio Guarnieri
Mentor: Felix Leder

Status: 2/5/2011 - Beta version publicly released at

LogAnon: Log anonymization library
LogAnon is a log anonymization library that provides a simple cross-platform API written in C with a Python binding. LogAnon provides sensible default behavior and keeps anonymized values consistent between logs and network captures. This project was approached from two distinct angles. Gabriel Cavalcante focused on the study and implementation of anonymization methods and created some examples of libwireshark use for the challenging task of dissecting packets in pcap files. Guillaume Touron worked on the development of a command line tool that uses the library to anonymize IP addresses.
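
To illustrate what consistency between logs and captures can mean in practice, here is a minimal Python sketch of deterministic IP pseudonymization. It is not LogAnon's API: the function names, the key, and the keyed-hash approach are all assumptions chosen for illustration. The same input address always maps to the same output address, so a host can still be correlated across files without revealing its real address. The mapping is not prefix-preserving.

```python
import hashlib
import hmac
import ipaddress
import re

# Hypothetical key for illustration; a real deployment would load a secret.
SECRET_KEY = b"example-key"

# Loose match for dotted quads; validity is checked by ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize_ip(ip, key=SECRET_KEY):
    """Map an IPv4 address to a pseudonymous IPv4 address, deterministically."""
    packed = ipaddress.IPv4Address(ip).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    # The first four digest bytes become the replacement address.
    return str(ipaddress.IPv4Address(digest[:4]))


def anonymize_line(line, key=SECRET_KEY):
    """Replace every valid IPv4 address in a log line with its pseudonym."""
    def replace(match):
        try:
            return anonymize_ip(match.group(0), key)
        except ipaddress.AddressValueError:
            # Not a valid address (for example 999.1.1.1): leave it untouched.
            return match.group(0)
    return IPV4_RE.sub(replace, line)
```

Because the mapping depends only on the key and the address, running the same function over a text log and over addresses extracted from a pcap file yields matching pseudonyms.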

LogAnon project repository:

Student: Gabriel Dieterich Cavalcante & Guillaume Touron
Mentor: Sebastien Tricaud

Status: 1/16/2011 - reached out to Sebastien
1/22/2011 - no response; broadened audience to Gabriel and Guillaume to get an update.
1/31/2011 - The code appears to be more of a proof of concept. Log anonymization is still very important to us, and I expect that we will either expand this proof of concept or spin up a new project around it.

Hale: Botnet C&C monitor
Hale was developed to offer a powerful open source botnet command and control (C&C) monitor, since most existing tools today are modified clients. The main objective is to allow a network of monitors to communicate with each other. Logs and captured files are accessible via a web interface that offers an overview of tracked botnets.

For more information see: Hale project repository:

Student: Patrik Lantz
Mentor: Angelo Dell'Aera

Status: 1/16/2011 - currently in beta testing
1/22/2011 - some issues around CPU usage were identified; need to be fixed prior to releasing to the public
1/22/2011 - CPU issues were fixed.
2/19/2011 - Additional issues were identified, which Patrik is currently fixing.

VoIP module for Dionaea
This project included development of a VoIP module for the honeypot Dionaea. The VoIP protocol used is SIP since it is the current de facto standard for VoIP. In contrast to some other VoIP honeypots, this module doesn't connect to an external VoIP registrar/server. It simply waits for incoming SIP messages (e.g., OPTIONS or even INVITE), logs all data as honeypot incidents and/or binary data dumps (RTP traffic), and reacts accordingly, for example, by creating a SIP session including an RTP audio channel. As sophisticated exploits within the SIP payload are currently rare, the honeypot module doesn't pass any code to Dionaea's code emulation engine. This is a potential area for future investigation if such malicious messages are detected.
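
The wait, log, and react cycle described above can be sketched in a few lines of Python. This is not the Dionaea module's code: the function names and the incident dictionary are hypothetical, only OPTIONS is answered, and details a full SIP stack needs (adding a To tag, header folding, compact header names, INVITE and RTP handling) are deliberately left out.

```python
# Headers that a SIP response copies from the request.
ECHOED_HEADERS = {"via", "from", "to", "call-id", "cseq"}


def parse_sip_request(raw):
    """Split a raw SIP request into method, URI, header list, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Raises ValueError if the request line does not have three parts.
    method, uri, _version = lines[0].split(" ", 2)
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return method, uri, headers, body


def build_ok_response(headers):
    """Build a minimal 200 OK that echoes the dialog-identifying headers."""
    out = ["SIP/2.0 200 OK"]
    for name, value in headers:
        if name.lower() in ECHOED_HEADERS:
            out.append(f"{name}: {value}")
    out.append("Allow: INVITE, ACK, CANCEL, OPTIONS, BYE")
    out.append("Content-Length: 0")
    return "\r\n".join(out) + "\r\n\r\n"


def handle_message(raw, incidents):
    """Log every message as an incident; answer OPTIONS, ignore the rest."""
    try:
        method, uri, headers, _body = parse_sip_request(raw)
    except ValueError:
        # Unparseable input is still worth recording on a honeypot.
        incidents.append({"method": None, "raw": raw})
        return None
    incidents.append({"method": method, "uri": uri, "headers": headers})
    if method == "OPTIONS":
        return build_ok_response(headers)
    return None
```

Every message is recorded before any reply is considered, so the incident log stays complete even for traffic the sketch does not answer.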

For more information see: Dionaea project page: Dionaea

Student: Tobias Wulff
Mentor: Markus Koetter

Status: 1/8/2011 - the VoIP module has been created and it works well with Dionaea. However, there are still some bugs that prevent it from capturing real-world VoIP attacks. We are currently trying to find volunteers who can bring this project to completion.
1/16/2011 - proposed to the project owners to reach out to membership or schedule a session during the upcoming HP workshop to finish up the tool

TraceXploit is a tool used for exploit recurrence which can extract the general structure of an exploit from captured network binary data, then regenerate a new exploit for another attack against the same vulnerability.

Student: Zhongjie Wang & Yongchuan Koh
Primary Mentor: Jianwei Zhuge

Status: 1/8/2011 - TraceXploit is currently in a POC state and not ready for public release. Eugene Teo, however, is picking up where Zhongjie and Yongchuan have left off, and we hope that he will bring the tool to a state in which it can be publicly released.

HV-Sebek: Hardware virtualization for Sebek
During the project, the malware analysis virtual machine monitor (MAVMM) was ported to the Intel platform so it can be used to host the Sebek high interaction honeypot. The goal is to achieve a stealthier analysis platform through the use of hardware virtualization.

Student: Chengyu Song
Mentor: Thanh Nguyen

Status: 2/3/2011 - Unfortunately, there is no usable version of HV-Sebek. Hardware virtualization caused many roadblocks during the project. The HP is still committed to research in this area and we hope to drive tools like HV-Sebek to a release.

PhoneyC: Malicious PDF capabilities
Malicious code within a PDF document is a serious Internet security threat, and its detection and analysis have become a hot research topic. This project discusses the following four types of attacks related to PDF and adds the capability to detect them to the low interaction client honeypot PhoneyC (modified from jsunpack).

  1. JavaScript API in PDF: These attacks are detected through parsing, extracting JavaScript, some context simulation in the JavaScript interpreter, and vulnerability simulation.
  2. ActiveX API in the HTML script: These attacks are detected by adding corresponding vulnerability simulations to PhoneyC.
  3. Malformed URL attacks: These attacks are detected using regular expression filters.
  4. PDF parsing attacks: These are detected by checking certain sensitive bytes of the PDF to determine whether the file is malformed.
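
As a rough illustration of the byte-level checks in item 4 and the first extraction step in item 1, the following Python sketch inspects a PDF for its header, its end-of-file marker, and references to JavaScript. These specific checks and the function name are assumptions made for illustration, not the checks PhoneyC implements, and the sketch does not handle obfuscated names or compressed object streams.

```python
import re

# Matches the /JS and /JavaScript name objects in an uncompressed PDF body.
JS_NAME_RE = re.compile(rb"/(?:JavaScript|JS)\b")


def inspect_pdf(data):
    """Return a few coarse structural indicators for a PDF byte string."""
    return {
        # The PDF header is expected near the start of the file.
        "has_header": b"%PDF-" in data[:1024],
        # The end-of-file marker is expected near the end of the file.
        "has_eof_marker": b"%%EOF" in data[-1024:],
        # Number of references that suggest embedded JavaScript.
        "javascript_refs": len(JS_NAME_RE.findall(data)),
    }
```

A file that lacks the header or the end-of-file marker is a candidate for closer inspection, and a non-zero JavaScript count indicates that script extraction is worth attempting.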

During the project the research team scanned 328 malicious PDF samples, generating 293 sample alerts, which represents a significant improvement in the capabilities of PhoneyC. For more information, see the PhoneyC project page:

Student: Huilin Zhang
Mentor: Jose Nazario

Status: 2/3/2011 - Unfortunately, this project did not come to a successful completion. The current PhoneyC 0.1 release contains a minimal PDF detection feature. It aims at detecting JavaScript code embedded in a PDF file, decoding it (if needed), and storing it.

IMHoneypot: Low interaction instant messaging honeypot
IMHoneypot is a low interaction honeypot for different instant messaging protocols using libpurple. By emulating a brain-dead user, we are able to log incoming events to SQLite, analyze the collected messages (using services like monkeywrench), and download files from URLs to submit them to the Anubis PE sandbox.
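
The logging step can be pictured with a short Python sketch that stores incoming messages in SQLite and pulls out any URLs for later download. The table layout, function names, and URL pattern are hypothetical and are not taken from IMHoneypot itself; the libpurple integration and the sandbox submission are omitted.

```python
import re
import sqlite3

# Simple URL pattern for illustration; real messages may need stricter parsing.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def open_log(path=":memory:"):
    """Open (or create) the SQLite log with a hypothetical two-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, protocol TEXT, sender TEXT, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urls ("
        "message_id INTEGER REFERENCES messages(id), url TEXT)"
    )
    return conn


def log_message(conn, protocol, sender, body):
    """Store one incoming message and return the URLs found in its body."""
    cur = conn.execute(
        "INSERT INTO messages (protocol, sender, body) VALUES (?, ?, ?)",
        (protocol, sender, body),
    )
    urls = URL_RE.findall(body)
    conn.executemany(
        "INSERT INTO urls (message_id, url) VALUES (?, ?)",
        [(cur.lastrowid, url) for url in urls],
    )
    conn.commit()
    return urls
```

Keeping URLs in their own table makes it straightforward for a separate worker to fetch and submit them without re-parsing message bodies.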

Student: Lukas Rist
Mentor: Jamie Riden

Status: 1/8/2011 - Project repository and wiki:; currently in beta testing before tool will be officially released.
1/22/2011 - Volunteers are testing it and having compilation issues; contacted Lukas to look into it.
1/31/2011 - Christian asked the group how to proceed, considering the discussion on how to capture real-world attacks.
2/3/2011 - Lukas is capturing URLs with the existing tool. Christian is inquiring about what type of URLs those are.
2/7/2011 - Lukas would like to implement new ICQ protocol changes before releasing publicly.

Developing an anomaly detection engine for PhoneyC
The anomaly detection engine resulting from this project determines the conditional probability that a web page with specific JavaScript feature values is malicious or safe. The feature values to be learned are extracted from the JavaScript collected using the JS extractor in PhoneyC. A Bayesian classifier then uses these feature values to compute the conditional probability based on the Gaussian probability density function, and classifies a new web page as safe or malicious according to Bayes' theorem.
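
The classification step can be illustrated with a small Gaussian naive Bayes sketch in Python. It is written from the description above rather than from the project's code: the function names are hypothetical, and features are treated as independent numeric values. For each class, the model stores a prior plus a per-feature mean and variance, then applies Bayes' theorem in log space to obtain a posterior probability per class.

```python
import math


def fit(samples, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def posterior(model, features):
    """Return P(class | features) for every class via Bayes' theorem."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        likelihood = sum(
            log_gaussian(x, m, v) for x, m, v in zip(features, means, variances)
        )
        log_scores[label] = math.log(prior) + likelihood
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def classify(model, features):
    """Pick the class with the highest posterior probability."""
    probs = posterior(model, features)
    return max(probs, key=probs.get)
```

A feature vector here might hold counts such as the number of eval calls or the length of the longest string literal; those particular features are examples only and are not taken from the project.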

PhoneyC project page:

Student: Neha Jain
Mentor: Jose Nazario

Status: 2/3/2011 - Unfortunately, this project did not come to a successful completion.

dnsMole: Botnet related DNS traffic detection
The idea behind this project is that it is possible to detect potential botnet C&C servers and/or infected hosts by observing DNS traffic. The algorithms implemented in this tool are based on academic research and the following ideas:

  1. Anomaly detection for DNS Servers using frequent host selection.
  2. Botnet detection by monitoring group activities in DNS traffic.
  3. Extending black domain name list by using co-occurrence relation between DNS queries.

Since all of these methods depend on threshold parameters, changing those parameters affects detection performance. Currently, dnsMole can be used as a passive DNS sniffer and can analyze previously captured network traffic stored in pcap file format. It supports storing black/white lists in memory, which helps classify queries and provide more accurate results.
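
To make the role of the threshold and the black/white lists concrete, here is a simplified Python sketch of blacklist extension by co-occurrence. It is an assumption-laden illustration, not dnsMole's algorithm: the scoring rule (the share of a domain's clients that also queried a blacklisted domain), the function names, and the default threshold are all hypothetical.

```python
from collections import defaultdict


def score_domains(queries, blacklist, whitelist=frozenset()):
    """Score unlisted domains by overlap with clients of blacklisted domains.

    queries is an iterable of (client_ip, domain) pairs.
    """
    clients_by_domain = defaultdict(set)
    for client, domain in queries:
        clients_by_domain[domain.lower().rstrip(".")].add(client)

    # Clients that queried at least one known-bad domain.
    suspicious_clients = set()
    for domain in blacklist:
        suspicious_clients |= clients_by_domain.get(domain, set())

    scores = {}
    for domain, clients in clients_by_domain.items():
        if domain in blacklist or domain in whitelist:
            continue
        scores[domain] = len(clients & suspicious_clients) / len(clients)
    return scores


def extend_blacklist(queries, blacklist, whitelist=frozenset(), threshold=0.8):
    """Return unlisted domains whose score reaches the threshold."""
    scores = score_domains(queries, blacklist, whitelist)
    return {domain for domain, score in scores.items() if score >= threshold}
```

Lowering the threshold flags more domains at the cost of more false positives, which is the trade-off the paragraph above describes; the whitelist keeps popular benign domains from being flagged merely because infected hosts also visit them.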

For more information see: dnsMole project page

Students: Wenxin Yang and Mario Karuza
Mentor: Jeff Nathan

Status: 1/16/2011 - Lukas is currently doing a dry run; once he is able to successfully run the tool, he will reach out to the all mailing list to solicit more volunteers for beta testing.
1/22/2011 - reached out to Lukas for an update.
1/31/2011 - The current issue is testing dnsMole. Mario requires DNS data, which is not readily available to him. Christian asked Mario to define what he needs, and Christian will reach out to the all mailing list to see whether someone can help out.
2/3/2011 - Christian reached out to all mailing list to solicit testers and data to test on.
2/26/2011 - Mon-Yen from the Taiwan chapter has volunteered to test this on a larger scale. Mon-Yen and Mario are in touch to make this happen.

Developing a PHP Sandbox
This tool is able to analyze the behavior of PHP files by executing them in a sandboxed environment. During the execution it collects the following behavioral information: connections to other hosts, attempts to send e-mails, attempts to include() files from remote hosts, and file system access. The files to analyze can be submitted using a file upload form, a text area, or by providing a URL. To achieve this, the sandbox makes use of the funcall PHP extension:

Student: Rostislav Skudnov
Mentor: Mohd Hafiz

Status: 1/16/2011 - reached out to mentor/student to assess status.
1/22/2011 - Lukas testing it; he was able to run it; awaiting feedback from him. Pinged on 1/22/2011 for an update.
1/31/2011 - Lukas identified some issues around analysis times (possibly an infinite loop); Christian reached out to Rostislav and Hafiz to see whether they can comment or look into it.
2/3/2011 - According to Mohd Hafiz, few improvements were made and the tool is not ready for a release. We need a new owner who can continue to drive the tool's development toward a release.

Dionaea – Improvements on the current SMB stack
The improvements mostly focused on the current SMB protocol support for better usability and feasibility. They include SMB stack improvements, Nmap NSE support, NTLM authentication support, Metasploit operating system fingerprinting support, and the introduction of the Tabular Data Stream protocol. These additional features have further enhanced Dionaea’s capabilities as a low interaction honeypot.

Student: Tan Kean Siong
Mentor: Markus Koetter

Status: 1/16/2011 - several improvements have been made to Dionaea, and more are continuously being made. Download the tool at:

Improvements in the high interaction client honeypot Capture-HPC
This project included the development of a module that keeps track of all TCP and UDP communications. The module can determine whether traffic is malicious by comparing it against an exclusion list containing patterns for benign connections. This is accomplished through the implementation of a WFP callout driver.
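
The exclusion-list comparison can be sketched in user-space Python as follows. This only illustrates the matching logic: the actual module is a WFP callout driver, and the rule format, field names, and function names below are hypothetical.

```python
import re
from typing import NamedTuple


class Connection(NamedTuple):
    protocol: str      # "tcp" or "udp"
    remote_host: str
    remote_port: int


class ExclusionRule(NamedTuple):
    protocol: str
    host_pattern: str  # regular expression matched against the whole host
    port: int          # 0 means any port


def is_excluded(conn, rules):
    """Return True if any exclusion rule marks the connection as benign."""
    for rule in rules:
        if rule.protocol != conn.protocol:
            continue
        if rule.port not in (0, conn.remote_port):
            continue
        if re.fullmatch(rule.host_pattern, conn.remote_host):
            return True
    return False


def flag_suspicious(connections, rules):
    """Keep only the connections that no exclusion rule accounts for."""
    return [conn for conn in connections if not is_excluded(conn, rules)]
```

Anything the exclusion list does not explain is treated as potentially malicious, so the quality of the benign patterns directly determines the false positive rate.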

Student: Narahari Shankaranarayana
Mentor: Peter Komisarczuk

Status: 1/16/2011 - currently awaiting feedback from the project owners re status.
1/22/2011 - due to vacation time in NZ, we need to wait until all project members are back. Peter will schedule a call with Ian and Lam on how to proceed shortly after 1/25/2011.
1/31/2011 - Christian pinged on where they are with regard to the call.
2/3/2011 - Peter currently does not have the resources to integrate and test the developed modules for a release.