Google Summer of Code 2009 - Project Ideas

Why get involved with the Honeynet Project and GSOC 2009?

Project Ideas

Below are short descriptions of a number of current project ideas that we are keen to develop. We are also always interested in hearing ideas for additional relevant honeynet-related R&D projects (although remember that to qualify for GSOC funding, a project needs to fit into Google's 3-month project timescales!).

Each project will have a dedicated, full-time mentor to provide a guaranteed contact point each day, plus one or more technical advisors to help applicants with the technical direction and delivery of the project (often the original author of the tool or its current maintainer, and usually a recognised expert in their particular field). We believe that significant progress can be demonstrated on each of these projects in a 3-month timeframe.

Improving our low interaction client honeypot (phoneyc)

Attacks against Internet users are increasingly delivered through web browsers via client side exploits. Client honeypots have been developed to access potentially malicious web content and attempt to determine whether the content returned is malicious or not. Honeynet Project members have been working on a low interaction, emulated client honeypot called phoneyc (http://code.google.com/p/phoneyc/) that attempts to detect malicious content in a number of ways. It is designed to be faster and more scalable than traditional high interaction client honeypots. In addition, members have been working on a standalone proof of concept called pyprofjsploit (http://code.mwcollect.org/projects/show/pyprofjsploit), which is now basically complete. It detects malicious shellcode within JavaScript byte code using the LibEmu generic x86 emulator for shellcode detection (http://libemu.carnivore.it).

The goal of this project is to integrate the pyprofjsploit proof of concept into phoneyc and then complete a number of missing features and improvements to phoneyc before formally releasing it publicly. We feel that this project is important because many people will benefit from faster, more reliable ways of automatically detecting malicious websites. The project also provides some interesting technical challenges and has scope for successful applicants to extend this approach with their own ideas.

Skills required:

C programming, Python programming, understanding of JavaScript and the DOM

Mentor: Lance Spitzner
Technical Advisors: Georg Wicherski and Jose Nazario

Improving the effectiveness of low interaction honeypots (Nepenthes/Honeytrap/LibEmu)

Honeynet Project members have developed a number of solutions for emulating vulnerable computer systems and automatically collecting attacks against them. Honeypots such as Nepenthes and HoneyTrap have proven successful at capturing known attacks, but it has generally proved difficult to extend them and to add signatures for newly discovered vulnerabilities. They have also struggled to reliably detect and capture previously unknown, zero-day exploits. Shellcode emulation in LibEmu has helped, but integration with existing honeypots has been demanding.

We are currently working on a new low-interaction honeypot which builds on the lessons learned to date. This will include detection of unknown attacks via LibEmu and better updatability and scalability. The goal of this project is to advance this next generation low interaction honeypot to the working prototype stage. We believe that this project is important because existing low interaction honeypots are used by a wide range of researchers and organisations to study Internet attacks, so increasing attack detection rates will potentially benefit many people with interests in this area.

Skills required:

C programming, Python programming, understanding of Windows x86 shellcode.
Previous experience with Nepenthes, Honeytrap and LibEmu would be very useful.

Mentor: Lance Spitzner
Technical Advisors: The Giraffe Chapter (confirm names)

Improving our high interaction client honeypots (Capture-HPC and CaptureBAT)

Capture-HPC is one of our most actively developed public projects. Capture-HPC provides a method of driving a real high interaction Windows system running within a virtual machine to potentially malicious websites, obtained from sources such as spam or DNS typosquatting. State changes in the VM are monitored and malicious activity is detected by measuring unexpected changes. It is regularly used in surveys of malicious websites and has been extended to support a number of Internet enabled applications and file formats. CaptureBAT is the original behavioural analysis tool that Capture-HPC is based on, using Windows API hooking to monitor state.
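
The idea of detecting malicious activity by measuring unexpected state changes can be sketched in a few lines of Python. This is an illustration only and not Capture-HPC code: the snapshot format (a dictionary mapping a resource name to a content hash), the EXPECTED_PATTERNS exclusion list and all function names are assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list: changes that an ordinary, benign browsing
# session is expected to cause. Real deployments would tune this carefully.
EXPECTED_PATTERNS = [
    r"file:C:\Documents and Settings\*\Temporary Internet Files\*",
    r"registry:HKCU\Software\Microsoft\Internet Explorer\*",
]


def diff_state(before, after):
    """Return the names of resources added, removed or modified."""
    return {
        key
        for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    }


def unexpected_changes(before, after, expected=EXPECTED_PATTERNS):
    """Return the changed resources that no exclusion pattern accounts for."""
    return sorted(
        key
        for key in diff_state(before, after)
        if not any(fnmatchcase(key, pattern) for pattern in expected)
    )


def classify(before, after):
    """Flag a visit as malicious if it caused any unexpected state change."""
    return "malicious" if unexpected_changes(before, after) else "benign"
```

A caller would take one snapshot before the browser visits a URL and one afterwards, then pass both to classify(). The quality of the exclusion list decides the false positive rate, which is why it is kept as plain data rather than code.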

The goal of this project would be to continue the current planned development of Capture-HPC and CaptureBAT, in the areas of improving data logging and operational management, adding network API hooking, improving stateful operations (pause, failover, etc.) and moving result data storage from flat files to a suitable database solution. We also seek input for the future development roadmap of Capture-HPC v3. We believe that continuing to improve Capture-HPC will encourage increased automated analysis of malicious websites, helping to detect new generations of client focused attacks.

Skills required:

C programming, Java programming, familiarity with Windows and Internet Explorer internals

Mentor: Lance Spitzner
Technical Advisors: The New Zealand Chapter (confirm names)

Developing a solution for managing client honeypots (New)

Honeynet Project members have developed a number of leading open source client honeypot solutions for analysing potentially malicious web sites. However, these tools are generally standalone in nature and don't provide a number of features necessary for large scale, long running analysis exercises, such as crawling the top N web sites from Google in a particular category each day and reporting activity trends. The current tool construction also doesn't encourage centralised submission of suspect URLs and web based reporting.

The goal of this project would be to implement a web based management layer for registering existing instances of client honeypots, submitting URLs for analysis, scheduling analysis runs, persisting results data, summarising trends and presenting the results to multiple users. Prototype user stories, designs, etc. are available, as is access to client honeypot data, but we would welcome additional input on the best means of managing client honeypot workflow and presenting this information through a web interface.
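
To make the workflow above more concrete, here is a minimal in-memory Python sketch covering registration, URL submission, scheduling and trend summaries. It is not an existing design: the Manager class, its method names and the round-robin scheduling policy are all assumptions chosen for illustration, and a real implementation would sit behind a web interface and persist its state in a database.

```python
from collections import Counter, deque
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Manager:
    """Hypothetical coordinator for a pool of client honeypot instances."""

    honeypots: list = field(default_factory=list)   # registered instance names
    pending: deque = field(default_factory=deque)   # URLs awaiting analysis
    seen: set = field(default_factory=set)          # every URL ever submitted
    results: dict = field(default_factory=dict)     # url -> verdict

    def register(self, name):
        """Register a honeypot instance once."""
        if name not in self.honeypots:
            self.honeypots.append(name)

    def submit(self, url):
        """Queue a URL for analysis; return False if it was already submitted."""
        if url in self.seen:
            return False
        self.seen.add(url)
        self.pending.append(url)
        return True

    def schedule(self):
        """Drain the queue, assigning URLs to honeypots in round-robin order."""
        if not self.honeypots:
            raise RuntimeError("no honeypot instances registered")
        plan = {name: [] for name in self.honeypots}
        targets = cycle(self.honeypots)
        while self.pending:
            plan[next(targets)].append(self.pending.popleft())
        return plan

    def record(self, url, verdict):
        """Store the verdict a honeypot reported for a URL."""
        self.results[url] = verdict

    def summary(self):
        """Count stored verdicts, e.g. for a trend report."""
        return Counter(self.results.values())
```

Round-robin is only the simplest possible policy. Since low and high interaction client honeypots differ greatly in speed, a real scheduler would probably weight instances by capacity or route suspicious URLs from a fast low interaction pass to a slower high interaction one.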

Skills required:

Web development skills in appropriate technologies

Mentor: Lance Spitzner
Technical Advisors: David Watson, The New Zealand Chapter (confirm names)

Alternative approaches for high interaction honeypot data capture systems (Virtual Machine Introspection)

TBC

Automatic generation of IDS signatures (Nebula)

TBC

Developing a user interface for analysing collected low interaction honeypot data (New)

Honeynet Project members have developed a number of leading open source low interaction honeypot solutions, such as Nepenthes and Honeytrap, that are used to automatically record data about network based malware attacks. We have a number of active international sensor deployments to collect malware globally and are in the process of rolling out a larger low interaction sensor network during 2009. However, there is currently no publicly available web based reporting interface for users of such sensor systems.

The goal of this project would be to implement a web based management reporting tool that takes reasonably simple CSV type input from low interaction malware sensors (such as timestamp, source IP, attack type, attacker IP address, MD5sum, etc.) plus the output from sandbox/antivirus engine analysis of uploaded malware binary samples, persists it in a database and then presents it via a web interface to multiple distributed users. This interesting data set provides many potential analysis, presentation and visualisation options for interested developers. We have a number of prototype reporting interface examples available internally, or a new system could be developed from scratch. We believe that this project is important as it will help researchers to more easily understand the types of attacks routinely occurring on the Internet today.
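
The ingest step described above might look roughly like the following Python sketch, which loads sensor CSV rows into SQLite and runs one simple report query. The column layout in FIELDS, the table name and the function names are assumptions for illustration, and the SAMPLE rows are made-up values using documentation IP ranges, not real sensor data.

```python
import csv
import sqlite3

# Assumed column order, loosely based on the example fields named above.
FIELDS = ("timestamp", "source_ip", "attack_type", "md5sum")

# Made-up example rows (documentation IP ranges, placeholder hashes).
SAMPLE = "\n".join([
    "2009-03-01T10:00:00,192.0.2.10,smb,00000000000000000000000000000000",
    "2009-03-01T10:05:00,198.51.100.7,smb,00000000000000000000000000000000",
    "2009-03-01T10:09:00,192.0.2.10,http,ffffffffffffffffffffffffffffffff",
])


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attacks ("
        "timestamp TEXT, source_ip TEXT, attack_type TEXT, md5sum TEXT)"
    )
    return conn


def load_csv(conn, lines):
    """Insert CSV lines (any iterable of strings); return the row count."""
    reader = csv.DictReader(lines, fieldnames=FIELDS)
    rows = [tuple(record[name] for name in FIELDS) for record in reader]
    conn.executemany("INSERT INTO attacks VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def top_samples(conn, limit=10):
    """Return (md5sum, hits) pairs for the most frequently seen samples."""
    query = (
        "SELECT md5sum, COUNT(*) AS hits FROM attacks "
        "GROUP BY md5sum ORDER BY hits DESC, md5sum LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()
```

For example, load_csv(open_store(), SAMPLE.splitlines()) loads the three sample rows, after which top_samples() can feed a "most seen binaries" table in a web view. Parameterised queries are used throughout because sensor input is attacker influenced and should never be interpolated into SQL.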

Skills required:

Python programming

Mentor: Lance Spitzner
Technical Advisors: David Watson

Improving Honeynet data visualisation (PicViz)

PicViz (http://www.wallinfire.net/picviz) is a parallel coordinates plotter which enables easy scripting from various inputs (tcpdump, syslog, iptables logs, apache logs, etc.) to visualize data and discover interesting results quickly. Its primary goal is to graph data in order to be able to quickly analyze problems and find correlations among variables. With security analysis in mind, the program has been designed to be very flexible and able to graph millions of events in a single diagram.

The goal of this project would be to extend PicViz to improve presentation of honeynet sourced data, particularly with regard to incident timelines.

Skills required:

Python programming

Mentor: Lance Spitzner
Technical Advisors: Sebastien Tricaud, Raphael Marty

How to become a Google Summer of Code student?

If you are interested in getting started, please:

Have a read of our publicly available information

Consider joining the various public mailing lists and creating yourself a self-service account on our public projects server, so you get a feel for typical project activity

Consider setting up a web page about your ideas, with your account names (email, IRC nick, etc.) so that we can recognize you

Contact our Google Summer of Code 2009 contact (Lance Spitzner) and express your interest

We won't be doing formal interviews, but you will be asked to join an internal mailing list and an IRC channel so we can discuss project ideas with you. Once there, we'll introduce you to our people and their areas of expertise, and put you in touch with the correct mentor and advisors for each project proposal. Please feel free to ask any questions you might have about us or the project.

Complete our GSOC 2009 questionnaire. Please provide as much information as possible about your idea, including examples of any previous relevant work and suitable project milestones. If you have previously worked on Honeynet Project R&D or submitted patches to our public projects, please highlight this. The better the information you provide to us, the easier it is for us to decide if your proposal is what we are looking for.

Submit your application to Google by their deadline of XXX.

More information about the Honeynet Project

You can find more information about the Honeynet Project here: http://www.honeynet.org/about, including a short video about honeypots (http://old.honeynet.org/misc/files/HoneynetWeb.mov). Please subscribe to our blog here: http://www.honeynet.org/blog to learn more about our recent R&D activities.