GSoC Project #6 - Develop Hybrid Honeypot Architecture

Honeybrid is a network application built to combine the functionality and advantages of low- and high-interaction honeypots by intelligently switching the recipients of known and unknown attacks at the network flow level. The main goal of Honeybrid is to increase the scalability and flexibility of honeypots.

Primary Mentor: Georg Wicherski
Student: Robin Berthier

The architecture of Honeybrid will consist of two engines: a Decision Engine and a Redirection Engine. The Decision Engine will filter network traffic, which means it will select network sessions worthy of analysis from the overall traffic received. The Redirection Engine will handle selected network sessions by transparently switching the destination of selected sessions from low-interaction honeypots to the farm of high-interaction honeypots.
The decision to switch the destination is a critical part of the architecture. It can be based on various criteria defined by honeypot administrators. To increase the flexibility of Honeybrid, these criteria will be implemented as modules that can be enabled or disabled in a configuration file. This configuration file allows security researchers to define and apply custom filtering and redirection policies.
A prototype of Honeybrid has already been implemented. The goal for this summer is to turn this prototype into a reliable tool for the community. The development status of the different components of the architecture is as follows:

  • Decision Engine: 50% complete. It lacks a flexible configuration parser (bison/flex) and has some memory leaks.
  • Redirection Engine: 80% complete. It does not yet correctly handle TCP options.
  • Log Engine: 60% complete. It lacks the correlation functionality needed to merge the output of redirection modules with the list of logged network sessions. Such correlation would help identify network sessions carrying interesting attack payloads.

Moreover, the following decision modules have been implemented so far:

  • a HASH() module to redirect only attacks with an unknown payload. This module works by calculating and storing the hash value of each new payload received, using the SHA-1 algorithm. SHA-1 was selected because it is relatively fast and highly resistant to collisions. A redirection is triggered as soon as a new hash value is detected;
  • a SOURCE() module to redirect only the first attack of each source IP;
  • a RANDOM() module to redirect a random sample of attacks;
  • a PACKET() module to redirect network sessions after a given number of packets.
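
The decision logic of these four modules can be sketched in a few lines. This is only an illustration in Python, not Honeybrid's actual code: the class names, the `decide()` method and the `session` dictionary keys (`payload`, `src_ip`, `packet_count`) are all hypothetical, chosen to mirror the descriptions above.

```python
import hashlib
import random


class HashModule:
    """HASH(): redirect only payloads whose SHA-1 digest is new."""

    def __init__(self):
        self.seen = set()

    def decide(self, session):
        digest = hashlib.sha1(session["payload"]).hexdigest()
        if digest in self.seen:
            return False          # known payload: leave it where it is
        self.seen.add(digest)     # remember the new payload
        return True               # unknown payload: redirect


class SourceModule:
    """SOURCE(): redirect only the first attack of each source IP."""

    def __init__(self):
        self.seen = set()

    def decide(self, session):
        src = session["src_ip"]
        if src in self.seen:
            return False
        self.seen.add(src)
        return True


class RandomModule:
    """RANDOM(): redirect a random sample of sessions."""

    def __init__(self, probability, rng=None):
        self.probability = probability      # hypothetical parameter
        self.rng = rng or random.Random()

    def decide(self, session):
        return self.rng.random() < self.probability


class PacketModule:
    """PACKET(): redirect once a session reaches a packet count."""

    def __init__(self, threshold):
        self.threshold = threshold          # hypothetical parameter

    def decide(self, session):
        return session["packet_count"] >= self.threshold
```

Each module answers a single yes/no question per session, which is what would let a configuration file enable, disable or combine them.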

I plan on implementing a few other modules to add more functionality to the architecture:

  • a SNORT() module to be able to use the output of Snort to decide on which attacks to redirect;
  • a CONTROL() module to rate limit outgoing connections from compromised honeypots.
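
As an illustration of what rate limiting outgoing connections could look like, here is a minimal sliding-window sketch in Python. It is an assumption about one possible design, not the planned implementation: the class name, the `allow()` method and the `max_connections` and `window_seconds` parameters are hypothetical.

```python
from collections import defaultdict, deque


class ControlModule:
    """Allow at most max_connections outgoing connections per honeypot
    within any window of window_seconds (hypothetical parameters)."""

    def __init__(self, max_connections, window_seconds):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        # Timestamps of recently allowed connections, per honeypot IP.
        self.history = defaultdict(deque)

    def allow(self, honeypot_ip, now):
        recent = self.history[honeypot_ip]
        # Forget connections that have left the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_connections:
            return False          # over the limit: drop the connection
        recent.append(now)
        return True
```

Passing the timestamp in explicitly (for example from `time.monotonic()`) keeps the sketch easy to test; each honeypot is limited independently of the others.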

Deliverables: the Honeybrid framework as an open source application, working and tested in a Unix environment.

Original Timeline:

  • June 21: development of the Decision Engine completed (4 weeks),
  • June 28: development of the Redirection Engine completed (1 week),
  • July 12: development of the Log Engine completed (2 weeks),
  • July 19: CONTROL() module implemented (1 week),
  • August 23: SNORT() module implemented (5 weeks).

Updated Timeline (2009-07-05):

  • June 27: source code and file structure cleaned up, threads replaced by events and configuration parser implemented
  • July 31: module handler completed, including a library of functions to ease the development of new modules
  • August 23: SNORT() module implemented

A detailed Gantt chart of the tasks involved in reaching these three milestones is available on the SourceForge project page of Honeybrid.

Honeybrid testing

Second milestone reached! Honeybrid now has all its functionality working and it's time for testing. To check that everything works efficiently, I deployed a Windows honeypot to receive traffic from five unused /24 subnets for half an hour. Here are the details of this experiment.

Configuration

Here is an overall diagram of the testing architecture:

(Internet) <=====> [NATing Gateway with Honeybrid] <-------> [Windows Honeypot]

The NATing gateway was configured with the following iptables rules:

Bison/Flex parser

This week I completed an important step: integrating a parser into Honeybrid. There are now two new files in the source code:

How to transparently redirect a TCP connection

TCP was built to allow two hosts to exchange a stream of packets reliably. Honeybrid must add a third host to this exchange when it decides to investigate a connection further. The keys for this process to work are: 1) a replay process that brings the high-interaction honeypot to the same state as the low-interaction honeypot; and 2) a forwarding process that translates not only IP addresses but also TCP sequence and acknowledgement numbers. Here is how things work in detail:
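
The forwarding step can be illustrated with a small sketch. It is not Honeybrid's code: the `Translator` class, its method names and the packet dictionaries are hypothetical, and it assumes the replay reuses the attacker's own sequence numbers, so only the honeypot side of the numbering differs. The two honeypots pick different initial sequence numbers (ISNs), and the translation shifts by their difference modulo 2^32.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class Translator:
    """Rewrite packets so the attacker keeps seeing the low-interaction
    honeypot (LIH) while actually talking to the high-interaction one (HIH)."""

    def __init__(self, lih_ip, hih_ip, lih_isn, hih_isn):
        self.lih_ip = lih_ip
        self.hih_ip = hih_ip
        # Offset between the two honeypots' sequence number spaces.
        self.delta = (lih_isn - hih_isn) % MOD

    def to_hih(self, pkt):
        """Attacker -> honeypot: retarget and shift the ack number."""
        out = dict(pkt)
        out["dst_ip"] = self.hih_ip
        out["ack"] = (pkt["ack"] - self.delta) % MOD
        return out

    def to_attacker(self, pkt):
        """Honeypot -> attacker: mask the source and shift the seq number."""
        out = dict(pkt)
        out["src_ip"] = self.lih_ip
        out["seq"] = (pkt["seq"] + self.delta) % MOD
        return out
```

In a real implementation the IP and TCP checksums would also have to be recomputed after each rewrite; that part is omitted here.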

Honeybrid: combining low and high interaction honeypots

The goal of this post is to introduce myself and my project: my name is Robin Berthier and I just got my PhD from the University of Maryland. I'll be working this summer on improving Honeybrid, a hybrid honeypot architecture. I've been working with honeypot technologies for the past 4 years, and Honeybrid represents a central part of my dissertation. 
