Project 15 - Further extend Capture-HPC with possibility of detecting malicious behavior on Linux Machines

Student: Maciej Szawlowski
Primary mentor: Adam Kozakiewicz
Backup mentor: Paweł Jacewicz

Google Melange:

Project Wiki:

Project Overview:
Capture-HPC is a high-interaction client honeypot developed to detect client-side attacks. It consists of two parts: a server and a client. The server manages multiple client instances running on virtualized Windows systems. Recently a basic Capture-HPC client for Linux machines was developed. The main goal of this project is to extend the functionality of that client and to integrate it better with the Linux operating system architecture. As Linux gains popularity, a new wave of threats targeting Linux users is likely to emerge soon. Extending Capture-HPC with the functionality proposed below will contribute to the knowledge of attacks against Linux client software, especially web browsers.

Project Plan:
The deadlines are quite optimistic, but I prefer to have some time left in case something goes wrong.

  • April 23rd - May 20th : Community Bonding Period, problem analysis + solution proposals.
    The most probable solution is to write a Linux kernel module. There are three main options: inotify, syscall interception or filesystem driver manipulation. These options need to be explored before one of them is chosen.
  • May 21st : GSoC 2012 coding officially starts
  • 1st June : Solution Proof of Concept
    By proof of concept I mean a working module that reacts to certain file system events, e.g. file creation. Catching other events will be similar, so a working PoC is the first step to success.
  • July 9th - July 13th : Mid Term Assessments
  • 25th June : Kernel module development end + integration with Capture HPC client
    When this milestone is reached, the kernel module will be fully functional and working together with the Capture HPC client module.
  • 20th July : Documentation + code cleanup (if needed)
  • 25th July : End of project
  • August 13th : Suggested "pencils down" date, coding close to done
  • August 20th : Firm "pencils down" date, coding must be done
  • August 24th - August 27th : Final Assessments
  • August 31st : Public code uploaded and available to Google

Project Deliverables:
A Linux kernel module will be developed as a result of the project. Its main purpose will be to collect created, modified and deleted files. The honeypot module also needs some modifications to introduce the new functionality:

  • Communication with kernel module
  • Internet traffic dump collection
  • Sending collected information as a zipped package to the Capture server
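The last deliverable, packing the collected information into a zip archive, could look roughly like the sketch below. It is only an illustration, not the client's actual code: the function name build_package, the archive layout (events.txt plus a files/ prefix) and the idea of passing the event log as a string are all assumptions made for the example.

```python
import io
import zipfile
from pathlib import Path


def build_package(dump_dir: Path, event_log: str) -> bytes:
    """Zip the event log and every regular file under dump_dir, in memory.

    Hypothetical helper: the name and the archive layout are assumptions.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store the textual event log first so it is easy to find.
        archive.writestr("events.txt", event_log)
        # Add collected files, keeping their paths relative to dump_dir.
        for path in sorted(dump_dir.rglob("*")):
            if path.is_file():
                arcname = (Path("files") / path.relative_to(dump_dir)).as_posix()
                archive.write(path, arcname=arcname)
    return buffer.getvalue()
```

The returned bytes can then be sent to the Capture server over whatever channel the client already uses. Building the archive in memory avoids creating new files on the monitored machine, which would otherwise show up as events themselves.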

Project Source Code Repository:

Student Weekly Blog: https://www.honeynet.org/blog/350

Project Useful Links:

Project Updates:
After some brainstorming with my mentor, I decided to base the solution on syscall interception. I considered three ways of doing it:

  • Syscall table hooking
  • LD_PRELOAD mechanism
  • KProbes instrumentation API

Hooking the system call table is difficult to do portably. The procedure depends on the kernel version, memory layout, etc. Moreover, newer kernels add protections, e.g. the table lives in a read-only memory page and the pointer to it is not exported.

The LD_PRELOAD mechanism is quite interesting too. It makes the dynamic linker load custom libraries before the standard ones, so I could write my own implementations of the I/O functions and have them used instead of the originals.

I chose the KProbes API because it is supported in newer 2.6 kernels. It is also portable across Linux distributions, which makes the module more useful.

  • 21.05.2012 - 27.05.2012
    • Finally git + redmine working
    • LKM hello world
    • KProbe hello world
    • sys_open intercepted with printk
    • multiple probes registration
    • kernel API + code reading
    • redmine issues set up
  • 28.05.2012 - 03.06.2012
    Done this week:
    • nothing due to tests and project deadlines at studies

    Planned for next week:

    • linked list for opened/created, modified, deleted files
    • hardlinking files if not in list
    • file in /proc for path - timestamp mapping

    Blocking issues:

    • tests and project deadlines at studies
  • 04.06.2012 - 10.06.2012
    Done this week:
    • whitelist of files, so writes to /etc/var/messages are not monitored
    • structure to keep info about modified files

    Planned for next week:

    • file in proc to print linked list results
    • file in proc for configuration purposes

    Blocking issues:

    • tests and project deadlines at studies
  • 11.06.2012 - 17.06.2012
    Done this week:
    • proc file working, giving current file list state
    • configured a VM for development purposes (kernel faults were killing me, should have done this earlier)

    Planned for next week:

    • sys_open bug fix -
    • introduce more hooks
  • 18.06.2012 - 24.06.2012
    Done this week:
    • sys_open bug fixed - workaround using kretprobes
    • fixed problems with excessive printk
    • unlink successfully hooked

    Planned for next week:

    • fully use kretprobe (with prehandler and data passing)
    • flags for files (OPENED|WRITTEN|DELETED)
    • make output well formatted
  • 25.06.2012 - 1.07.2012
    Done this week:
    • hooks catching proper events
    • flags for files (OPENED|WRITTEN|DELETED)

    Planned for next week:

    • hunt down bugs: proc buffer overflow + absolute path creation in unlink

    Blocking issues:

    • bugs, bugs, bugs - debugging kernel API
  • 2.07.2012 - 8.07.2012
    Done this week:
    • proc file output reworked
    • reliable unlink absolute paths (changed interception from do_unlinkat to vfs_unlink - one level deeper)
    • first tests show that the module is finally stable with the Epiphany browser

    Planned for next week:

    • module argument with hardlinks path
    • handling proc write to reset list when nothing malicious shows up
    • timestamping list entries
  • 9.07.2012 - 22.07.2012
    Done these two weeks:
    • decided to remove strace from the original client, so I had to implement its functionality
    • do_execve intercepted to collect info about new processes
    • due to constant errors, changed the test platform from Debian (2.6.32, ext3) to Ubuntu 10.10 (2.6.37, ext4)
    • hardlinks implemented - in fact I'm not hardlinking. Instead I move the file to a specified folder before deletion and trick vfs_unlink into thinking that it deleted the file.
    • added two new flags EXECUTED and READ
    • hooked sys_read
    • insmod option to turn on/off file collection.
    • when file collection is on, deletion happens only inside the specified collection directory (so the client can clean that directory after every benign website). Anywhere else, the file is moved instead of deleted.
    • First in-action test: Firefox with a vulnerable Java plugin + Metasploit + CVE-2010-0840 (Java Statement.invoke() Trusted Method Chain) results in malicious actions recorded + malicious files moved to the dump directory = SUCCESS!!!

    Planned for next week:

    • more configuration via /proc - turning on/off file collection
    • a dedicated clear string, so that only writing that exact string to /proc clears the gathered kernel data
    • start to rewrite client module
  • 23.07.2012 - 5.08.2012
    Done these two weeks:
    • configuration via proc: the commands COLLECT_ON, COLLECT_OFF, CLEAN_FILES and CLEAR turn file collection on/off, clear the hardlink directory and clean the event list
    • client module sends zip file
    • client module parses proc event info
    • client module communicates with the server
    • client module functional

    Planned for next week:

    • tcpdump to collect traffic information
    • documentation
    • server wishlist - list of features to improve usability
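
To illustrate how the client side of the log above fits together, here is a minimal user-space sketch of reading the per-file flags and sending a control command to the module. The flag names (OPENED, WRITTEN, DELETED, EXECUTED, READ) and the command strings come from the entries above. Everything else is an assumption made for the example: the tab-separated "path<TAB>FLAG|FLAG" line format, the helper names, and the way the control file is addressed. The real proc output format may differ.

```python
from dataclasses import dataclass
from enum import Flag, auto

# Command strings as listed in the log; their exact semantics live in the module.
CONTROL_COMMANDS = {"COLLECT_ON", "COLLECT_OFF", "CLEAN_FILES", "CLEAR"}


class FileFlag(Flag):
    """Per-file event flags, mirroring the names used by the kernel module."""
    OPENED = auto()
    WRITTEN = auto()
    DELETED = auto()
    EXECUTED = auto()
    READ = auto()


@dataclass
class FileEvent:
    path: str
    flags: FileFlag


def parse_event_line(line: str) -> FileEvent:
    """Parse one line in the ASSUMED format '<path>\\t<FLAG|FLAG|...>'."""
    path, sep, names = line.rstrip("\n").rpartition("\t")
    if not sep:
        raise ValueError(f"malformed event line: {line!r}")
    flags = FileFlag(0)
    for name in names.split("|"):
        flags |= FileFlag[name]  # KeyError on an unknown flag name
    return FileEvent(path, flags)


def send_command(control_path: str, command: str) -> None:
    """Write one known command to the (hypothetical) proc control file."""
    if command not in CONTROL_COMMANDS:
        raise ValueError(f"unknown command: {command}")
    with open(control_path, "w") as control:
        # Written without a trailing newline; whether the module expects
        # one is not stated in the log.
        control.write(command)
```

A client loop would call parse_event_line on each line read from the proc file, decide whether the visited site looks malicious, and then use send_command with CLEAR (and CLEAN_FILES) before moving on to the next URL. Rejecting unknown strings on the client side matches the idea of a dedicated clear string: stray writes should never wipe the gathered data.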

Well, GSoC is coming to an end, and so is the project. The code can be obtained from the repository. The last days will be used to create Ant files for an easy build, comment the code and improve the documentation. A blog entry will be added in a few days.