Project 12 - Improving shellcode emulation performance

Primary mentor: Felix Leder (DE)
Student: Florian Schmitt

Project Overview:
libemu is a library for automated shellcode detection and analysis. One of its main features is the extraction of OS API calls. The function calls, together with their parameters, give a quick hint at what the shellcode does without the need to read assembler code.
To determine the function calls, libemu executes the shellcode in a built-in emulator, which has the downside of being rather slow.
Using a virtualizer could accelerate code execution, among other reasons because virtualization software can make use of hardware-accelerated virtualization.
The aim of this project is to improve the performance of libemu by using a virtualizer. Keeping the library's current API intact ensures that the tools currently using libemu can benefit from the improvements.
I intend to use the open source virtualizer QEMU because it is fast and offers the option of hardware support.

Project Links:
Redmine Project

Project Plan:

  • 24. May - 13. June: Evaluating required changes in QEMU. Put QEMU into a library so it can be easily used in libemu.
  • 14. June - 04. July: Remove emulator code from libemu. Start rewriting the API-functions using QEMU.
    These should be the low-level functions for memory access, CPU access/control, etc.
  • 05. July - 15. July: Backup, bug fixing.
    By Midterm, basic code execution and control of the QEMU-CPU should work from libemu.
  • 16. July - 30. July: Implement the missing features: environment, shellcode-detection, hooking, etc.
  • 31. July - 07. August: Backup, bug fixing.
  • 08. August - 22. August: Depending on the project progress: implementation of additional features which would extend the API of libemu.

Updates:

  • 1st week (2011-05-23 - 2011-05-29)
    • added the flag '--library' to QEMU's configure script; when it is set, QEMU gets linked as a dynamic library. the library is located at i386-softmmu/libqemulib.so.
    • added an init_qemu() function which basically does what main() would do. this is needed because main() doesn't get called in libraries.
    • changed libemu's autoconf/automake scripts so it links the qemu-library.
    • for now, libemu only calls init_qemu() when emu_cpu_new() is called.
    • Plans for next week:
      • figure out how to 'callback' from qemulib to libemu (for hooking api-calls).
      • figure out what changes need to be applied to QEMU to run shellcode (paging, protected mode, ...)
      • perhaps: execute some code in QEMU :)
  • 2nd week (2011-05-30 - 2011-06-05)
    • set up protected mode and paging. I used one page directory and made it possible to add pages to it. that way, blocks of the virtual machine's memory can be allocated and used.
    • I decided to use function pointers to call back from QEMU to libemu. that way I avoid circular dependencies.
    • Plans for next week:
      • I guess I will start to rewrite the libemu API functions using QEMU. I think I will start with memory related functions.
  • 3rd week (2011-06-06 - 2011-06-12)
    • used QEMU functions to rewrite most of the emu_memory-API.
    • started implementing the emu_cpu-API.
    • basic single stepping with QEMU-cpu is possible now.
    • Plans for next week:
      • improving memory and cpu implementations.
  • 4th week (2011-06-13 - 2011-06-19)
    • testing of memory-related functions and cpu single step. looks good.
    • took a deeper look into the execution-related QEMU source code.
    • Plans for next week:
      • trying to get the emu_cpu_run function to work with the QEMU-cpu not being in single step mode.
      • think about API-hooking
  • 5th week (2011-06-20 - 2011-06-26)
    • used libemu testsuite to evaluate single stepping
    • shellcode-detection (already works with some samples)
    • Plans for next week:
      • think about API-hooking
      • think about how to unload QEMU
    • Problems:
      • right now I can't completely unload QEMU: if emu_free() is called and emu_new() is called afterwards, QEMU crashes.
  • 6th week (2011-06-27 - 2011-07-03)
    • non-singlestep execution works now
    • api-calls get hooked with singlestep and non-singlestep execution now.
    • fixed the problem regarding QEMU unloading. the solution was not to unload it, but to reset the environment.
    • set up an exception handler that lets errors be raised to libemu while code is executing in QEMU.
    • Plans for next week:
      • shellcode-detection
      • clean-up / refactoring
    • Problem / Challenge:
      • if execution reaches a memory region that is valid but consists only of zeros, it continues because the bytes 00 00 decode as a valid instruction (add [eax], al). if the memory region is large, it can take some time until execution halts. I want to prevent this from happening.
  • 7th week (2011-07-04 - 2011-07-10)
    • shellcode-detection. it works for most samples, but I want it to be more stable and I want to enhance the performance.
    • mid-term evaluation
    • Plans for next week:
      • shellcode-detection
      • clean-up / refactoring
  • 8th week (2011-07-11 - 2011-07-17)
    • shellcode-detection. it works better now. I traded instruction tracking for a brute-force approach.
    • sped up QEMU initialization by skipping initialization steps that aren't relevant.
    • Plans for next week:
      • figure out if it's possible to have multiple virtual machines in QEMU, so that libemu would still be capable of multi-threading
      • shellcode-detection
      • clean-up / refactoring
  • 9th week (2011-07-18 - 2011-07-24)
    • implemented a step counter to avoid infinite loops.
    • implemented a minimal heap for memory allocation (malloc / GlobalAlloc)
    • added some API-hooks and userhooks.
    • added the possibility to add a file handle to the environment. with that, shellcode embedded in documents can be analysed.
    • Plans for next week:
      • clean-up / refactoring
  • 10th week (2011-07-25 - 2011-07-31)
    • configure script has a QEMU option now.
    • implemented some more API hooks, to make more samples work
    • clean-up / refactoring
    • Plans for next week:
      • fix installation procedure of qemulib
      • clean-up / refactoring
  • 11th week (2011-08-01 - 2011-08-07)
    • I was pretty busy last week, so there is no real progress here :(
    • Plans for next week:
      • packing things together for the release
  • 12th week (2011-08-08 - 2011-08-14)
    • installation procedure works now
    • code documentation
    • removed compiler warnings
    • readme / installation guide
    • Plans for next week:
      • testing
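
Two ideas from the updates above, calling back through function pointers (2nd week) and bounding execution with a step counter (9th week), can be illustrated with a small self-contained sketch. This is not the project's code and uses neither the libemu nor the QEMU API: toy_vm, toy_vm_add_hook, toy_vm_run and count_and_stop are hypothetical names, and "executing an instruction" is simulated by simply incrementing eip. The point is only that the VM side knows a callback signature and nothing about the library that implements it, which is what avoids a circular dependency, and that the run loop always terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not libemu or QEMU API. */

/* The "VM side" only knows this signature, not who implements it.
 * A nonzero return value asks the VM to stop running. */
typedef int (*api_hook_fn)(uint32_t eip, void *userdata);

#define MAX_HOOKS 8

struct hook {
    uint32_t addr;      /* address at which the hook fires */
    api_hook_fn fn;     /* callback supplied by the library side */
    void *userdata;     /* opaque pointer handed back to the callback */
};

struct toy_vm {
    uint32_t eip;
    struct hook hooks[MAX_HOOKS];
    size_t nhooks;
};

enum run_result { RUN_HOOK_STOPPED, RUN_STEP_LIMIT };

/* Register a callback for an address; returns -1 if the table is full. */
int toy_vm_add_hook(struct toy_vm *vm, uint32_t addr, api_hook_fn fn, void *ud)
{
    assert(vm != NULL && fn != NULL);
    if (vm->nhooks >= MAX_HOOKS)
        return -1;
    vm->hooks[vm->nhooks++] = (struct hook){ addr, fn, ud };
    return 0;
}

/* Run until a hook asks to stop or max_steps "instructions" have executed.
 * The step counter guarantees termination even if no hook ever fires,
 * e.g. when execution wanders through a large zero-filled region. */
enum run_result toy_vm_run(struct toy_vm *vm, uint64_t max_steps,
                           uint64_t *steps_done)
{
    uint64_t steps = 0;
    while (steps < max_steps) {
        for (size_t i = 0; i < vm->nhooks; i++) {
            const struct hook *h = &vm->hooks[i];
            if (h->addr == vm->eip && h->fn(vm->eip, h->userdata) != 0) {
                *steps_done = steps;
                return RUN_HOOK_STOPPED;
            }
        }
        vm->eip++;  /* stand-in for executing one instruction */
        steps++;
    }
    *steps_done = steps;
    return RUN_STEP_LIMIT;
}

/* "Library side" callback: counts how often it was hit, then asks to stop. */
int count_and_stop(uint32_t eip, void *userdata)
{
    (void)eip;
    (*(int *)userdata)++;
    return 1;
}
```

A caller would zero-initialise a struct toy_vm, set eip, register count_and_stop for an address of interest with toy_vm_add_hook, and call toy_vm_run with a step budget. The result tells it whether a hook ended the run or the budget was exhausted.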