Project Name: Project 9 - YAPDNS
Mentor: Pietro Delsante (IT)
Backup mentor: Fedele Mantuano (IT)
Skills required: Python, Django, HTML/JavaScript, PostgreSQL/MySQL
Project type: New tool
Project goal: Collect Passive DNS data from various sources; display, correlate and analyze them.
There are a couple of tools out there to collect Passive DNS data (e.g. passivedns by gamelinux and pdnsd), but they only work by sniffing authoritative DNS answers inside network traffic and by storing them. There is a huge amount of other sources that could be used to collect Passive DNS data: for example, almost every organization has a web proxy or gateway, and its logs almost always contain a domain name, an IP address and a timestamp. The same data set can be extracted from other textual logs from DNS servers (Bind, Microsoft DNS, etc), web servers, IDS/IPS, and even sandboxes (Cuckoo) and honeypots (Thug) or other Passive DNS databases (VirusTotal, DNSDB, etc). YAPDSN should provide an interface (e.g. a Syslog-NG local destination or a Logstash module) to collect basic associations between an IP address and a domain name, along with the first and last time the association was seen. Other data can be added for specific log sources (e.g. DNS logs also contain TTL, record type, etc), or gathered from external repositories (e.g. association with malware in VirusTotal’s database, etc).

YAPDNS should also provide an interface with a search engine, a set of dashboards and some correlation rules (e.g. track by ASN, geolocation, fast-flux behaviour, etc). The tool should also provide some REST-like APIs to facilitate integration with other tools.
YAPDNS should also use HPFriends to facilitate data sharing between various trusted entities. The backend database may either be a relational database (e.g. PostgreSQL) or a non-relational one (e.g. MongoDB or ElasticSearch).

Communication with other projects and software may use the Common Output Format proposed by this draft on IETF.
An alpha version of YAPDNS has been released that includes everything that is needed to read data from DNS logs, store it in Elasticsearch and perform basic searches on the stored results. YAPDNS takes care of some caching on both the client and the server, to avoid storing duplicate values and overloading the server with duplicate input data. You can find more about YAPDNS in the official documentation page.