Wednesday 28 May 2014

Cyber Threat Intelligence...my attempt at it...

*** update to this post: http://infosecscapes.blogspot.com/2014/06/threatelligence.html

I recently started playing with various open source threat intelligence frameworks, one of which is the Collective Intelligence Framework (CIF):

  • https://code.google.com/p/collective-intelligence-framework/

I am not going to go too much into the technical bits, but the CIF tool collects data from various open source intelligence feeds, including:
  • IP addresses
  • Domains
  • MD5 hashes of suspicious malware
  • and more
The solution works really well. It is limited on the interface side of things, but it is really good at collecting and normalising data, sharing it, and feeding it to appliances (firewalls, proxies, Snort). I like it and will continue using it. That said, the data it produces is not really that rich in information.

But I decided to play around and make my own threat intelligence solution of sorts. Since I started playing with Python recently, I thought it was a good place to start. I wrote a threat feed collector that pulls data from the same or similar sources as the CIF tool does, and I made it easy and generic enough to add new sources as I find them.
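For anyone curious, the collector boils down to something like the sketch below. The feed URLs and field names here are hypothetical placeholders of my own, not the actual sources:

```python
# Minimal sketch of the generic feed collector. The FEEDS entries are
# hypothetical placeholders -- swap in whichever open source feeds you track.
import requests
from datetime import datetime

FEEDS = [
    # (feed name, URL, indicator type)
    ("example-ip-feed", "http://feeds.example.com/bad-ips.txt", "ipv4"),
    ("example-domain-feed", "http://feeds.example.com/bad-domains.txt", "domain"),
]

def collect(feed_name, url, indicator_type):
    """Fetch a plain-text feed and yield one record per indicator line."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    for line in response.text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        yield {
            "indicator": line,
            "type": indicator_type,
            "source": feed_name,
            "collected": datetime.utcnow().isoformat(),
        }

if __name__ == "__main__":
    for name, url, itype in FEEDS:
        for record in collect(name, url, itype):
            print(record)
```

Adding a new source is then just another entry in FEEDS.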

It was a success: I am collecting this data really well. But I was at a loss over how to store it, view it, and make it useful. I was up all night thinking about a database structure and how not to make it horrible...

Then it clicked: intelligence data should not be forced into a rigid structure, and so I finally had a reason to play with ElasticSearch:

  • http://www.elasticsearch.org/
In short (because I am only scratching the surface), ElasticSearch is a fast, schema-less "database". I found it to be excellent at storing large volumes of data quickly and efficiently. It is well worth researching. I started with a simple article and took it from there:

  • http://joelabrahamsson.com/elasticsearch-101/
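To give a feel for how little code the storage side needs, here is a rough sketch of pushing a collected record in with the official elasticsearch Python client (the index and type names are placeholders of my own):

```python
# Sketch: indexing one collected record into ElasticSearch with the
# official elasticsearch-py client. Index/type names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch()  # defaults to localhost:9200

record = {
    "indicator": "198.51.100.7",  # documentation-range example IP
    "type": "ipv4",
    "source": "example-ip-feed",
    "collected": "2014-05-28T12:00:00",
}

# Schema-less in action: no mapping defined up front, ElasticSearch
# infers field types from the documents it receives.
es.index(index="threats", doc_type="indicator", body=record)
```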

So now I am able to collect a lot of data, store it easily, and access it. I took it further by enriching the collected data with country and geo location lookups.
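The enrichment step is roughly the following; I'm assuming MaxMind's free GeoLite2 City database and the geoip2 library here, which is one common way to do the lookup:

```python
# Sketch: adding country and geo location to IP-based records.
# Assumes the free GeoLite2-City.mmdb database and the geoip2 library
# (pip install geoip2).
import geoip2.database
import geoip2.errors

reader = geoip2.database.Reader("GeoLite2-City.mmdb")

def enrich(record):
    """Attach country and lat/lon fields to an IP record, in place."""
    if record.get("type") != "ipv4":
        return record
    try:
        geo = reader.city(record["indicator"])
    except geoip2.errors.AddressNotFoundError:
        return record  # not every address is in the database
    record["country"] = geo.country.iso_code
    record["geo"] = {
        "lat": geo.location.latitude,
        "lon": geo.location.longitude,
    }
    return record
```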

My next problem was a big one that I thought I would spend the most time on: the interface to view the data. I was dreading the mammoth task of creating an interface that would make the data searchable, easy to view, informative, and easy on the eyes.

...another problem solved by the ElasticSearch team, namely Kibana, which, it seems, recently became part of the ElasticSearch project.
  • http://www.elasticsearch.org/overview/kibana/

Kibana is essentially a dashboard interface for querying ElasticSearch data and presenting it beautifully. Within no time I was building really pretty dashboards with it.

Here are some examples I built. By the way, I did all of this over two days, from start to where it is now! You can search the entire database for specific key data...and it's fast!
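To give a rough idea of what Kibana is doing under the hood, the equivalent full-text query from Python looks something like this (field names match the sketches above):

```python
# Sketch: the kind of full-text query Kibana fires at ElasticSearch.
from elasticsearch import Elasticsearch

es = Elasticsearch()

results = es.search(index="threats", body={
    "query": {"query_string": {"query": "198.51.100.7 OR type:domain"}},
    "size": 20,
})

for hit in results["hits"]["hits"]:
    print(hit["_source"]["indicator"], hit["_source"]["source"])
```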

The next steps are to automate the data collection, add a limit on the number of days to keep data (you want to stay relevant and not sit on old data, especially when it comes to cyber threats), and even add some sort of alerting.
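For the retention part, one approach (a sketch, assuming each document carries the collected timestamp from the collector above) is a delete-by-query on anything older than, say, 30 days:

```python
# Sketch: drop records older than 30 days so the index stays current.
# Assumes documents carry the "collected" timestamp from the collector sketch.
from elasticsearch import Elasticsearch

es = Elasticsearch()

es.delete_by_query(index="threats", body={
    "query": {
        "range": {"collected": {"lt": "now-30d"}}
    }
})
```

Run on a schedule (cron, say) alongside the collectors, and the index never grows stale.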

I started the project on a Saturday and got to this point by Sunday, and even got to spend time with the family.

Python + Elasticsearch + Kibana = Threatelligence for the win!

I can now focus on collecting data, normalising, correlating, and enriching it further...all in an automated fashion. I will push a release to GitHub soon.
