One of the most popular specialized fields within the security domain is threat intelligence. In recent years, organizations have been focusing more and more on proactive, preventative security. Within that space, threat intelligence analysis is one of the most successful tools available.
Though threat intelligence analysis is often offered through paid subscriptions to commercial services, it's also possible to get open source threat intelligence via some free data feeds. These feeds collect information about observed malicious infrastructure, such as IP addresses and domains, as well as about malware via hashes and other indicators of compromise. They enable preventative and often automated blocks, assist in operations such as threat hunting, provide context to ongoing attacks and can even lead to successful attacker attribution.
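As an illustration of how such a feed might drive automated matching, the following is a minimal sketch in Python. The feed layout (a dictionary of sets keyed by indicator type) and the function names `classify` and `match_indicators` are assumptions made for this example, not the format of any particular feed.

```python
import hashlib
import ipaddress


def classify(value):
    """Guess the indicator type of a string: 'ip', 'sha256' or 'domain'."""
    value = value.strip().lower()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    # A SHA-256 digest is 64 hexadecimal characters.
    if len(value) == 64 and all(c in "0123456789abcdef" for c in value):
        return "sha256"
    # Anything else is treated as a domain in this simplified sketch.
    return "domain"


def match_indicators(observed, feed):
    """Return (type, value) pairs for observed values that appear in the feed."""
    hits = []
    for raw in observed:
        value = raw.strip().lower()
        kind = classify(value)
        if value in feed.get(kind, set()):
            hits.append((kind, value))
    return hits


# Placeholder feed content (documentation-range IP, reserved example domain).
feed = {
    "ip": {"203.0.113.7"},
    "domain": {"bad.example"},
    "sha256": {hashlib.sha256(b"placeholder sample").hexdigest()},
}
hits = match_indicators(["203.0.113.7", "Good.Example", "BAD.example"], feed)
```

In practice, the observed values would come from firewall, proxy or endpoint logs, and the hits would feed a block list or a threat hunting queue.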
Despite the costs to operate a threat intelligence team as part of the broader security posture of an organization, this is a very valuable service.
Open source threat intelligence
The next step up in threat intelligence analysis is gathering intelligence from public sources on the internet that could indicate something suspicious -- without having access to specific indicators of recent or ongoing attacks.
One example is chatter about a leaked router configuration on a dark web hacking forum or a dumped database on the Pastebin site. Another example could be an employee who, before resigning, threatens on their Twitter page to hack their employer's system.
Any potentially targeted company should be aware of these threats before they materialize in an actual attack. It is much better to prevent a breach than to detect it and subsequently clean up the damage with consultants and lawyers.
The big challenge is to collect the relevant data from a wide range of sources. A potential attacker will not post a clear message on their Twitter account saying that they will attack webserver X tonight at 7 p.m.; correlation is needed.
An attacker might mention webserver X tomorrow, might have mentioned the targeted organization last month on a different forum and might have posted a router configuration on Pastebin a week before that -- indicating that they were active in the offensive security space. The amount of data that must be extracted and monitored -- and its retention window -- can be significant.
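This kind of correlation across platforms and time can be sketched as follows. The example assumes, purely for illustration, that collected mentions are stored as dictionaries with `author`, `source` and `seen` fields and that the same handle is used on every platform; real actors often use different handles, so a production system would need stronger linking than this.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate(mentions, window_days=60, min_sources=2):
    """Flag handles seen on at least `min_sources` platforms within the window.

    Returns a dict mapping each flagged handle to the sorted platforms found
    in the first qualifying time window.
    """
    by_author = defaultdict(list)
    for mention in mentions:
        by_author[mention["author"]].append(mention)

    window = timedelta(days=window_days)
    flagged = {}
    for author, items in by_author.items():
        items.sort(key=lambda m: m["seen"])
        start = 0
        for end in range(len(items)):
            # Shrink the window until it spans at most `window_days`.
            while items[end]["seen"] - items[start]["seen"] > window:
                start += 1
            sources = {m["source"] for m in items[start:end + 1]}
            if len(sources) >= min_sources:
                flagged[author] = sorted(sources)
                break
    return flagged


# Hypothetical collected mentions.
mentions = [
    {"author": "h4x", "source": "pastebin", "seen": datetime(2024, 1, 1)},
    {"author": "h4x", "source": "forum", "seen": datetime(2024, 1, 20)},
    {"author": "h4x", "source": "twitter", "seen": datetime(2024, 2, 10)},
    {"author": "other", "source": "twitter", "seen": datetime(2024, 2, 10)},
]
flagged = correlate(mentions, window_days=60)
```

The `window_days` parameter corresponds to the retention window: the longer it is, the more data must be stored, but the more likely it is that widely spaced posts are linked.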
Another important requirement is to use as much automation as possible. It is impractical to manually browse the web looking for this content, let alone to manually correlate findings from different platforms across large time windows.
Options and automation
A level of automation is essential to successful open source threat intelligence collection and analysis. There are many specialized open source threat intelligence providers that collect data from many different sources, both at the request of customer-specific queries and with preconfigured broad terms of the vendor's choice.
Recorded Future is currently the best known paid service in this space, but there are alternatives such as Digital Shadows SearchLight. These commercial offerings can be expensive, but they can add visibility by providing information that is only available in the underground forums in which they have a foothold. Because most of these platforms are hosted within the provider's cloud, they require little to no infrastructure setup and maintenance.
Another option is to use the many customizable open source tools available on the internet or to develop custom scripts from the ground up. A good example of what is freely available is Tweepy, a Python library that interacts with the Twitter API. Of course, access to an API needs to be granted first. For Twitter, this is a free service, but other platforms such as Pastebin require a one-off fee or an ongoing paid subscription.
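A minimal sketch of what a Tweepy-based collector might look like is shown below. It assumes that API access has already been granted and a bearer token issued, and that a recent Tweepy release providing `tweepy.Client` is installed; the helper names `build_query` and `search_recent` and the search terms are hypothetical.

```python
def build_query(terms, exclude_retweets=True):
    """Build a search query that matches any of the given terms."""
    # Multiword terms are quoted so that they match as exact phrases.
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    query = "(" + " OR ".join(quoted) + ")"
    if exclude_retweets:
        query += " -is:retweet"
    return query


def search_recent(bearer_token, terms, limit=10):
    """Return the text of recent posts matching any of the terms."""
    # Imported here so the query builder works without Tweepy installed.
    import tweepy

    client = tweepy.Client(bearer_token=bearer_token)
    # The recent search endpoint accepts max_results between 10 and 100.
    response = client.search_recent_tweets(
        query=build_query(terms), max_results=limit
    )
    # response.data is None when nothing matches.
    return [tweet.text for tweet in (response.data or [])]


query = build_query(["Example Corp", "examplecorp"])
```

A scheduled job could call `search_recent` with the organization's keywords and store the results for later correlation.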
An alternative to using API interaction is the use of a so-called web scraper that can download information from a site such as Pastebin in an automated, scripted fashion. Scraping can raise legal issues, however, so it is wise to check a site's terms of use and obtain permission where needed before deploying one.
Using a cloud platform
Due to the cost, many organizations will choose to implement some form of scripted, open source threat intelligence collection that can be built and maintained in-house.
A significant volume of data must be collected before any meaningful correlations can be found, and that data will need to be stored somewhere. The great thing about running a simple collection of Python scripts, however, is that there is very little system overhead. Combined with the need for 24/7 operation and for high availability, this makes it an ideal candidate for a public cloud system.
Another benefit is that the use of a cloud platform such as Microsoft Azure or AWS hides the source and intent of the queries from administrators of forums, social media and other websites. Sharing the keywords in a search query could be a breach by itself if they are too specific, but as long as the search keywords are relatively broad, the area of interest will be hard to link to a specific business that is sending search queries.
A multistage, hybrid, open source threat intelligence environment could add to this by, for instance, downloading information about 10 randomly selected businesses or products -- or even an entire sector -- to a staging system located in a public cloud. A second stage could then extract a feed that contains only the actual organization's keywords and brand names and pull it into the organization's own environment for further local correlation.
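The two stages described above could be sketched roughly as follows. The function names, the record layout (dictionaries with a `text` field) and the decoy names are all assumptions made for illustration.

```python
import random


def build_cloud_watchlist(own_terms, decoy_pool, decoys=10, seed=None):
    """Mix the organization's own terms with randomly chosen decoy terms."""
    rng = random.Random(seed)
    # Exclude the organization's own terms from the decoy candidates.
    pool = sorted(set(decoy_pool) - set(own_terms))
    chosen = rng.sample(pool, k=min(decoys, len(pool)))
    watchlist = list(own_terms) + chosen
    # Shuffle so the query order does not reveal which terms are real.
    rng.shuffle(watchlist)
    return watchlist


def extract_local_feed(staged_records, own_terms):
    """Keep only staged records that mention the organization's own terms."""
    wanted = [term.lower() for term in own_terms]
    return [
        record for record in staged_records
        if any(term in record["text"].lower() for term in wanted)
    ]


own_terms = ["Example Corp"]
watchlist = build_cloud_watchlist(
    own_terms, {"Alpha Ltd", "Beta Inc", "Gamma LLC"}, decoys=2, seed=1
)
staged = [
    {"text": "Dump mentions example corp VPN credentials"},
    {"text": "Beta Inc customer list for sale"},
]
local_feed = extract_local_feed(staged, own_terms)
```

Here `build_cloud_watchlist` would run on the cloud staging system, while `extract_local_feed` would run inside the organization, so the narrowing step that reveals the real area of interest never happens on the public side.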
Open source threat intelligence collection is an interesting field. On one side, it collects technical information, and on the other side, it collects information on people and events. The real science and power lie in the correlation between the two, allowing for the most dynamic and proactive security posture an organization can obtain.
No matter what the solution is, though, there will be some cost involved and there will be some effort required to build and operate a successful open source threat intelligence monitoring platform.