What Is Open Source Intelligence and How Is It Used?
What Is Open Source Intelligence?
Before we look at common sources and applications of open source intelligence, it’s important to understand what it actually is.
According to U.S. public law, open source intelligence:
- Is produced from publicly available information
- Is collected, analyzed, and disseminated in a timely manner to an appropriate audience
- Addresses a specific intelligence requirement
The important phrase to focus on here is “publicly available.” The term “open source” refers specifically to information that is available for public consumption. If any specialist skills, tools, or techniques are required to access a piece of information, it can’t reasonably be considered open source.
Crucially, open source information is not limited to what you can find using the major search engines. Web pages and other resources that can be found using Google certainly constitute massive sources of open source information, but they are far from the only sources.
For starters, a huge proportion of the internet (over 99 percent, according to former Google CEO Eric Schmidt) cannot be found using the major search engines. This so-called “deep web” is a mass of websites, databases, files, and more that (for a variety of reasons, including the presence of login pages or paywalls) cannot be indexed by Google, Bing, Yahoo, or any other search engine you care to think of. Despite this, much of the content of the deep web can be considered open source because it’s readily available to the public.
Information can also be considered open source if it is:
- Published or broadcast for a public audience (for example, news media content)
- Available to the public by request (for example, census data)
- Available to the public by subscription or purchase (for example, industry journals)
- Visible or audible to any casual observer
- Made available at a meeting open to the public
- Obtained by visiting any place or attending any event that is open to the public
At this point, you’re probably thinking, “Man, that’s a lot of information …” And you’re right. We’re talking about a truly unimaginable quantity of information that is growing at a far higher rate than anybody could ever hope to keep up with. Even if we narrow the field down to a single source of information (let’s say Twitter), we’re forced to cope with hundreds of millions of new data points every day.
This, as you’ve probably gathered, is the inherent trade-off of open source intelligence: the breadth of available information is what makes it valuable, and it is also what makes it so hard to process.
How Is Open Source Intelligence Used?
Now that we’ve covered the basics of open source intelligence, we can look at how it is commonly used for cybersecurity. There are two common use cases:
1. Ethical Hacking and Penetration Testing
Security professionals use open source intelligence to identify potential weaknesses in friendly networks so that they can be remediated before they are exploited by threat actors. Commonly found weaknesses include:
- Accidental leaks of sensitive information, such as through social media
- Open ports or unsecured internet-connected devices
- Unpatched software, such as websites running old versions of common CMS products
- Leaked or exposed assets, such as proprietary code on pastebins
2. Identifying External Threats
As we’ve discussed many times in the past, the internet is an excellent source of insights into an organization’s most pressing threats. From identifying which new vulnerabilities are being actively exploited to intercepting threat actor “chatter” about an upcoming attack, open source intelligence enables security professionals to prioritize their time and resources to address the most significant current threats.
In most cases, this type of work requires an analyst to identify and correlate multiple data points to validate a threat before action is taken. For example, while a single threatening tweet may not be cause for concern, that same tweet would be viewed in a different light if it were tied to a threat group known to be active in a specific industry.
One of the most important things to understand about open source intelligence is that it is often used in combination with other intelligence subtypes. Intelligence from closed sources such as internal telemetry, closed dark web communities, and external intelligence-sharing communities is regularly used to filter and verify open source intelligence.
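The correlation step described above (a single tweet gains weight once it is tied to a known threat group active in your industry) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Mention` record, the `KNOWN_ACTORS` table standing in for closed-source context, the handle and group names, and the three outcome labels are all hypothetical, and a real analyst workflow would weigh far more signals than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One open source data point, for example a public post."""
    source: str
    author: str
    text: str


# Hypothetical closed-source context: handles tied to known groups and
# the industries those groups are known to target. Illustrative data only.
KNOWN_ACTORS = {
    "shadow_crew_example": {
        "group": "ExampleGroup",
        "industries": {"finance", "retail"},
    },
}


def assess(mention, our_industry, known_actors=KNOWN_ACTORS):
    """Correlate a mention with known-actor context before acting on it.

    Returns "escalate" if the author is tied to a group active in our
    industry, "monitor" if tied to a group active elsewhere, and "low"
    if there is no known link.
    """
    actor = known_actors.get(mention.author.lower())
    if actor is None:
        return "low"
    if our_industry in actor["industries"]:
        return "escalate"
    return "monitor"
```

The point of the sketch is the ordering: the open source data point alone never triggers action; it is only escalated after being matched against independent context.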
Open Source Intelligence Techniques
Now that we’ve covered the uses of open source intelligence, it’s time to look at some of the techniques that can be used to gather and process open source information.
First, you must have a clear strategy and framework in place for acquiring and using open source intelligence. It’s not recommended to approach open source intelligence from the perspective of finding anything and everything that might be interesting or useful. As we’ve already discussed, the sheer volume of information available through open sources will simply overwhelm you. Instead, you must know exactly what you’re trying to achieve (for example, to identify and remediate weaknesses in your network) and focus your energies specifically on accomplishing those goals.
Second, you must identify a set of tools and techniques for collecting and processing open source information. Once again, the volume of information available is much too great for manual processes to be even slightly effective.
Broadly speaking, collection of open source intelligence falls into two categories: passive collection and active collection.
Passive collection often involves the use of threat intelligence platforms (TIPs) to combine a variety of threat feeds into a single, easily accessible location. While this is a major step up from manual intelligence harvesting, the risk of information overload is still significant. More advanced threat intelligence solutions like Recorded Future solve this problem by using artificial intelligence, machine learning, and natural language processing to automate the process of prioritizing and dismissing alerts based on an organization’s specific needs. In a similar manner, organized threat groups often use botnets to collect valuable information using techniques like traffic sniffing and keylogging.
On the other hand, active collection is the use of a variety of techniques to search for specific insights or information.
For security professionals, this type of collection work is usually done for one of two reasons:
- A passively collected alert has highlighted a potential threat and further insight is required.
- The focus of an intelligence gathering exercise is very specific, such as a penetration testing exercise.
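To make the passive collection idea above more concrete, here is a rough Python sketch of combining several threat feeds into one place and then filtering alerts against an organization’s specific needs. It is only an illustration under stated assumptions: the `Alert` record, the feed contents, and the function names are hypothetical, and the simple keyword count is a deliberate stand-in for the machine learning and natural language processing that commercial platforms use, not a description of how any real product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One entry from a threat feed (hypothetical shape)."""
    feed: str
    indicator: str
    description: str


def merge_feeds(*feeds):
    """Combine feeds into one list, keeping the first alert per indicator."""
    seen = set()
    merged = []
    for feed in feeds:
        for alert in feed:
            key = alert.indicator.lower()
            if key not in seen:
                seen.add(key)
                merged.append(alert)
    return merged


def score(alert, keywords):
    """Count how many organization-specific keywords the alert mentions."""
    text = alert.description.lower()
    return sum(1 for kw in keywords if kw.lower() in text)


def prioritize(alerts, keywords, threshold=1):
    """Dismiss alerts scoring below the threshold; rank the rest, highest first."""
    scored = [(score(a, keywords), a) for a in alerts]
    relevant = [(s, a) for s, a in scored if s >= threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in relevant]
```

An alert that survives `prioritize` would be a natural candidate for the first reason for active collection listed above: a passively collected alert that needs further insight.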