Digital Employees for Operational Efficiency
Digital employees draw on six forms of intelligence to enhance operational efficiency within an organization: SIGINT (Signals Intelligence), IMINT (Imagery Intelligence), MASINT (Measurement and Signature Intelligence), HUMINT (Human Intelligence), OSINT (Open Source Intelligence), and GEOINT (Geospatial Intelligence). SIGINT involves the interception and analysis of signals, such as communications or electronic signals, which digital employees can use to monitor and secure communications within the organization, detect unauthorized access, and ensure compliance with regulatory requirements. IMINT, through satellite and aerial imagery, allows digital employees to monitor physical assets and infrastructure, identify potential risks, and optimize logistics and supply chain management by providing real-time visual data.
In addition, digital employees utilize MASINT, which involves the analysis of data derived from the distinctive characteristics of targets, to monitor environmental and operational conditions, predict equipment failures, and enhance preventive maintenance strategies. HUMINT, collected through interpersonal interactions and observations, enables digital employees to gather insights from customer feedback, employee surveys, and market research, thereby improving customer service and employee engagement strategies. OSINT, gathered from publicly available sources, helps digital employees track market trends, competitor activities, and regulatory changes, providing the organization with valuable information to stay competitive and compliant. GEOINT, derived from geographic information systems and geospatial data, supports strategic planning and decision-making by optimizing location-based services, asset tracking, and disaster response initiatives. By integrating these diverse forms of intelligence, digital employees provide a comprehensive understanding of both the internal and external operational environment, driving efficiency and informed decision-making across the organization.
Types of Intelligence
The intelligence cycle is a process of collecting information and developing it into intelligence for use by intelligence customers. The steps in the process are direction, collection, processing, exploitation, and dissemination.
Intelligence Community (IC) products can either be based on a single type of collection or “all-source,” that is, based upon all available types of collection. Intelligence products also can be produced by one IC element or coordinated with other IC elements, and delivered to intelligence customers in various formats, including papers, digital media, briefings, maps, graphics, videos, and other distribution methods.
There are six basic intelligence sources, or collection disciplines:
- SIGINT—Signals intelligence is derived from signal intercepts comprising, however transmitted and either individually or in combination, all communications intelligence (COMINT), electronic intelligence (ELINT), and foreign instrumentation signals intelligence (FISINT). The National Security Agency (NSA) is responsible for collecting, processing, and reporting SIGINT. The National SIGINT Committee within NSA advises the Director, NSA, and the DNI on SIGINT policy issues and manages the SIGINT requirements system.
- IMINT—Imagery Intelligence includes representations of objects reproduced electronically or by optical means on film, electronic display devices, or other media. Imagery can be derived from visual photography, radar sensors, and electro-optics. The National Geospatial-Intelligence Agency (NGA) is the manager for all imagery intelligence activities, both classified and unclassified, within the government, including requirements, collection, processing, exploitation, dissemination, archiving, and retrieval.
- MASINT—Measurement and Signature Intelligence is information produced by quantitative and qualitative analysis of physical attributes of targets and events to characterize, locate, and identify them. MASINT exploits a variety of phenomenologies, from a variety of sensors and platforms, to support signature development and analysis, to perform technical analysis, and to detect, characterize, locate, and identify targets and events. MASINT is derived from specialized, technically derived measurements of physical phenomena intrinsic to an object or event, and it includes the use of quantitative signatures to interpret the data. The Director of the Defense Intelligence Agency (DIA) is both the “Intelligence Community Functional Manager for MASINT” and the “DOD MASINT Manager.” The National MASINT Office (NMO) manages and executes MASINT services of common concern and related activities for the D/DIA in response to National and Department of Defense requirements.
- HUMINT—Human intelligence is derived from human sources. To the public, HUMINT remains synonymous with espionage and clandestine activities; however, most HUMINT collection is performed by overt collectors such as strategic debriefers and military attachés. It is the oldest method for collecting information, and until the technical revolution of the mid- to late 20th century, it was the primary source of intelligence.
- OSINT—Open-Source Intelligence is publicly available information appearing in print or electronic form including radio, television, newspapers, journals, the Internet, commercial databases, and videos, graphics, and drawings. While open-source collection responsibilities are broadly distributed through the IC, the major collectors are the DNI's Open Source Center (OSC) and the National Air and Space Intelligence Center (NASIC).
- GEOINT—Geospatial Intelligence is the analysis and visual representation of security-related activities on the Earth. It is produced through an integration of imagery, imagery intelligence, and geospatial information.
What is Open Source Intelligence?
Before we look at common sources and applications of open source intelligence, it’s important to understand what it actually is. According to U.S. public law, open source intelligence:
- Is produced from publicly available information
- Is collected, analyzed, and disseminated in a timely manner to an appropriate audience
- Addresses a specific intelligence requirement
The important phrase to focus on here is “publicly available.” The term “open source” refers specifically to information that is available for public consumption. If any specialist skills, tools, or techniques are required to access a piece of information, it can’t reasonably be considered open source. Crucially, open source information is not limited to what you can find using the major search engines. Web pages and other resources that can be found using Google certainly constitute massive sources of open source information, but they are far from the only sources.
For starters, a huge proportion of the internet (over 99 percent, according to former Google CEO Eric Schmidt) cannot be found using the major search engines. This so-called “deep web” is a mass of websites, databases, files, and more that (for a variety of reasons, including the presence of login pages or paywalls) cannot be indexed by Google, Bing, Yahoo, or any other search engine you care to think of. Despite this, much of the content of the deep web can be considered open source because it’s readily available to the public.
In addition, there’s plenty of freely accessible information online that can be found using online tools other than traditional search engines. We’ll look at this more later on, but as a simple example, tools like Shodan and Censys can be used to find IP addresses, networks, open ports, webcams, printers, and pretty much anything else that’s connected to the internet.
Information can also be considered open source if it is:
- Published or broadcast for a public audience (for example, news media content)
- Available to the public by request (for example, census data)
- Available to the public by subscription or purchase (for example, industry journals)
- Could be seen or heard by any casual observer
- Made available at a meeting open to the public
- Obtained by visiting any place or attending any event that is open to the public
At this point, you’re probably thinking, “Man, that’s a lot of information …”
And you’re right. We’re talking about a truly unimaginable quantity of information that is growing at a far higher rate than anybody could ever hope to keep up with. Even if we narrow the field down to a single source of information — let’s say Twitter — we’re forced to cope with hundreds of millions of new data points every day. This, as you’ve probably gathered, is the inherent trade-off of open source intelligence.
As an analyst, having such a vast quantity of information available to you is both a blessing and a curse. On one hand, you have access to almost anything you might need — but on the other hand, you have to be able to actually find it in a never-ending torrent of data.
How Is Open Source Intelligence Used?
Now that we’ve covered the basics of open source intelligence, we can look at how it is commonly used for cybersecurity. There are two common use cases:
1. Ethical Hacking and Penetration Testing
Security professionals use open source intelligence to identify potential weaknesses in friendly networks so that they can be remediated before they are exploited by threat actors. Commonly found weaknesses include:
- Accidental leaks of sensitive information, such as through social media
- Open ports or unsecured internet-connected devices
- Unpatched software, such as websites running old versions of common CMS products
- Leaked or exposed assets, such as proprietary code on pastebins
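As a rough illustration of how a defender might check public text (for example, the body of a paste) for the kinds of leaks listed above, here is a minimal Python sketch. The pattern set is illustrative only, the domain corp.example.com is a hypothetical stand-in for an organization's internal naming scheme, and scan_for_leaks is a name invented for this example. A real scan would rely on a maintained rule set rather than three regular expressions.

```python
import re

# Illustrative patterns only (assumption): a real scan would use a maintained rule set.
LEAK_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 uppercase
    # letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-style private key headers, e.g. "-----BEGIN RSA PRIVATE KEY-----".
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    # Hypothetical internal domain; replace with your organization's own.
    "internal_hostname": re.compile(r"\b[\w-]+\.corp\.example\.com\b"),
}


def scan_for_leaks(text):
    """Return (pattern_name, matched_text) pairs found in a block of public text."""
    findings = []
    for name, pattern in LEAK_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, match.group(0)))
    return findings


if __name__ == "__main__":
    paste = "db host: build01.corp.example.com\nkey = AKIAABCDEFGHIJKLMNOP\n"
    for name, value in scan_for_leaks(paste):
        print(name, value)
```

Any hit from a scan like this is only a lead. An analyst would still need to confirm that the asset really belongs to the organization before treating it as a leak.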
2. Identifying External Threats
As we’ve discussed many times in the past, the internet is an excellent source of insights into an organization’s most pressing threats. From identifying which new vulnerabilities are being actively exploited to intercepting threat actor “chatter” about an upcoming attack, open source intelligence enables security professionals to prioritize their time and resources to address the most significant current threats.
In most cases, this type of work requires an analyst to identify and correlate multiple data points to validate a threat before action is taken. For example, while a single threatening tweet may not be cause for concern, that same tweet would be viewed in a different light if it were tied to a threat group known to be active in a specific industry.
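The tweet example can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the group names, the industries they target, the weights, and the action threshold are invented, and real correlation work involves far more context than a simple additive score.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reference data: threat groups and the industries they are known to target.
KNOWN_GROUPS = {
    "group-a": {"finance", "healthcare"},
    "group-b": {"energy"},
}


@dataclass(frozen=True)
class Observation:
    source: str                  # where the data point came from, e.g. "social_media"
    text: str                    # the content that was observed
    actor: Optional[str] = None  # attributed threat group, if any


def threat_score(obs, our_industry, corroborating_sources=0):
    """Score one observation; the weights are arbitrary and purely illustrative."""
    score = 1  # a lone mention carries little weight on its own
    industries = KNOWN_GROUPS.get(obs.actor, set())
    if industries:
        score += 2  # tied to a known threat group
    if our_industry in industries:
        score += 3  # that group is active in our industry
    score += min(corroborating_sources, 3)  # independent confirmation, capped at 3
    return score


def needs_action(score, threshold=5):
    """Decide whether the correlated score justifies analyst follow-up."""
    return score >= threshold


if __name__ == "__main__":
    lone = Observation("social_media", "threatening post")
    tied = Observation("social_media", "threatening post", actor="group-a")
    print(needs_action(threat_score(lone, "finance")))  # lone tweet
    print(needs_action(threat_score(tied, "finance")))  # same tweet, known group
```

With these made-up weights, the lone post stays below the threshold, while the same post attributed to a group active in the organization's industry crosses it.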
One of the most important things to understand about open source intelligence is that it is often used in combination with other intelligence subtypes. Intelligence from closed sources such as internal telemetry, closed dark web communities, and external intelligence-sharing communities is regularly used to filter and verify open source intelligence. There are a variety of tools available to help analysts perform these functions, which we’ll look at a bit later on.
The Dark Side of Open Source Intelligence
At this point, it’s time to address the second major issue with open source intelligence: if something is readily available to intelligence analysts, it’s also readily available to threat actors.
Threat actors use open source intelligence tools and techniques to identify potential targets and exploit weaknesses in target networks. Once a vulnerability is identified, it is often an extremely quick and simple process to exploit it and achieve a variety of malicious objectives.
This process is the main reason why so many small and medium-sized enterprises get hacked each year. It isn’t because threat groups specifically take an interest in them, but rather because vulnerabilities in their network or website architecture are found using simple open source intelligence techniques. In short, they are easy targets.
And open source intelligence doesn’t only enable technical attacks on IT systems and networks. Threat actors also seek out information about individuals and organizations that can be used to inform sophisticated social engineering campaigns using phishing (email), vishing (phone or voicemail), and SMiShing (SMS). Often, seemingly innocuous information shared through social networks and blogs can be used to develop highly convincing social engineering campaigns, which in turn are used to trick well-meaning users into compromising their organization’s network or assets.
This is why using open source intelligence for security purposes is so important: it gives you an opportunity to find and fix weaknesses in your organization’s network and remove sensitive information before a threat actor uses the same tools and techniques to exploit them.
Open Source Intelligence Techniques
Now that we’ve covered the uses of open source intelligence (both good and bad), it’s time to look at some of the techniques that can be used to gather and process open source information.
First, you must have a clear strategy and framework in place for acquiring and using open source intelligence. It’s not recommended to approach open source intelligence from the perspective of finding anything and everything that might be interesting or useful — as we’ve already discussed, the sheer volume of information available through open sources will simply overwhelm you. Instead, you must know exactly what you’re trying to achieve — for example, to identify and remediate weaknesses in your network — and focus your energies specifically on accomplishing those goals.
Second, you must identify a set of tools and techniques for collecting and processing open source information. Once again, the volume of information available is much too great for manual processes to be even slightly effective. Broadly speaking, collection of open source intelligence falls into two categories: passive collection and active collection.
Passive collection often involves the use of threat intelligence platforms (TIPs) to combine a variety of threat feeds into a single, easily accessible location. While this is a major step up from manual intelligence harvesting, the risk of information overload is still significant. More advanced threat intelligence solutions solve this problem by using artificial intelligence, machine learning, and natural language processing to automate the process of prioritizing and dismissing alerts based on an organization’s specific needs. In a similar manner, organized threat groups often use botnets to collect valuable information using techniques like traffic sniffing and keylogging.
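To make the idea of combining feeds and filtering alerts more concrete, here is a small Python sketch. It is not how any particular threat intelligence platform works: the feed names, the entry format (dictionaries with "indicator" and "tags" keys), and the scoring rule are all assumptions. Simple tag matching stands in for the machine learning and natural language processing mentioned above.

```python
def merge_feeds(feeds):
    """Combine several feeds into one dict keyed by indicator.

    `feeds` maps a feed name to a list of entries shaped like
    {"indicator": str, "tags": [str, ...]} (a hypothetical format).
    """
    merged = {}
    for feed_name, entries in feeds.items():
        for entry in entries:
            record = merged.setdefault(
                entry["indicator"],
                {"indicator": entry["indicator"], "feeds": set(), "tags": set()},
            )
            record["feeds"].add(feed_name)
            record["tags"].update(entry.get("tags", ()))
    return merged


def prioritize(merged, relevant_tags, min_score=1):
    """Rank indicators by relevance to the organization and dismiss the rest.

    Score = 2 per tag the organization cares about, plus 1 for each
    additional feed that reported the same indicator.
    """
    scored = []
    for record in merged.values():
        score = 2 * len(record["tags"] & relevant_tags) + (len(record["feeds"]) - 1)
        if score >= min_score:
            scored.append((score, record["indicator"]))
    # Highest score first; ties are broken alphabetically for stable output.
    return [ind for _, ind in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


if __name__ == "__main__":
    feeds = {
        "feed_a": [
            {"indicator": "198.51.100.7", "tags": ["wordpress"]},
            {"indicator": "evil.example", "tags": ["phishing"]},
        ],
        "feed_b": [
            {"indicator": "198.51.100.7", "tags": ["botnet"]},
            {"indicator": "203.0.113.9", "tags": ["scada"]},
        ],
    }
    print(prioritize(merge_feeds(feeds), relevant_tags={"wordpress", "phishing"}))
```

In this toy run, the indicator reported by both feeds and tagged with something the organization cares about ranks first, and the indicator with no relevant tags is dismissed.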
On the other hand, active collection is the use of a variety of techniques to search for specific insights or information. For security professionals, this type of collection work is usually done for one of two reasons:
- A passively collected alert has highlighted a potential threat and further insight is required.
- The focus of an intelligence gathering exercise is very specific, such as a penetration testing exercise.
Open Source Intelligence Tools
To close things out, we’ll take a look at some of the most commonly used tools for collecting and processing open source intelligence. While there are many free and useful tools available to security professionals and threat actors alike, some of the most commonly used (and abused) open source intelligence tools are search engines like Google — just not as most of us know them.
As we’ve already explained, one of the biggest issues facing security professionals is the regularity with which normal, well-meaning users accidentally leave sensitive assets and information exposed to the internet. There are a series of advanced search functions called “Google dork” queries that can be used to identify the information and assets they expose. Google dork queries are based on the search operators used by IT professionals and hackers on a daily basis to conduct their work. Common examples include “filetype:”, which narrows search results to a specific file type, and “site:”, which only returns results from a specified website or domain.
The Public Intelligence website offers a more thorough rundown of Google dork queries, in which they give the following example search:
“sensitive but unclassified” filetype:pdf site:publicintelligence.net
If you type this search term into a search engine, it returns only PDF documents from the Public Intelligence website that contain the words “sensitive but unclassified” somewhere in the document text. As you can imagine, with hundreds of commands at their disposal, security professionals and threat actors can use similar techniques to search for almost anything.
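Queries like this are plain strings, so they are easy to assemble programmatically. The Python sketch below rebuilds the example query from its parts and URL-encodes it. The helper names build_dork and search_url are invented for this example, only the quoted-phrase, filetype:, and site: operators discussed above are covered, and the base search URL is an assumption that may not suit every search engine.

```python
from urllib.parse import quote_plus


def build_dork(phrase=None, filetype=None, site=None):
    """Assemble a search query from an exact phrase and the filetype:/site: operators."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')  # straight double quotes request an exact phrase
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search?q="):
    """URL-encode the query so it can be opened in a browser (base URL is an assumption)."""
    return base + quote_plus(query)


if __name__ == "__main__":
    query = build_dork("sensitive but unclassified", "pdf", "publicintelligence.net")
    print(query)
    print(search_url(query))
```

Note that the operators expect straight quotation marks. Curly quotes pasted from a formatted document may not be treated as an exact-phrase match.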
Moving beyond search engines, there are hundreds of tools that can be used to identify network weaknesses or exposed assets. They cover tasks such as:
- Metadata search
- Code search
- People and identity investigation
- Phone number research
- Email search and verification
- Linking social media accounts
- Image analysis
- Geospatial research and mapping
- Wireless network detection and packet analysis