OSINT for Digital Forensics
Crime is constantly evolving in today’s digitally interconnected world. The complexity of digital evidence and the volume of data investigators are collecting from myriad data sources are overwhelming police agencies.
According to some estimates, digital evidence is a factor in about 90 percent of criminal cases. Evidence is being retrieved from mobile devices, computers, digital video recorders (DVRs), and Internet of Things (IoT) devices.
Police agencies realize there must be a better way to extract data from devices, process the data, and ensure evidence gets to the right stakeholders and investigators in a timely manner.
Digital forensic investigation and AI-powered open-source intelligence (OSINT) are playing crucial roles in modernizing police investigations. They are helping to save lives and bring criminals to justice by increasing the police’s ability to locate missing persons, find victims of human trafficking, identify criminals in fraud and extortion cases, and identify the sources of illicit drugs, as well as assisting in solving other active or ongoing investigations.
Data Extraction from a Digital Forensics Perspective
To effectively recover, analyze, and report on digital evidence, agencies need to adopt a toolbox approach that can validate results. The truth is in the data. Being able to validate what a victim, witness, or suspect says happened and having the evidence that supports those statements are critical.
Digital evidence must be delivered to investigators in a timely manner, giving them the actionable intelligence needed to follow new leads. Agencies that are adopting the latest digital forensic approaches are receiving actionable insights within 24 to 36 hours of a serious crime. Historically, evidence has been on a desktop computer, and analysts used a desktop forensics solution to retrieve and review evidence. Now, with capabilities such as automation and orchestration, examiners can automatically collect, process, analyze, and share digital forensic evidence.
Data Extraction from an OSINT Perspective
OSINT, on the other hand, provides search and analysis of publicly available sources across the open, deep, and dark web, as well as integrated data sources, allowing the police to generate actionable insights. OSINT is a multi-method approach to collecting and analyzing data from publicly available sources, and to making decisions about that data, for use in an intelligence context.
The internet is loaded with information, some real, some not so real. The information is not always verified or vetted. A name search could return multiple results, none of which may be the person an investigator is looking for. For that reason, an OSINT analyst looks for unique identifiers.
Some people might think a phone number is a unique identifier, but phone numbers can be changed or reassigned. An email address is usually a more consistent identifier because it tends to stay with a person for a long time. People still have old email accounts that they have not used in years, but nobody else is using them either. Unique identifiers are important because they link back to the forensics, typically information downloaded from a phone or device during an investigation. That information can include phone numbers, contact names, nicknames, or a suspect’s “street name.” Sometimes those pieces of information corroborate each other and help investigators identify a target or suspect.
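The linking step described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the function names, the (kind, value) tuple layout, and the sample values are all hypothetical, and the phone rule is deliberately naive (it keeps digits only and ignores country codes).

```python
import re


def normalize_email(value: str) -> str:
    """Trim and lower-case an email address so trivial variants compare equal."""
    return value.strip().lower()


def normalize_phone(value: str) -> str:
    """Keep digits only. Naive on purpose: country codes are not handled."""
    return re.sub(r"\D", "", value)


def shared_identifiers(device_items, osint_items):
    """Return identifiers present in both the device extraction and the OSINT results.

    Each item is a (kind, value) tuple where kind is "email" or "phone".
    """
    normalizers = {"email": normalize_email, "phone": normalize_phone}

    def canonical(items):
        return {
            (kind, normalizers[kind](value))
            for kind, value in items
            if kind in normalizers
        }

    return canonical(device_items) & canonical(osint_items)


# Hypothetical identifiers pulled from a phone extraction.
device_items = [
    ("email", " J.Doe@Example.com "),
    ("phone", "(555) 010-0199"),
    ("phone", "555-010-0142"),
]
# Hypothetical identifiers collected from public sources.
osint_items = [
    ("email", "j.doe@example.com"),
    ("phone", "555.010.0199"),
    ("email", "someone.else@example.org"),
]

matches = shared_identifiers(device_items, osint_items)
for kind, value in sorted(matches):
    print(kind, value)
```

An overlap found this way is only a lead. It still has to be verified by an analyst before it is treated as a link between a device and a person.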
All of this information must be verified. That is the function of OSINT. In some cases, an investigator or analyst searches the web and thinks they are on the right track. However, there are so many digital breadcrumbs on the web, and as analysts collect information from open-source platforms, they must make sure they are collecting the right information. One wrong move in the investigation can derail the case in court.
How OSINT Helps Digital Forensics
Gathering Information
OSINT can help digital forensics investigators gather information about suspects, their activities, and their contacts. For example, social media profiles and online forums can provide valuable information about a suspect’s interests, hobbies, and other personal information that may be relevant to an investigation.
Reconstructing Events
OSINT can help digital forensics investigators reconstruct events related to a crime. For example, news articles and other public sources of information can provide additional details and context that can be used to support an investigation.
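One simple way to picture event reconstruction is a merged timeline: events recovered from a device and events found in public sources are placed on a single time axis. The sketch below assumes each event is a dictionary with hypothetical "when", "source", and "what" keys and that all timestamps have already been converted to UTC; real tools use richer formats.

```python
from datetime import datetime, timezone


def utc(year, month, day, hour, minute):
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc)


def build_timeline(*sources):
    """Merge event lists from several sources into one list ordered by time."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event["when"])


# Hypothetical events recovered from a device.
device_events = [
    {"when": utc(2024, 3, 1, 21, 15), "source": "device", "what": "outgoing call logged"},
    {"when": utc(2024, 3, 1, 22, 40), "source": "device", "what": "photo created"},
]
# Hypothetical events found in public sources.
osint_events = [
    {"when": utc(2024, 3, 1, 22, 5), "source": "osint", "what": "public social media post"},
    {"when": utc(2024, 3, 2, 8, 0), "source": "osint", "what": "news article published"},
]

timeline = build_timeline(device_events, osint_events)
for event in timeline:
    print(event["when"].isoformat(), event["source"], event["what"])
```

Keeping the source label on every entry lets a reviewer see at a glance which parts of the sequence rest on device evidence and which rest on public information.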
Verifying Information
OSINT can help digital forensics investigators verify information that has been obtained from other sources, such as digital devices. OSINT can help investigators determine if a suspect has a history of criminal activity or if they have made any public statements about their activities.
Finding Hidden Data
OSINT can help digital forensics investigators find hidden data that may be relevant to an investigation. Digital forensics investigators can use OSINT to search for public information about a suspect’s activities that may not have been found through a traditional digital forensics investigation.
Top Open Source Digital Forensics Tools
Paid Tools:
Knowlesys Intelligence System
Knowlesys offers a comprehensive solution for digital forensics by helping investigators link real-world individuals to their digital identities across various platforms.
Key capabilities include Digital Profiling, where social media and dark web content are analyzed to create a digital footprint of a subject, and Image Classification via AI, which streamlines the identification of critical visual evidence. Additionally, Knowlesys provides Cryptocurrency Tracking, Social Network Analysis, and Cloud Forensics, enabling the investigation of encrypted data and activities on deep and dark web platforms, making it invaluable for modern forensics.
HashKeeper
HashKeeper is a database application that is primarily of value to examiners who regularly conduct forensic examinations of computers.
HashKeeper uses the MD5 hashing algorithm to establish unique numeric identifiers (hash values) for "known-good" and "known-bad" files.
HashKeeper is designed to reduce the time required to scan digital media files. If the examiner defines the document as a known-good file, the examiner does not need to repeat the analysis.
HashKeeper compares the hash of a file on a computer system with the hashes of known files. If the value matches a file that is known to be good, the examiner can say with considerable certainty that the file on the computer system is also good and does not need to be checked.
If the value matches a file that is known to be bad, the examiner can say with considerable certainty that the file on the system being scanned is bad and needs further review.
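The known-file lookup described above can be illustrated with Python's standard hashlib module. This is a sketch of the general technique only, not HashKeeper's actual implementation or data format: the hash sets, file names, and the classify function are hypothetical.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(path, known_good, known_bad):
    """Label a file by looking its MD5 up in two sets of known hashes."""
    file_hash = md5_of_file(path)
    if file_hash in known_bad:
        return "known-bad"
    if file_hash in known_good:
        return "known-good"
    return "unknown"


# Hypothetical hash sets; a real set would hold many thousands of entries.
known_good = {hashlib.md5(b"trusted system file").hexdigest()}
known_bad = {hashlib.md5(b"flagged content").hexdigest()}

samples = {
    "a.bin": b"trusted system file",
    "b.bin": b"flagged content",
    "c.bin": b"never seen before",
}

results = {}
with tempfile.TemporaryDirectory() as folder:
    for name, content in samples.items():
        path = os.path.join(folder, name)
        with open(path, "wb") as handle:
            handle.write(content)
        results[name] = classify(path, known_good, known_bad)

print(results)
```

Only the "unknown" files need a manual look, which is where the time saving comes from. Because practical MD5 collisions are known, hash sets are often kept with a stronger algorithm such as SHA-256 alongside MD5; swapping hashlib.md5 for hashlib.sha256 is the only change the sketch would need.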
Free Tools:
The Sleuth Kit
The Sleuth Kit (TSK) offers a variety of Windows- and Unix-based utilities and libraries for data extraction and forensic analysis. It forms the basis of the well-known tool Autopsy, a graphical user interface (GUI) for the command-line utilities packaged with TSK. It allows users to extract data from storage devices and disk drives.
TSK is an open source collection whose components are released under the GNU General Public License, the Common Public License, and the IBM Public License. The software is actively developed and supported by an open source community.
The Sleuth Kit is compatible with NTFS, FAT/ExFAT, UFS 1/2, Ext2, Ext3, Ext4, HFS, ISO 9660, and YAFFS2 file systems, whether standalone or inside disk images in raw (dd), Expert Witness, or Advanced Forensic Format (AFF). You can use the Sleuth Kit to examine most Microsoft Windows and Apple Macintosh systems, as well as many Linux and certain UNIX computers.
The Sleuth Kit can be used through its included command-line tools or as a library built into other digital forensics tools such as log2timeline/plaso and Autopsy.
Autopsy
Autopsy is software that makes it easier to deploy many of the open source programs and plug-ins used in TSK. The GUI displays forensic search results from the underlying volumes, so investigators can more easily flag relevant pieces of data.
Autopsy file parsing capabilities include:
· Hashes all files and decompresses standard archives such as ZIP and JAR.
· Extracts EXIF values, reads major file systems such as FAT, ExFAT, NTFS, HFS+, Ext2/Ext3/Ext4, and YAFFS2 for analysis, and adds keywords to an index.
· Parse and catalog some specialized file formats such as email formats and contact files.
Users can search for recent activity in this index file and create a summarizing report in PDF or HTML format. If time is limited, users can use rules to enable analysis of the most important files first. In Autopsy, some images of these files may be saved in virtual hard disk format.
Digital Forensics Framework
Digital Forensics Framework (DFF) is an open source computer forensics solution. Professionals and non-experts use it to collect, store, and disclose digital evidence without compromising systems or data.
DFF provides a GUI developed with PyQt and traditional tree views, as well as a rich command line interface. Features such as recursive views, tags, real-time searches, and bookmarks are available.
DFF comes with common shell features such as completion, task management, and keyboard shortcuts. DFF can automate repetitive tasks by running batch scripts at startup. Advanced users and developers can script investigations using DFF directly in the Python interpreter.
In addition to source package and binary installers for Linux and Windows, DFF is also available in popular operating system distributions including Debian, Fedora, and Ubuntu.
Open Computer Forensics Architecture (OCFA)
Open Computer Forensics Architecture (OCFA) is an open source computer forensics framework for analyzing digital media in digital forensics laboratory environments. This framework was established by the Dutch police.
OCFA is a backend for the Linux platform. It uses a PostgreSQL database for datastores, custom content addressable stores or CarvFS-based datastores, and Lucene indexes. Due to licensing issues, the OCFA frontend is not open to the public.
The framework can integrate with additional open source forensics tools and includes modules for TSK, Scalpel, libmagic, Photorec, GNU Privacy Guard, exiftags, objdump, zip, 7-zip, gzip, bzip2, tar, rar, antiword, mbx2mbox, and qemu-img. OCFA can be extended with modules written in C++ or Java.
Bulk Extractor
Bulk Extractor is an information extraction solution that scans files, directories, or disk images and extracts data without parsing file systems or file system structures. This allows parallel access to various parts of the disk, which is faster than regular tools.
Another advantage of Bulk Extractor is its ability to handle almost any format of digital media, including hard drives, optical drives, solid-state drives, camera cards, and smartphones. Its latest version can perform social network forensics, analyzing digital evidence to extract information such as addresses, credit card numbers, and URLs.
Other features include creating histograms of frequently used items such as email addresses and compiling word lists from the data it finds. The word lists can be useful for password cracking.
All the information extracted can be processed manually or using one of four automated tools. One has a built-in contextual stop list (i.e., search terms marked by investigators) that excludes human error from digital forensic investigations. The software is available free of charge for Windows and Linux systems.
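The core idea, scanning raw bytes for recognizable features without interpreting any file system and then counting how often each one appears, can be sketched with regular expressions. This is only an illustration of the approach, not Bulk Extractor's own scanners: the patterns are simplified and the sample buffer is invented.

```python
import re
from collections import Counter

# Simplified byte-level patterns; real scanners are far more thorough.
EMAIL_PATTERN = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
URL_PATTERN = re.compile(rb"https?://[^\s\"'<>\x00]+")


def scan_bytes(data):
    """Return histograms of email addresses and URLs found in a raw byte buffer."""
    emails = Counter(
        match.group().decode("ascii").lower()
        for match in EMAIL_PATTERN.finditer(data)
    )
    urls = Counter(
        match.group().decode("ascii", errors="replace")
        for match in URL_PATTERN.finditer(data)
    )
    return emails, urls


# Invented buffer standing in for a chunk of a disk image.
raw = (
    b"\x00\x01junk\x00alice@example.com\x00\xff"
    b"visit http://example.org/page now\x00"
    b"ALICE@example.com;bob@example.net\x00"
)

emails, urls = scan_bytes(raw)
for address, count in emails.most_common():
    print(count, address)
for url, count in urls.most_common():
    print(count, url)
```

Because the scan does not depend on file boundaries, a large image could be split into chunks and each chunk scanned independently, which is what makes parallel processing possible. A real implementation would also need to handle features that straddle chunk boundaries.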
Computer-Aided Investigative Environment
Computer-Aided Investigative Environment (CAINE) is an open source, Ubuntu-based Linux distribution created by Italian developers for digital forensics. CAINE provides interoperable software that integrates with existing security tools to provide a user-friendly GUI. Because it is open source, organizations can modify and redistribute it as needed.
CAINE integrates software tools into modules through powerful scripting in a graphical interface environment. Its production environment is designed to provide forensic professionals with all the tools they need to conduct digital forensic research processes (storage, collection, inspection, and analysis).
Because CAINE is a live Linux distribution, it can be booted from removable media such as a flash drive or CD and run from memory. It can also be installed on a physical or virtual machine. In live mode, CAINE can work on data storage objects without booting the operating system installed on the machine being examined.
The latest version (11.0) can boot with UEFI, UEFI with Secure Boot, and legacy BIOS, enabling the use of CAINE on systems that run legacy operating systems (e.g., Windows NT) as well as newer platforms (Linux, Windows 10).
SANS Investigative Forensics Toolkit
SANS Investigative Forensics Toolkit (SIFT) is a suite of open source forensics and incident response technologies designed to conduct in-depth investigations in various digital environments. The toolkit examines original disks and multiple file types in a secure, read-only manner to preserve the evidence it finds.
SIFT has high flexibility and compatibility with raw evidence formats, expert witness format (E01), and advanced forensics format (AFF). It is based on Ubuntu and includes several individual tools (including some mentioned here) that forensic investigators can use for free. SIFT is updated regularly.
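The value of read-only handling can be demonstrated with a general integrity check: hash the evidence image before and after an analysis step and confirm that the two values match. The sketch below shows that common practice, not how SIFT works internally. The file name, the placeholder analysis function, and the sample bytes are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of_image(path, chunk_size=1024 * 1024):
    """Hash an image file in chunks, opening it for reading only."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def analyse_read_only(path):
    """Placeholder analysis step: counts bytes without writing to the file."""
    total = 0
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(4096), b""):
            total += len(chunk)
    return total


with tempfile.TemporaryDirectory() as folder:
    # Invented stand-in for an acquired disk image.
    image_path = os.path.join(folder, "evidence.dd")
    with open(image_path, "wb") as handle:
        handle.write(b"\x00" * 2048 + b"sample evidence bytes")

    hash_before = sha256_of_image(image_path)
    size_seen = analyse_read_only(image_path)
    hash_after = sha256_of_image(image_path)
    unchanged = hash_before == hash_after

print(size_seen, unchanged)
```

Opening a file in "rb" mode only stops this script from writing to it. In practice, examiners also rely on hardware write blockers or read-only mounts so that no other process can alter the original media.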
Conclusion
OSINT has increasingly become an extremely valuable tool for digital forensics investigators. By gathering information, reconstructing events, verifying information, and finding hidden data, OSINT can help digital forensics investigators more effectively support their investigations and bring criminals to justice.