Pandas Cheat Sheet for OSINT

OSINT (Open Source Intelligence) is the practice of collecting and analyzing publicly available data from various sources. The Pandas library in Python is well suited to handling the structured data that results.

Importing Libraries

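The first step is to import the necessary libraries. pandas and BeautifulSoup (the beautifulsoup4 package) are third-party installs; StringIO is part of the standard library: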

import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup
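
As a minimal sketch of why StringIO is imported: text that is already in memory (for example, a CSV response body) can be wrapped in a file-like object and passed to pd.read_csv. The sample data and column names here are hypothetical.

```python
import pandas as pd
from io import StringIO

# Hypothetical CSV text, standing in for data fetched from a public source
csv_text = """username,platform,followers
alice01,twitter,1200
bob_the_dev,github,340
carol.osint,twitter,87
"""

# Wrap the string in a file-like object so read_csv can parse it
df = pd.read_csv(StringIO(csv_text))

print(df.shape)
print(df.columns.tolist())
```

Passing the raw string directly to read_csv would be treated as a file path, which is why the StringIO wrapper is needed.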

Data Types

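Pandas supports several data types (dtypes), including int64 and float64 for numbers, bool, object (commonly used for text), datetime64 and timedelta64 for dates and durations, and category for values drawn from a small fixed set.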
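
The sketch below shows one way to inspect and convert column types. It assumes hypothetical records in which every value arrives as a string, which is common for scraped data; the column names are illustrative only.

```python
import pandas as pd

# Hypothetical records; every value starts out as a string
df = pd.DataFrame({
    "handle": ["alice01", "bob_the_dev"],
    "followers": ["1200", "340"],
    "first_seen": ["2023-01-15", "2023-02-20"],
})

# Convert columns to more specific dtypes
df["followers"] = df["followers"].astype("int64")
df["first_seen"] = pd.to_datetime(df["first_seen"])

# Show the dtype of each column after conversion
print(df.dtypes)
```

Converting early makes later steps such as sorting by date or summing counts behave as expected.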

Data Structures

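Pandas provides two primary data structures: Series, a one-dimensional labeled array, and DataFrame, a two-dimensional labeled table whose columns are Series.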
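
A short illustration of the two structures, using hypothetical sample values:

```python
import pandas as pd

# A Series: one-dimensional values with an index
ages = pd.Series([28, 24, 35], index=["John", "Anna", "Peter"], name="Age")

# A DataFrame: a two-dimensional table; each column is a Series
people = pd.DataFrame({
    "Name": ["John", "Anna", "Peter"],
    "Age": [28, 24, 35],
})

print(ages["Anna"])          # label-based lookup on a Series
print(type(people["Age"]))   # selecting one column returns a Series
print(people.shape)          # (rows, columns)
```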

Merging DataFrames

Merging DataFrames lets you combine records about the same entity from different sources:

import pandas as pd

# create DataFrames
df1 = pd.DataFrame({'Name': ['John', 'Anna', 'Peter'],
                   'Age': [28, 24, 35]})

df2 = pd.DataFrame({'Name': ['John', 'Anna', 'Linda'],
                   'City': ['New York', 'Paris', 'Berlin']})

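# merge on 'Name' with the default inner join: only names present in both
# DataFrames (John and Anna) are kept; Peter and Linda are dropped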
merged_df = pd.merge(df1, df2, on='Name')

print(merged_df)
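
When the unmatched records matter as much as the matched ones, an outer join is one option. The sketch below reuses the same two DataFrames and adds the how and indicator parameters of pd.merge; the variable names outer_df and unmatched are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['John', 'Anna', 'Peter'],
                    'Age': [28, 24, 35]})
df2 = pd.DataFrame({'Name': ['John', 'Anna', 'Linda'],
                    'City': ['New York', 'Paris', 'Berlin']})

# outer join keeps every Name from both frames; indicator adds a
# '_merge' column with 'both', 'left_only', or 'right_only'
outer_df = pd.merge(df1, df2, on='Name', how='outer', indicator=True)

# names that appear in only one of the two sources
unmatched = outer_df[outer_df['_merge'] != 'both']

print(outer_df)
print(sorted(unmatched['Name']))
```

Missing values on the unmatched side are filled with NaN, so Peter has no City and Linda has no Age.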

Data Cleaning

Data cleaning is essential in OSINT, where collected data is often incomplete. The example below drops rows with missing values:

import pandas as pd

# create a DataFrame with a missing value (None is stored as NaN, so Age becomes float64)
df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter'],
                   'Age': [28, None, 35]})

# drop rows with missing values
clean_df = df.dropna()

print(clean_df)
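
Dropping rows is only one cleaning step. The sketch below, on hypothetical records, also normalizes text, removes duplicates, and fills a missing value instead of discarding the row. Filling with the median is an assumption made for illustration; whether to fill or drop depends on the analysis.

```python
import pandas as pd

# Hypothetical scraped records with stray whitespace, mixed case, and a duplicate
df = pd.DataFrame({'Name': [' John', 'anna ', 'Peter', 'John'],
                   'Age': [28, None, 35, 28]})

# normalize text before comparing rows
df['Name'] = df['Name'].str.strip().str.title()

# drop exact duplicate rows, then fill missing ages with the median
df = df.drop_duplicates().reset_index(drop=True)
df['Age'] = df['Age'].fillna(df['Age'].median())

print(df)
```

Normalizing before drop_duplicates matters here: ' John' and 'John' only count as the same row once the whitespace is stripped.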

Data Visualization

Data visualization helps reveal patterns in OSINT data. The example below draws a bar chart with Matplotlib:

import pandas as pd
import matplotlib.pyplot as plt

# create a DataFrame
df = pd.DataFrame({'Country': ['USA', 'Canada', 'Mexico'],
                   'Visitors': [100, 200, 50]})

# plot bar chart
plt.figure(figsize=(10,6))
plt.bar(df['Country'], df['Visitors'])
plt.title('Number of Visitors')
plt.xlabel('Country')
plt.ylabel('Visitors')
plt.show()
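
As an alternative sketch, the same chart can be drawn through the DataFrame's own plot accessor and written to an in-memory PNG, which may be convenient in scripts with no display. The Agg backend and the buffer and png_bytes names are choices made for this example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'Country': ['USA', 'Canada', 'Mexico'],
                   'Visitors': [100, 200, 50]})

# sort so the largest bar comes first, then plot directly from the DataFrame
ax = (df.sort_values('Visitors', ascending=False)
        .plot.bar(x='Country', y='Visitors', legend=False, figsize=(10, 6)))
ax.set_title('Number of Visitors')
ax.set_ylabel('Visitors')

# write the chart to an in-memory PNG instead of opening a window
buffer = io.BytesIO()
ax.figure.savefig(buffer, format='png')
png_bytes = buffer.getvalue()
plt.close(ax.figure)
```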

Conclusion

This cheat sheet has covered the basics of the Pandas library for OSINT: importing libraries, data types and data structures, merging DataFrames, cleaning data, and visualizing data. With this knowledge, you can handle structured data from public sources more effectively.