Introduction
PyGhidra is a Python library that provides direct access to the Ghidra API within a native CPython 3 interpreter using JPype. Originally developed by the Department of Defense Cyber Crime Center (DC3) as “Pyhidra”, it enables modern Python workflows with full Ghidra functionality.
Key Features
- Native CPython 3 - Use Python 3.x with modern syntax and libraries
- Standalone operation - Run Ghidra scripts outside the GUI
- Full API access - Complete access to Ghidra’s Java API
- Project management - Open, create, and manage Ghidra projects
- Type stubs - IDE autocomplete and type checking support
- Integration ready - Use Ghidra as part of larger Python workflows
Installation
Prerequisites
- Ghidra 12.0 or later installed
- Python 3.8 or later
- pip package manager
Install PyGhidra
Online installation:
python3 -m pip install pyghidra
Offline installation:
python3 -m pip install --no-index \
    -f <GhidraInstallDir>/Ghidra/Features/PyGhidra/pypkg/dist \
    pyghidra
Install Type Stubs (Optional)
For better IDE support:
# Ghidra type stubs (version-specific; choose the release that matches your Ghidra version)
pip install ghidra-stubs==11.4
# Java type stubs
pip install java-stubs-converted-strings
Set Ghidra Installation Path
Option 1: Environment variable
export GHIDRA_INSTALL_DIR=/path/to/ghidra
Option 2: In code
import pyghidra
pyghidra.start(install_dir="/path/to/ghidra")
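Both options can be combined in a small helper that prefers an explicit path and falls back to the environment variable. The sketch below is hypothetical: `resolve_install_dir` is not part of PyGhidra, and it only checks that the directory exists before the path is handed on.

```python
import os
from pathlib import Path


def resolve_install_dir(explicit=None, environ=None):
    """Return the Ghidra install directory as a Path (hypothetical helper).

    An explicit path wins; otherwise GHIDRA_INSTALL_DIR is consulted.
    """
    environ = os.environ if environ is None else environ
    candidate = explicit or environ.get("GHIDRA_INSTALL_DIR")
    if not candidate:
        raise RuntimeError(
            "Ghidra not located: pass a path or set GHIDRA_INSTALL_DIR"
        )
    path = Path(candidate).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(f"Not a directory: {path}")
    return path
```

The result could then be passed along, for example as `pyghidra.start(install_dir=str(resolve_install_dir()))`.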
Quick Start
Basic Program Analysis
import pyghidra
# Initialize PyGhidra
pyghidra.start()
# Open a project and program
with pyghidra.open_project("/path/to/projects", "MyProject", create=True) as project:
    # Import and analyze a binary
    loader = pyghidra.program_loader().project(project)
    loader = loader.source("/path/to/binary.exe").name("binary.exe")
    with loader.load() as load_results:
        load_results.save(pyghidra.task_monitor())
    # Open the program
    with pyghidra.program_context(project, "/binary.exe") as program:
        # Analyze
        pyghidra.analyze(program)
        # Access program data
        listing = program.getListing()
        for func in listing.getFunctions(True):
            print(f"{func.getName()} @ {func.getEntryPoint()}")
Legacy API (Simple)
import pyghidra
with pyghidra.open_program("binary.exe") as flat_api:
    program = flat_api.getCurrentProgram()
    listing = program.getListing()
    # Iterate functions
    for func in listing.getFunctions(True):
        print(f"{func.getName()} @ {func.getEntryPoint()}")
Core API Reference
pyghidra.start()
Initialize Ghidra in headless mode:
import pyghidra
# Basic start
pyghidra.start()
# With custom installation
pyghidra.start(install_dir="/opt/ghidra")
# Verbose output
pyghidra.start(verbose=True)
# Check if already started
if not pyghidra.started():
    pyghidra.start()
Project Management
Open or create project:
with pyghidra.open_project("/projects", "MyProject", create=True) as project:
    # Work with project
    print(f"Project: {project.getName()}")
Load program from file:
loader = pyghidra.program_loader()
loader = loader.project(project)
loader = loader.source("/path/to/binary.exe")
loader = loader.name("my_binary")
loader = loader.language("x86:LE:64:default")
with loader.load() as load_results:
    load_results.save(pyghidra.task_monitor())
Access program:
# With context manager (auto-cleanup)
with pyghidra.program_context(project, "/binary.exe") as program:
    # Use program
    pass
# Manual management
program, consumer = pyghidra.consume_program(project, "/binary.exe")
try:
    # Use program
    pass
finally:
    program.release(consumer)
Analysis Operations
Run analysis:
with pyghidra.program_context(project, "/binary.exe") as program:
    # Analyze with default settings
    log = pyghidra.analyze(program)
    print(log)
    # Analyze with timeout
    monitor = pyghidra.task_monitor(timeout=60)  # 60 seconds
    log = pyghidra.analyze(program, monitor)
Configure analysis:
with pyghidra.program_context(project, "/binary.exe") as program:
    # Get analysis properties
    props = pyghidra.analysis_properties(program)
    # Modify settings
    with pyghidra.transaction(program, "Configure Analysis"):
        props.setBoolean("Non-Returning Functions - Discovered", False)
        props.setBoolean("Stack", True)
    # Run analysis
    pyghidra.analyze(program)
Transactions
All program modifications require transactions:
with pyghidra.program_context(project, "/binary.exe") as program:
    from ghidra.program.model.listing import CodeUnit
    # Use transaction context manager
    with pyghidra.transaction(program, "Add Comment"):
        listing = program.getListing()
        addr = program.getMinAddress()
        cu = listing.getCodeUnitAt(addr)
        cu.setComment(CodeUnit.PLATE_COMMENT, "My comment")
    # Save changes
    program.save("Added comment", pyghidra.task_monitor())
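Conceptually, a transaction context manager brackets the work between `Program.startTransaction()` and `Program.endTransaction()`, committing on success and rolling back when an exception escapes. The following is a minimal sketch of that pattern, not PyGhidra's actual implementation; `simple_transaction` and `FakeProgram` are hypothetical names, and the fake stands in for a real program so the snippet runs without Ghidra.

```python
from contextlib import contextmanager


@contextmanager
def simple_transaction(program, description):
    """Commit on normal exit, roll back if the body raises (sketch only)."""
    tx_id = program.startTransaction(description)
    commit = False
    try:
        yield tx_id
        commit = True
    finally:
        program.endTransaction(tx_id, commit)


class FakeProgram:
    """Hypothetical stand-in that records calls instead of touching Ghidra."""

    def __init__(self):
        self.calls = []

    def startTransaction(self, description):
        self.calls.append(("start", description))
        return 1

    def endTransaction(self, tx_id, commit):
        self.calls.append(("end", tx_id, commit))


program = FakeProgram()
with simple_transaction(program, "Add Comment"):
    pass  # modifications would go here

try:
    with simple_transaction(program, "Failing Edit"):
        raise ValueError("simulated failure")
except ValueError:
    pass

print(program.calls)
```

The recorded calls show the first transaction ending with `commit=True` and the failing one ending with `commit=False`, which is why a failed edit leaves the program unchanged.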
Running GhidraScripts
# Run any GhidraScript (Java, Python, etc.)
with pyghidra.open_project("/projects", "MyProject") as project:
    with pyghidra.program_context(project, "/binary.exe") as program:
        stdout, stderr = pyghidra.ghidra_script(
            "/path/to/MyScript.java",
            project,
            program,
            echo_stdout=True,
            echo_stderr=True
        )
        print("Script output:", stdout)
Advanced Usage
Walking Projects
Process all domain files:
def process_file(domain_file):
    print(f"File: {domain_file.getName()}")
pyghidra.walk_project(
    project,
    process_file,
    start="/",
    file_filter=lambda f: f.getName().endswith(".exe")
)
Process all programs:
def process_program(domain_file, program):
    print(f"Program: {program.getName()}")
    # The function iterator has no size(); ask the function manager for the count
    func_count = program.getFunctionManager().getFunctionCount()
    print(f" Functions: {func_count}")
pyghidra.walk_programs(
    project,
    process_program,
    program_filter=lambda f, p: not p.getName().startswith("test_")
)
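The filter arguments are plain callables, so reusable predicates can be built up front instead of writing inline lambdas. A small sketch, assuming only that each file object exposes `getName()` as in the examples above; `by_extension` and `FakeFile` are hypothetical names introduced for illustration.

```python
def by_extension(*extensions):
    """Build a file filter that accepts names ending in any given extension."""
    suffixes = tuple(ext.lower() for ext in extensions)

    def accept(domain_file):
        return str(domain_file.getName()).lower().endswith(suffixes)

    return accept


class FakeFile:
    """Hypothetical stand-in for a domain file, used only for this demo."""

    def __init__(self, name):
        self._name = name

    def getName(self):
        return self._name


is_pe = by_extension(".exe", ".dll")
names = ["setup.EXE", "helper.dll", "notes.txt"]
accepted = [n for n in names if is_pe(FakeFile(n))]
print(accepted)
```

Such a predicate could then be passed as `file_filter=is_pe` to `pyghidra.walk_project`.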
Working with Filesystems
# Open a filesystem (ZIP, TAR, etc.)
with pyghidra.open_filesystem("/path/to/archive.zip") as fs:
    loader = pyghidra.program_loader().project(project)
    # Load files from filesystem
    for f in fs.files(lambda f: f.name.endswith(".dll")):
        loader = loader.source(f.getFSRL())
        loader = loader.projectFolderPath("/" + f.parentFile.name)
        with loader.load() as load_results:
            load_results.save(pyghidra.task_monitor())
Accessing the Decompiler
from ghidra.app.decompiler import DecompInterface
with pyghidra.program_context(project, "/binary.exe") as program:
    # Initialize decompiler
    decompiler = DecompInterface()
    decompiler.openProgram(program)
    try:
        # Decompile a function
        listing = program.getListing()
        func = listing.getFunctions(True).next()
        results = decompiler.decompileFunction(
            func, 30, pyghidra.task_monitor()
        )
        if results.decompileCompleted():
            decomp = results.getDecompiledFunction()
            print(decomp.getC())
    finally:
        decompiler.dispose()
Memory Operations
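import jpype
with pyghidra.program_context(project, "/binary.exe") as program:
    memory = program.getMemory()
    # Read bytes into a Java byte[]; a Python bytearray would be copied on
    # the way in, so the bytes read would not be visible afterwards
    addr = program.getMinAddress()
    byte_array = jpype.JArray(jpype.JByte)(16)
    memory.getBytes(addr, byte_array)
    # Java bytes are signed, so mask to 0..255 before formatting
    print(" ".join(f"{b & 0xff:02x}" for b in byte_array))
    # Write bytes (requires transaction)
    with pyghidra.transaction(program, "Write Memory"):
        # Build the byte[] from bytes: 0x90 is outside Java's signed byte range
        new_bytes = jpype.JArray(jpype.JByte)(b"\x90\x90\x90\x90")
        memory.setBytes(addr, new_bytes)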
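Java's `byte` type is signed, so values read back from a Java byte array fall in the range -128 to 127 rather than 0 to 255. The helper below is a hedged sketch (the name `to_hex` is hypothetical) that masks each value before formatting; it works on any iterable of integers, so it can be tried without Ghidra.

```python
def to_hex(values):
    """Format signed or unsigned byte values as space-separated hex pairs."""
    return " ".join(f"{int(v) & 0xFF:02x}" for v in values)


# -112 is how the byte 0x90 appears as a signed Java byte
print(to_hex([-112, 0, 127, -1]))
```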
Symbol Operations
with pyghidra.program_context(project, "/binary.exe") as program:
    from ghidra.program.model.symbol import SourceType
    symbol_table = program.getSymbolTable()
    # Find symbols
    symbols = symbol_table.getSymbolIterator("main", True)
    for sym in symbols:
        print(f"{sym.getName()} @ {sym.getAddress()}")
    # Create label
    with pyghidra.transaction(program, "Create Label"):
        addr = program.getMinAddress()
        symbol_table.createLabel(
            addr, "my_label", SourceType.USER_DEFINED
        )
Real-World Examples
Example 1: Batch Binary Analysis
import pyghidra
import os
from pathlib import Path
pyghidra.start()
binaries = Path("/malware/samples").glob("*.exe")
with pyghidra.open_project("/analysis", "MalwareAnalysis", create=True) as project:
    for binary in binaries:
        print(f"Processing: {binary.name}")
        # Load binary
        loader = pyghidra.program_loader().project(project)
        loader = loader.source(str(binary)).name(binary.name)
        with loader.load() as load_results:
            load_results.save(pyghidra.task_monitor())
        # Analyze
        with pyghidra.program_context(project, f"/{binary.name}") as program:
            pyghidra.analyze(program, pyghidra.task_monitor(300))
            # Extract function info
            listing = program.getListing()
            funcs = [f.getName() for f in listing.getFunctions(True)]
            print(f" Functions: {len(funcs)}")
            # Save
            program.save("Analysis complete", pyghidra.task_monitor())
Example 2: Export Function Information to JSON
import pyghidra
import json
pyghidra.start()
with pyghidra.open_program("/path/to/binary.exe") as flat_api:
    program = flat_api.getCurrentProgram()
    listing = program.getListing()
    functions = []
    for func in listing.getFunctions(True):
        func_info = {
            "name": func.getName(),
            "entry": str(func.getEntryPoint()),
            "signature": func.getPrototypeString(False, False),
            "params": [
                {
                    "name": p.getName(),
                    "type": str(p.getDataType())
                }
                for p in func.getParameters()
            ],
            "return_type": str(func.getReturnType())
        }
        functions.append(func_info)
    # Export to JSON
    with open("functions.json", "w") as f:
        json.dump(functions, f, indent=2)
    print(f"Exported {len(functions)} functions")
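Once exported, the JSON can be consumed without Ghidra. A minimal sketch, assuming records shaped like the `func_info` dictionaries above; the sample data and the `summarize` helper are made up for illustration.

```python
import json
from collections import Counter


def summarize(functions):
    """Return the function count and a histogram of return types."""
    return_types = Counter(f["return_type"] for f in functions)
    return {"count": len(functions), "return_types": dict(return_types)}


# Made-up sample in the same shape as the exported records
sample = json.loads("""
[
  {"name": "main", "entry": "00401000", "signature": "int main(void)",
   "params": [], "return_type": "int"},
  {"name": "FUN_00401050", "entry": "00401050",
   "signature": "undefined FUN_00401050(void)",
   "params": [], "return_type": "undefined"}
]
""")
print(summarize(sample))
```

In practice the list would come from `json.load()` on the exported `functions.json` rather than from an inline string.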
Example 3: Custom Analysis with Transactions
import pyghidra
pyghidra.start()
with pyghidra.open_project("/projects", "Analysis") as project:
    with pyghidra.program_context(project, "/binary.exe") as program:
        from ghidra.program.model.listing import CodeUnit
        from ghidra.program.model.symbol import SourceType
        listing = program.getListing()
        symbol_table = program.getSymbolTable()
        # Find and annotate string references
        with pyghidra.transaction(program, "Annotate Strings"):
            for func in listing.getFunctions(True):
                if not func.getName().startswith("FUN_"):
                    continue
                # Check for string references
                body = func.getBody()
                has_strings = False
                for addr in body.getAddresses(True):
                    refs = program.getReferenceManager().getReferencesFrom(addr)
                    for ref in refs:
                        to_addr = ref.getToAddress()
                        data = listing.getDataAt(to_addr)
                        # getDataAt() returns None when no data is defined there
                        if data is not None and "string" in str(data.getDataType()).lower():
                            has_strings = True
                            break
                    if has_strings:
                        break
                # Rename if it uses strings
                if has_strings:
                    new_name = f"str_{func.getEntryPoint()}"
                    func.setName(new_name, SourceType.USER_DEFINED)
                    # Add comment
                    cu = listing.getCodeUnitAt(func.getEntryPoint())
                    cu.setComment(CodeUnit.PLATE_COMMENT,
                                  "Function uses string references")
        # Save changes
        program.save("String annotation", pyghidra.task_monitor())
print("Analysis complete")
Custom Launchers
For advanced JVM configuration:
from pyghidra.launcher import HeadlessPyGhidraLauncher
launcher = HeadlessPyGhidraLauncher()
launcher.add_classpaths("custom.jar", "lib/other.jar")
launcher.add_vmargs("-Xmx4g", "-Dmy.property=value")
launcher.start()
# Now use PyGhidra normally
import pyghidra
# pyghidra is already started via launcher
Package Name Conflicts
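When a Python module and a Java package share a name, a plain import resolves to the Python module. Append an underscore to the name to import the Java package instead: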
import pdb # Python debugger
import pdb_ # Ghidra's pdb package
Best Practices
- Use context managers - Ensures proper resource cleanup
- Handle transactions - Always wrap modifications in transactions
- Set timeouts - Use task monitors with timeouts for long operations
- Save work - Call program.save() after modifications
- Check started state - Use pyghidra.started() before calling start()
- Release programs - Always release programs when done
Troubleshooting
Common Issues
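ImportError: No module named pyghidra
Install the package into the active Python environment:
python3 -m pip install pyghidra
Ghidra installation not found
Point PyGhidra at the Ghidra installation:
export GHIDRA_INSTALL_DIR=/path/to/ghidra
JVM already started
Start PyGhidra only if it is not already running:
if not pyghidra.started():
    pyghidra.start()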
Program locked
Ensure previous program instances are released:
program.release(consumer)
Migration from Jython
Key differences when migrating from Jython scripts:
| Jython 2 | PyGhidra (Python 3) |
|---|---|
| print "text" | print("text") |
| xrange() | range() |
| Auto state variables | Must access via program |
| GUI context | Standalone context |
| .properties files | Python configuration |
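The syntax rows translate directly. For instance, a Jython loop written as `for i in xrange(3): print "block", i` could become the following under Python 3. This is a generic illustration, not tied to any Ghidra API, and `label_blocks` is a hypothetical name.

```python
def label_blocks(count):
    """Return labels the Python 3 way: range() replaces xrange()."""
    return [f"block {i}" for i in range(count)]


for label in label_blocks(3):
    print(label)  # print is a function in Python 3
```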