Migrating from bedtools

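GRIT is designed as a drop-in replacement for bedtools. This guide covers command mappings, GRIT-specific flags and performance modes, common workflow migrations, and the places where GRIT's defaults differ from bedtools (zero-length intervals and input validation).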

Command Comparison

| bedtools | GRIT (basic) | GRIT (optimized) |
|---|---|---|
| `bedtools intersect -a A.bed -b B.bed` | `grit intersect -a A.bed -b B.bed` | `grit intersect -a A.bed -b B.bed --streaming --assume-sorted` |
| `bedtools intersect -a A.bed -b B.bed -sorted` | `grit intersect -a A.bed -b B.bed` | `grit intersect -a A.bed -b B.bed --streaming --assume-sorted` |
| `bedtools subtract -a A.bed -b B.bed` | `grit subtract -a A.bed -b B.bed` | `grit subtract -a A.bed -b B.bed --streaming --assume-sorted` |
| `bedtools merge -i A.bed` | `grit merge -i A.bed` | `grit merge -i A.bed --assume-sorted` |
| `bedtools closest -a A.bed -b B.bed` | `grit closest -a A.bed -b B.bed` | `grit closest -a A.bed -b B.bed --streaming --assume-sorted` |
| `bedtools coverage -a A.bed -b B.bed -sorted` | `grit coverage -a A.bed -b B.bed` | `grit coverage -a A.bed -b B.bed --assume-sorted` |
| `bedtools window -a A.bed -b B.bed -w 1000` | `grit window -a A.bed -b B.bed -w 1000` | `grit window -a A.bed -b B.bed -w 1000 --assume-sorted` |
| `bedtools sort -i A.bed` | `grit sort -i A.bed` | `grit sort -i A.bed` |
| `bedtools slop -i A.bed -g genome.txt -b 100` | `grit slop -i A.bed -g genome.txt -b 100` | Same |
| `bedtools complement -i A.bed -g genome.txt` | `grit complement -i A.bed -g genome.txt` | `grit complement -i A.bed -g genome.txt --assume-sorted` |
| `bedtools jaccard -a A.bed -b B.bed` | `grit jaccard -a A.bed -b B.bed` | Same |

Key GRIT Flags

| Flag | Description | When to Use |
|---|---|---|
| `--streaming` | O(k) memory mode | Large files (>1GB), memory-constrained systems |
| `--assume-sorted` | Skip sort validation | Pre-sorted files for faster startup |
| `--allow-unsorted` | Auto-sort in memory | Unsorted input (uses more memory) |
| `-g, --genome` | Validate chromosome order | Ensure genome-specific ordering |
| `--bedtools-compatible` | Match bedtools behavior | Zero-length interval handling |

Performance Modes

# Basic (validates input, loads into memory)
grit intersect -a A.bed -b B.bed

# Streaming (constant memory, requires sorted input)
grit intersect -a A.bed -b B.bed --streaming

# Maximum performance (skip validation, streaming)
grit intersect -a A.bed -b B.bed --streaming --assume-sorted

# Handle unsorted input (auto-sorts in memory)
grit intersect -a unsorted.bed -b B.bed --allow-unsorted
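
Because --assume-sorted skips validation, it can help to confirm the ordering yourself before passing it. The sketch below relies on the standard `sort -c` check. The function name `is_bed_sorted` is hypothetical, and the ordering it tests (chromosome in byte order, then numeric start) is an assumption that may not match the order GRIT expects, so treat it as a starting point rather than a guarantee.

```shell
# is_bed_sorted FILE
# Succeeds (exit 0) when FILE is ordered by column 1 (byte order) and then
# by column 2 (numeric). Assumption: tab-delimited BED with no header lines.
is_bed_sorted() {
  tab=$(printf '\t')
  # -c checks order without writing output; -s disables the whole-line
  # tie-break, so rows sharing chromosome and start pass in any order.
  LC_ALL=C sort -c -s -t "$tab" -k1,1 -k2,2n "$1" 2>/dev/null
}

# Example use (commented out so the block has no side effects):
# if is_bed_sorted A.bed; then
#   grit intersect -a A.bed -b B.bed --streaming --assume-sorted
# else
#   grit intersect -a A.bed -b B.bed --allow-unsorted
# fi
```

If the check fails, either sort the file first with grit sort or fall back to --allow-unsorted.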

Common Workflow Migration

bedtools workflow

bedtools sort -i raw.bed > sorted.bed
bedtools merge -i sorted.bed > merged.bed
bedtools intersect -a merged.bed -b features.bed -sorted > result.bed

GRIT equivalent (faster)

grit sort -i raw.bed > sorted.bed
grit merge -i sorted.bed --assume-sorted > merged.bed
grit intersect -a merged.bed -b features.bed --streaming --assume-sorted > result.bed

GRIT pipeline (fastest - no intermediate files)

grit sort -i raw.bed | grit merge -i - --assume-sorted | grit intersect -a - -b features.bed --streaming --assume-sorted > result.bed
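
When migrating a workflow, it may be worth confirming that the GRIT result matches the bedtools result before retiring the old pipeline. The helper below is a minimal sketch: `same_records` is a hypothetical name, and it only checks that two files contain the same lines, ignoring row order.

```shell
# same_records FILE_A FILE_B
# Succeeds (exit 0) when both files contain the same lines, in any order.
same_records() {
  a_sorted=$(mktemp) || return 2
  b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 2; }
  # Sort both sides identically so row order does not affect the comparison.
  LC_ALL=C sort "$1" > "$a_sorted"
  LC_ALL=C sort "$2" > "$b_sorted"
  cmp -s "$a_sorted" "$b_sorted"
  status=$?
  rm -f "$a_sorted" "$b_sorted"
  return "$status"
}

# Example use (commented out; requires both tools to be installed):
# bedtools intersect -a merged.bed -b features.bed -sorted > expected.bed
# grit intersect -a merged.bed -b features.bed --streaming --assume-sorted > actual.bed
# same_records expected.bed actual.bed && echo "outputs match"
```

A mismatch is not necessarily a bug: inputs containing zero-length intervals can legitimately differ unless --bedtools-compatible is set.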

Global Options

All commands support these options:

| Option | Description |
|---|---|
| `-t, --threads <N>` | Number of threads (default: all CPUs) |
| `--bedtools-compatible` | Normalize zero-length intervals to 1bp for bedtools parity |
| `-h, --help` | Show help for any command |
| `-V, --version` | Show version |

# Run with 4 threads
grit -t 4 intersect -a file1.bed -b file2.bed

# Enable bedtools-compatible mode for zero-length intervals
grit --bedtools-compatible intersect -a snps.bed -b features.bed

# Get help for a specific command
grit intersect --help

Zero-Length Interval Differences

GRIT uses strict half-open interval semantics by default. Zero-length intervals (start == end) contain no bases and don’t overlap with anything.
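
The arithmetic behind this can be sketched in a few lines of shell. This is an illustration of half-open interval semantics in general, not GRIT's implementation, and `overlap_bp` is a hypothetical helper name.

```shell
# overlap_bp A_START A_END B_START B_END
# Prints the number of bases shared by the half-open intervals
# [A_START, A_END) and [B_START, B_END).
overlap_bp() (
  lo=$(( $1 > $3 ? $1 : $3 ))   # larger of the two starts
  hi=$(( $2 < $4 ? $2 : $4 ))   # smaller of the two ends
  len=$(( hi - lo ))
  [ "$len" -gt 0 ] || len=0     # touching or empty intervals share nothing
  echo "$len"
)

overlap_bp 10 20 15 30      # prints 5
overlap_bp 10 20 20 30      # prints 0 (adjacent intervals do not overlap)
overlap_bp 100 100 50 200   # prints 0 (a zero-length interval has no bases)
```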

bedtools treats zero-length intervals as 1bp intervals. To match this behavior:

grit --bedtools-compatible intersect -a snps.bed -b features.bed

See Input Validation for more details.
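
To judge whether this difference affects a given file, one option is to count its zero-length records before migrating. The following is a small sketch using awk. `count_zero_length` is a hypothetical name, and it assumes tab-delimited BED with start and end in columns 2 and 3.

```shell
# count_zero_length FILE
# Prints how many records in FILE have start == end.
count_zero_length() {
  # NF >= 3 skips header lines such as "track ..." that have no tab-separated fields.
  awk -F '\t' 'NF >= 3 && $2 == $3 { n++ } END { print n + 0 }' "$1"
}

# Example use (commented out so the block has no side effects):
# count_zero_length snps.bed
```

A count of 0 suggests the default mode and --bedtools-compatible should agree for that input.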

Streaming Mode

For very large files, streaming mode keeps memory use at O(k), which stays roughly constant regardless of input size. It requires sorted input:

# Intersect
grit intersect -a a.bed -b b.bed --streaming > result.bed

# Subtract
grit subtract -a a.bed -b b.bed --streaming > result.bed

# Closest
grit closest -a a.bed -b b.bed --streaming > result.bed

Memory comparison:

| Mode | Memory Usage | Best For |
|---|---|---|
| Default (parallel) | O(n + m) | Maximum speed |
| Streaming | O(k) ≈ 2 MB | Large files, low RAM |
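
A wrapper script can choose between the two modes from the input size. The sketch below treats the ">1GB" guideline from the flags table as a heuristic. `suggest_mode_flags` and its optional threshold argument are hypothetical, and the right cutoff depends on available RAM.

```shell
# suggest_mode_flags FILE [THRESHOLD_BYTES]
# Prints "--streaming" when FILE is larger than the threshold
# (default: 1,000,000,000 bytes); prints nothing otherwise.
suggest_mode_flags() {
  threshold=${2:-1000000000}
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$1") ))
  if [ "$size" -gt "$threshold" ]; then
    echo "--streaming"
  fi
}

# Example use (commented out; the substitution is left unquoted on purpose
# so that an empty result adds no argument):
# grit intersect -a a.bed -b b.bed $(suggest_mode_flags a.bed) > result.bed
```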

Validation Differences

| Scenario | bedtools | GRIT |
|---|---|---|
| Unsorted input | Silently wrong results | Error with fix suggestion |
| Wrong chromosome order | Depends on command | Error with -g flag |
| Zero-length intervals | Treats as 1bp | Strict (or --bedtools-compatible) |

GRIT validates input by default to prevent silent failures. See Input Validation for details.
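
To illustrate what a chromosome-order check involves, the sketch below compares the order of chromosomes in a BED file against a genome file. It is not GRIT's implementation: `check_chrom_order` is a hypothetical helper, and it assumes a tab-delimited genome file with the chromosome name in column 1 and a BED file without header lines.

```shell
# check_chrom_order GENOME_FILE BED_FILE
# Succeeds (exit 0) when chromosomes in BED_FILE appear in the same relative
# order as in GENOME_FILE; otherwise prints the first problem and exits 1.
check_chrom_order() {
  awk -F '\t' '
    # First file: remember the position of each chromosome.
    NR == FNR { rank[$1] = NR; next }
    # Second file: every chromosome must be known and must not move backwards.
    !($1 in rank) { printf "unknown chromosome %s at line %d\n", $1, FNR; bad = 1; exit }
    rank[$1] < last { printf "chromosome %s out of order at line %d\n", $1, FNR; bad = 1; exit }
    { last = rank[$1] }
    END { exit bad }
  ' "$1" "$2"
}

# Example use (commented out so the block has no side effects):
# check_chrom_order genome.txt A.bed || echo "re-sort A.bed before using -g"
```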