GRIT Documentation
GRIT (Genomic Range Interval Toolkit) is a high-performance Rust implementation of common BED file operations. Its streaming algorithms use O(k) memory and, in the benchmarks below, run 3-15x faster than bedtools while using up to roughly 1000x less memory.
Installation
```shell
# From crates.io (recommended)
cargo install grit-genomics

# From Homebrew (macOS/Linux)
brew install manish59/grit/grit

# From Bioconda (coming soon)
conda install -c bioconda grit-genomics
```
Quick Start
```shell
# Sort a BED file
grit sort -i unsorted.bed > sorted.bed

# Find overlapping intervals (streaming mode for large files)
grit intersect -a regions.bed -b reads.bed --streaming --assume-sorted

# Merge overlapping intervals
grit merge -i intervals.bed --assume-sorted

# Calculate coverage
grit coverage -a genes.bed -b reads.bed --assume-sorted
```
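
The Quick Start commands can be chained into one workflow: sort both inputs, then intersect them in streaming mode. The sketch below is one way to do that, using only the subcommands and flags shown above. The function name `sorted_intersect` and the temporary-file handling are illustrative assumptions, not part of GRIT.

```shell
# Hypothetical helper (not part of GRIT): sort two BED files, then intersect
# them in streaming mode. Only flags shown in the Quick Start are used.
sorted_intersect() {
    local a="$1" b="$2"
    local tmpdir rc
    tmpdir="$(mktemp -d)" || return 1

    # Sort each input first so that --assume-sorted is safe to pass.
    grit sort -i "$a" > "$tmpdir/a.sorted.bed" || { rm -rf "$tmpdir"; return 1; }
    grit sort -i "$b" > "$tmpdir/b.sorted.bed" || { rm -rf "$tmpdir"; return 1; }

    # Streaming intersect over the two sorted files; results go to stdout.
    grit intersect -a "$tmpdir/a.sorted.bed" -b "$tmpdir/b.sorted.bed" \
        --streaming --assume-sorted
    rc=$?

    rm -rf "$tmpdir"
    return "$rc"
}

# Usage (requires grit on PATH):
#   sorted_intersect regions.bed reads.bed > overlaps.bed
```

Sorting inside the helper trades some extra work for safety: `--assume-sorted` is only passed to files the helper has just sorted itself.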
Commands
| Command | Description |
|---|---|
| sort | Sort a BED file by chromosome and position |
| merge | Merge overlapping intervals |
| intersect | Find overlapping intervals between two BED files |
| subtract | Remove intervals in A that overlap with B |
| closest | Find the closest interval in B for each interval in A |
| window | Find intervals in B within a window of A |
| coverage | Calculate coverage of A intervals by B intervals |
| slop | Extend intervals by a given number of bases |
| complement | Return intervals NOT covered by the input |
| genomecov | Compute genome-wide coverage |
| jaccard | Calculate Jaccard similarity between two BED files |
| multiinter | Identify common intervals across multiple files |
| generate | Generate synthetic BED datasets for testing |
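
To make the `merge` row concrete, the sketch below reproduces the basic idea with `awk` on sorted, tab-separated input: keep one open interval and extend it while the next interval on the same chromosome starts at or before its end. It illustrates the operation only and is not GRIT's implementation. Treating book-ended intervals (next start equal to the current end) as mergeable follows bedtools' default and is an assumption about GRIT. The function name `merge_sketch` is hypothetical.

```shell
# Illustrative only: merge overlapping intervals from sorted 3-column BED on stdin.
merge_sketch() {
    awk 'BEGIN { FS = OFS = "\t" }
    # Same chromosome and start inside the open interval: extend it.
    NR > 1 && $1 == chrom && ($2 + 0) <= (hi + 0) {
        if (($3 + 0) > (hi + 0)) hi = $3
        next
    }
    # Otherwise flush the open interval and open a new one.
    {
        if (NR > 1) print chrom, lo, hi
        chrom = $1; lo = $2; hi = $3
    }
    # Flush the last open interval, if any input was read.
    END { if (NR > 0) print chrom, lo, hi }'
}

# Example: the first two intervals overlap, the third stands alone.
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t400\t500\n' | merge_sketch
```

Because only one open interval is held at a time, a single pass over sorted input is enough, which is why sorted input matters for the streaming commands.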
Global Options
All commands support these global options:
| Option | Description |
|---|---|
| `-t, --threads <N>` | Number of threads (default: number of CPUs) |
| `--bedtools-compatible` | Match bedtools behavior for zero-length intervals |
| `-h, --help` | Print help |
| `-V, --version` | Print version |
Performance Tips
For maximum performance with sorted input files:
```shell
# Use --streaming for constant memory usage
grit intersect -a large_a.bed -b large_b.bed --streaming --assume-sorted

# Skip sort validation if you know input is sorted
grit merge -i sorted.bed --assume-sorted
```
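
Because `--assume-sorted` skips validation, it can help to confirm the sort order yourself before passing it. The sketch below uses the standard `sort -c` check, assuming tab-separated input ordered by chromosome name (byte-wise) and then numerically by start. GRIT's own ordering rules are covered in the Input Validation guide and may differ, for example in how chromosome names are ordered. The function name `is_bed_sorted` is hypothetical.

```shell
# Illustrative check, not a GRIT feature: succeeds if the BED file is ordered
# by chromosome (byte-wise) and then by numeric start coordinate.
is_bed_sorted() {
    # -c checks order without sorting; -s skips the whole-line tie-break so
    # intervals with equal chromosome and start are accepted in any order.
    LC_ALL=C sort -c -s -t "$(printf '\t')" -k1,1 -k2,2n "$1" 2>/dev/null
}

# Usage: only pass --assume-sorted once the check succeeds.
#   is_bed_sorted sorted.bed && grit merge -i sorted.bed --assume-sorted
```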
Benchmarks
Tested on 10M × 5M intervals, relative to bedtools (see the full methodology):
| Command | Speedup | Memory Reduction |
|---|---|---|
| window | 15.3x | 137x less |
| merge | 10.8x | ~same |
| coverage | 9.0x | 134x less |
| subtract | 6.5x | 19x less |
| closest | 5.0x | 59x less |
| intersect | 4.4x | 19x less |
| jaccard | 3.1x | 1230x less |
See detailed benchmarks for methodology and reproducibility instructions.
Guides
- Migrating from bedtools - Drop-in replacement guide
- Input Validation - Sort order & genome validation
- Streaming Model - O(k) memory algorithms
- Architecture - Internal design