GRIT Documentation

GRIT (Genomic Range Interval Toolkit) is a high-performance Rust implementation of common BED file operations. Its streaming algorithms use O(k) memory, where k is the number of intervals overlapping the current position rather than the size of the input. In the benchmarks below it runs 3-15x faster than bedtools and uses up to 1000x less memory.
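
To make the O(k) memory claim concrete, here is a minimal Python sketch of a sweep over two sorted inputs that keeps only the B intervals still able to overlap the current A interval. It is an illustration of the general streaming technique, not GRIT's actual Rust implementation; the function name `stream_intersect` and the tuple layout are assumptions made for this example, and both inputs are assumed to be sorted by chromosome (plain string order) and start.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# (chrom, start, end) with half-open coordinates, as in BED.
Interval = Tuple[str, int, int]


def stream_intersect(
    a: Iterable[Interval], b: Iterable[Interval]
) -> Iterator[Tuple[Interval, Interval]]:
    """Yield overlapping (a, b) pairs from two inputs sorted by (chrom, start).

    Hypothetical helper for illustration only. Memory is bounded by the
    number of B intervals kept in `active`, not by the input size.
    """
    b_iter = iter(b)
    pending: Optional[Interval] = next(b_iter, None)  # next unread B interval
    active: List[Interval] = []  # B intervals that may still overlap

    for chrom, start, end in a:
        # Because A is sorted, a B interval on an earlier chromosome, or one
        # ending at or before this A start, cannot overlap any later A either.
        active = [iv for iv in active if iv[0] == chrom and iv[2] > start]

        # Read B intervals that begin before this A interval ends.
        while pending is not None and (pending[0], pending[1]) < (chrom, end):
            if pending[0] == chrom and pending[2] > start:
                active.append(pending)
            pending = next(b_iter, None)

        # Some active intervals were read for an earlier, longer A interval,
        # so the start of each one still has to be checked here.
        for iv in active:
            if iv[1] < end:
                yield (chrom, start, end), iv


if __name__ == "__main__":
    regions = [("chr1", 0, 100), ("chr1", 10, 20), ("chr2", 5, 15)]
    reads = [("chr1", 15, 30), ("chr1", 50, 60), ("chr2", 10, 12)]
    for pair in stream_intersect(regions, reads):
        print(pair)
```

The design choice worth noting is that neither input is ever loaded whole: each B interval is read once and discarded as soon as the sweep has passed it.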

Installation

# From crates.io (recommended)
cargo install grit-genomics

# From Homebrew (macOS/Linux)
brew install manish59/grit/grit

# From Bioconda (coming soon)
conda install -c bioconda grit-genomics

Quick Start

# Sort a BED file
grit sort -i unsorted.bed > sorted.bed

# Find overlapping intervals (streaming mode for large files)
grit intersect -a regions.bed -b reads.bed --streaming --assume-sorted

# Merge overlapping intervals
grit merge -i intervals.bed --assume-sorted

# Calculate coverage
grit coverage -a genes.bed -b reads.bed --assume-sorted

Commands

Command      Description
sort         Sort a BED file by chromosome and position
merge        Merge overlapping intervals
intersect    Find overlapping intervals between two BED files
subtract     Remove intervals in A that overlap with B
closest      Find the closest interval in B for each interval in A
window       Find intervals in B within a window of A
coverage     Calculate coverage of A intervals by B intervals
slop         Extend intervals by a given number of bases
complement   Return intervals NOT covered by the input
genomecov    Compute genome-wide coverage
jaccard      Calculate Jaccard similarity between two BED files
multiinter   Identify common intervals across multiple files
generate     Generate synthetic BED datasets for testing
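
As a rough illustration of what two of these operations compute, the Python sketch below merges sorted intervals and then derives the uncovered gaps. It is not GRIT's code: the function names, the `chrom_sizes` mapping, and the choice to merge directly adjacent intervals (which mirrors bedtools' default) are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# (chrom, start, end) with half-open coordinates, as in BED.
Interval = Tuple[str, int, int]


def merge(sorted_intervals: Iterable[Interval]) -> Iterator[Interval]:
    """Merge overlapping or adjacent intervals from input sorted by (chrom, start)."""
    current: Optional[Interval] = None
    for chrom, start, end in sorted_intervals:
        if current is not None and current[0] == chrom and start <= current[2]:
            # Extend the interval being built; only one is held at a time.
            current = (chrom, current[1], max(current[2], end))
        else:
            if current is not None:
                yield current
            current = (chrom, start, end)
    if current is not None:
        yield current


def complement(
    sorted_intervals: Iterable[Interval], chrom_sizes: Dict[str, int]
) -> List[Interval]:
    """Return the regions of each chromosome not covered by the input.

    `chrom_sizes` is a hypothetical stand-in for a genome file.
    """
    covered: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in merge(sorted_intervals):
        covered.setdefault(chrom, []).append((start, end))

    gaps: List[Interval] = []
    for chrom in sorted(chrom_sizes):
        cursor = 0  # first base not yet accounted for
        for start, end in covered.get(chrom, []):
            if start > cursor:
                gaps.append((chrom, cursor, start))
            cursor = max(cursor, end)
        if cursor < chrom_sizes[chrom]:
            gaps.append((chrom, cursor, chrom_sizes[chrom]))
    return gaps


if __name__ == "__main__":
    intervals = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 50, 60)]
    print(list(merge(intervals)))
    print(complement(intervals, {"chr1": 100}))
```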

Global Options

All commands support these global options:

Option                  Description
-t, --threads <N>       Number of threads (default: number of CPUs)
--bedtools-compatible   Match bedtools behavior for zero-length intervals
-h, --help              Print help
-V, --version           Print version

Performance Tips

For maximum performance with sorted input files:

# Use --streaming to keep memory usage bounded, independent of file size
grit intersect -a large_a.bed -b large_b.bed --streaming --assume-sorted

# Skip sort validation if you know input is sorted
grit merge -i sorted.bed --assume-sorted
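
Since --assume-sorted skips validation, it can be worth confirming the order yourself before relying on it. The following Python sketch reports the first record that breaks (chromosome, start) order. It assumes tab-separated BED lines and plain string ordering of chromosome names; the exact ordering GRIT expects is not stated here, so treat that as an assumption, and note that `first_unsorted_line` is a hypothetical helper rather than part of GRIT.

```python
from typing import Iterable, Optional, Tuple


def first_unsorted_line(lines: Iterable[str]) -> Optional[int]:
    """Return the 1-based line number of the first out-of-order record, else None.

    Records are compared by (chrom, start), with chromosome names compared as
    plain strings. Blank lines and header lines are skipped.
    """
    previous: Optional[Tuple[str, int]] = None
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        key = (fields[0], int(fields[1]))
        if previous is not None and key < previous:
            return number
        previous = key
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_sorted.py sorted.bed
    with open(sys.argv[1]) as handle:
        bad = first_unsorted_line(handle)
    print("sorted" if bad is None else f"first out-of-order record at line {bad}")
```

The check reads the file once and holds a single key in memory, so it stays cheap even for inputs too large to load whole.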

Benchmarks

Tested on 10M × 5M intervals:

Command     Speedup   Memory Reduction
window      15.3x     137x less
merge       10.8x     ~same
coverage    9.0x      134x less
subtract    6.5x      19x less
closest     5.0x      59x less
intersect   4.4x      19x less
jaccard     3.1x      1230x less

See the detailed benchmarks page for methodology and reproducibility instructions.

Guides