bustools

Bustools is a collection of open-source command-line tools for processing single-cell RNA sequencing (scRNA-seq) data stored in the BUS (Barcode, UMI, Set) format. It is designed to work with the BUS files produced by the kallisto bus workflow and to help convert raw sequencing reads into gene-by-cell expression matrices. The BUS format is a compact representation of the barcode, UMI, and set of compatible transcripts (equivalence class) for each read, enabling efficient counting at scale.
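To make the idea of a compact per-read record concrete, the following is a minimal Python sketch rather than the actual on-disk layout. The `BusRecord` type, its field names, and the 2-bit nucleotide packing (A=0, C=1, G=2, T=3) are assumptions chosen for illustration; they show how a barcode, a UMI, and a transcript-set identifier could be held as small integers.

```python
from typing import NamedTuple

# Assumed 2-bit nucleotide encoding, used here for illustration only.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def encode(seq: str) -> int:
    """Pack a nucleotide string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | _CODE[base]
    return value


def decode(value: int, length: int) -> str:
    """Unpack an integer produced by encode() back into a string."""
    out = []
    for _ in range(length):
        out.append(_BASES[value & 3])
        value >>= 2
    return "".join(reversed(out))


class BusRecord(NamedTuple):
    """Hypothetical in-memory record: one entry per (barcode, UMI, set)."""
    barcode: int  # packed cell barcode
    umi: int      # packed unique molecular identifier
    ec: int       # identifier of the set of compatible transcripts
    count: int    # number of reads collapsed into this record


record = BusRecord(encode("ACGT"), encode("GGA"), ec=7, count=1)
print(record, decode(record.barcode, 4))
```

Storing sequences as integers keeps every record the same size and makes sorting and comparison cheap, which is the property the counting steps rely on.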

Bustools provides modular utilities for end-to-end processing of scRNA-seq data. Key components include tools for sorting BUS files, correcting barcode sequences against a reference whitelist, and counting molecules by gene for each cell. It supports UMI-aware counting and can handle multiple samples, allowing users to merge or separate results as needed. The suite is compatible with popular scRNA-seq platforms such as 10x Genomics and Drop-seq, and it can incorporate transcript-to-gene mappings to generate gene-level counts.

Typical workflow: sequencing reads are pseudoaligned by kallisto to produce a BUS file; bustools sorts and, if desired, corrects barcodes; bustools count tallies unique molecular identifiers per gene per barcode to produce a cell-by-gene count matrix, which can then be used in downstream analyses with standard tools. Bustools emphasizes efficiency and scalability, enabling processing of large datasets with modest computational resources.

Bustools is an open-source project hosted on GitHub as part of the kallisto-bustools ecosystem, maintained by contributors from the single-cell analysis community. It is widely referenced in single-cell RNA-seq pipelines for its speed and straightforward interface.
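The sort, correct, and count steps of the typical workflow described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the bustools implementation: the function names, the rule of accepting a barcode only when exactly one whitelist entry lies within Hamming distance 1, and the rule of counting a molecule only when its transcript sets resolve to a single gene are all choices made for this example.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(barcode, whitelist):
    """Return the whitelist barcode for `barcode`, or None if unresolvable.

    Assumed rule: keep exact matches; otherwise accept only a unique
    whitelist entry at Hamming distance 1.
    """
    if barcode in whitelist:
        return barcode
    candidates = [
        w for w in whitelist
        if len(w) == len(barcode) and hamming(w, barcode) == 1
    ]
    return candidates[0] if len(candidates) == 1 else None


def count_molecules(records, whitelist, ec_to_genes):
    """Build a {(barcode, gene): molecule_count} mapping.

    `records` is an iterable of (barcode, umi, ec) tuples; `ec_to_genes`
    maps each transcript-set identifier to the genes it is compatible with.
    """
    genes_by_molecule = {}
    for barcode, umi, ec in sorted(records):  # sort step
        fixed = correct_barcode(barcode, whitelist)  # correct step
        if fixed is None:
            continue
        key = (fixed, umi)
        genes = set(ec_to_genes[ec])
        # Reads sharing a barcode and UMI are treated as one molecule;
        # intersect their gene sets to narrow down its origin.
        if key in genes_by_molecule:
            genes_by_molecule[key] &= genes
        else:
            genes_by_molecule[key] = genes

    matrix = defaultdict(int)
    for (barcode, _umi), genes in genes_by_molecule.items():
        if len(genes) == 1:  # count only unambiguous molecules
            matrix[(barcode, next(iter(genes)))] += 1
    return dict(matrix)


whitelist = {"AAAA", "CCCC"}
ec_to_genes = {0: {"geneA"}, 1: {"geneA", "geneB"}, 2: {"geneB"}}
records = [
    ("AAAA", "GT", 0),
    ("AAAA", "GT", 0),  # duplicate read of the same molecule
    ("AAAT", "GT", 1),  # one mismatch, corrected to AAAA
    ("AAAA", "TC", 2),
    ("CCCC", "GT", 1),  # ambiguous between two genes, not counted
    ("GGGG", "AA", 0),  # too far from any whitelist barcode, dropped
]
result = count_molecules(records, whitelist, ec_to_genes)
print(result)
```

In this toy input, the three reads sharing barcode AAAA and UMI GT collapse to a single geneA molecule, so the resulting matrix has one count for (AAAA, geneA) and one for (AAAA, geneB).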