ORFs

An open reading frame (ORF) is a stretch of nucleotides that has the potential to be translated into a polypeptide. In DNA, an ORF is defined as a sequence that begins with a start codon and ends with a stop codon, read in frame: the nucleotides between them are read as consecutive triplet codons, none of which is a stop codon. Because DNA is double-stranded and each strand has three possible reading frames, six reading frames exist in total. An ORF can exist on either strand and in any frame.
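The definition above can be illustrated with a minimal six-frame scan. This is only a sketch under stated assumptions, not a production annotation tool: it treats ATG as the only start codon, uses the three stop codons of the standard genetic code (TAA, TAG, TGA), and expects unambiguous A/C/G/T input. The names `find_orfs`, `Orf`, and `min_codons` are hypothetical and chosen for illustration.

```python
from typing import Iterator, NamedTuple

# Assumptions: standard genetic code stops, ATG as the only start codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str    # "+" for the given strand, "-" for its reverse complement
    frame: int     # reading-frame offset (0, 1 or 2) on the scanned strand
    start: int     # 0-based index of the start codon on the scanned strand
    end: int       # index one past the last base of the stop codon
    sequence: str  # start codon through stop codon, inclusive


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 0) -> Iterator[Orf]:
    """Yield ORFs (first start codon to next in-frame stop) in all six frames.

    min_codons counts codons excluding the stop codon.
    """
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None  # position of the currently open start codon, if any
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if (end - start) // 3 - 1 >= min_codons:
                        yield Orf(strand, frame, start, end, seq[start:end])
                    start = None  # close this ORF and look for the next start
```

Calling `list(find_orfs("CCATGAAATAGCC"))` returns a single ORF on the "+" strand in frame 2 with sequence `ATGAAATAG`. Coordinates for "-" strand hits refer to the reverse-complemented sequence, a simplification a real tool would convert back to forward-strand positions.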

In practice, ORFs are identified during genome annotation by scanning for start and stop codons and evaluating potential coding capacity. In bacteria and organelles, the start codon is commonly AUG, though alternative start codons may be used; in eukaryotes, initiation context (Kozak sequence) influences recognition of the start codon, and prokaryotes may rely on Shine-Dalgarno sequences upstream of the start codon.

Because short ORFs occur by chance, many annotations apply a minimum length threshold (for example 100 codons) to focus on likely protein-coding genes. However, smaller ORFs, termed short or small ORFs (sORFs), have been found to encode functional peptides, and they complicate annotation.

An ORF is not guaranteed to be a gene; it may be nonfunctional, a pseudogene, or a regulatory element. Some genomic regions contain overlapping ORFs or alternate reading frames, reflecting complex transcription and translation. Experimental evidence such as transcript abundance, ribosome profiling, or proteomics supports coding status.

ORF annotation is central to genome annotation, comparative genomics, and gene discovery, including in prokaryotic, eukaryotic, organellar, and viral genomes, as well as metagenomic datasets.
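The reasoning behind a minimum length threshold can be made concrete with a rough back-of-the-envelope model. The sketch below assumes a random sequence in which all 64 codons are equally likely and independent, so 3 of 64 codons are stops under the standard genetic code; real genomes deviate from this (for example through GC content), so the numbers are only indicative. The function name `prob_no_stop` is hypothetical.

```python
def prob_no_stop(n_codons: int, stop_fraction: float = 3 / 64) -> float:
    """Probability that n_codons consecutive random codons contain no stop.

    Assumes independent codons, each a stop with probability stop_fraction
    (3/64 for the standard genetic code with uniform base composition).
    """
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return (1.0 - stop_fraction) ** n_codons


if __name__ == "__main__":
    # Compare a short run with the 100-codon threshold mentioned in the text.
    for n in (20, 100):
        print(f"{n} codons without a stop: p = {prob_no_stop(n):.4f}")
```

Under this model a stop is expected about once every 64/3 (roughly 21) codons, and a stop-free run of 100 codons has a probability of about 0.008. That illustrates why long ORFs are unlikely to arise by chance, and also why a fixed threshold necessarily misses genuine sORFs.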