PDBformat

The PDB format is a plain-text file format for representing three-dimensional structures of biological macromolecules, used primarily for data exchanged with the Protein Data Bank. It originated at Brookhaven National Laboratory in the 1970s and remains widely used because of its readability and broad software support, despite the availability of newer formats. A PDB file encodes atomic coordinates, residue identities, and connectivity, along with metadata such as titles, authors, and experimental details.

A PDB file consists of records, each occupying a line with fixed-width fields. The most common records are ATOM and HETATM, which carry Cartesian coordinates of each atom (x, y, z in angstroms), along with occupancy and temperature factor. Each ATOM/HETATM line includes the atom name, residue name, chain identifier, residue sequence number, and an optional insertion code. Residue names use standard three-letter codes, and the file may also indicate alternate locations for atoms. In addition to atoms, PDB files may include TER records to mark chain termini, MODEL and ENDMDL for multi-model structures (such as NMR ensembles or alternate X-ray models), and CONECT records that describe explicit bonds. Metadata is stored in records such as HEADER, TITLE, COMPND, SOURCE, and AUTHOR, while CRYST1 records provide unit-cell parameters for crystallographic structures.

Limitations of the fixed-column PDB format include restricted data types and field widths, which can hinder the representation of large structures and complex metadata. This has led to the adoption of mmCIF (PDBx format) as a more flexible successor. Nevertheless, the PDB format remains in active use due to historical datasets and broad compatibility, with many programs able to read, write, and convert between PDB and other formats.
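
Because every field of an ATOM or HETATM record sits at a fixed column position, a reader can be written with plain string slicing. The following Python sketch illustrates the idea. The column ranges follow the published wwPDB description of the format as commonly documented, and the names AtomRecord, parse_atom_line, and group_by_model are hypothetical helpers chosen for this example, not part of any library. It assumes well-formed lines in which the coordinate, occupancy, and temperature-factor fields are all populated.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class AtomRecord:
    """One ATOM or HETATM line (illustrative container, not a library type)."""
    record: str        # "ATOM" or "HETATM"
    serial: int        # atom serial number
    name: str          # atom name, e.g. "CA"
    alt_loc: str       # alternate location indicator ("" if absent)
    res_name: str      # three-letter residue name
    chain_id: str      # chain identifier
    res_seq: int       # residue sequence number
    i_code: str        # insertion code ("" if absent)
    x: float           # coordinates in angstroms
    y: float
    z: float
    occupancy: float
    temp_factor: float


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one fixed-width ATOM/HETATM line into typed fields.

    Slices are 0-based and end-exclusive, so columns 31-38 of the
    format become line[30:38].
    """
    # Pad short lines so that single-character lookups never fail.
    line = line.rstrip("\r\n").ljust(80)
    return AtomRecord(
        record=line[0:6].strip(),        # columns 1-6
        serial=int(line[6:11]),          # columns 7-11
        name=line[12:16].strip(),        # columns 13-16
        alt_loc=line[16].strip(),        # column 17
        res_name=line[17:20].strip(),    # columns 18-20
        chain_id=line[21].strip(),       # column 22
        res_seq=int(line[22:26]),        # columns 23-26
        i_code=line[26].strip(),         # column 27
        x=float(line[30:38]),            # columns 31-38
        y=float(line[38:46]),            # columns 39-46
        z=float(line[46:54]),            # columns 47-54
        occupancy=float(line[54:60]),    # columns 55-60
        temp_factor=float(line[60:66]),  # columns 61-66
    )


def group_by_model(lines: Iterable[str]) -> Dict[int, List[AtomRecord]]:
    """Collect atoms per MODEL block.

    Files without MODEL records are treated as a single model numbered 1
    (an assumption made for this sketch).
    """
    models: Dict[int, List[AtomRecord]] = {}
    current = 1
    for line in lines:
        record = line[0:6]
        if record == "MODEL ":
            current = int(line[10:14])   # model serial, columns 11-14
        elif record in ("ATOM  ", "HETATM"):
            models.setdefault(current, []).append(parse_atom_line(line))
    return models


if __name__ == "__main__":
    sample = (
        "ATOM      1  N   MET A   1      "
        "27.340  24.430   2.614  1.00  9.67"
    )
    print(parse_atom_line(sample))
```

Slicing by column rather than splitting on whitespace matters here: adjacent fields can touch without a separating space (for example a four-character atom name followed by an alternate location indicator, or large coordinate values), so whitespace splitting can misread otherwise valid lines.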