dbGaP

dbGaP, the database of Genotypes and Phenotypes, is a repository for archiving and distributing data from studies that investigate the relationships between genetic variation and human traits. It is managed by the National Center for Biotechnology Information (NCBI), part of the National Library of Medicine at the National Institutes of Health (NIH). The database stores genotype and phenotype data from human studies, including genome-wide association studies, sequencing projects, and other genetic investigations, along with study metadata, consent information, and data-use restrictions.

Data submission and content

Researchers deposit data through study submitters, who prepare submission packages that describe the study and the data. dbGaP includes genotype data, sequencing data, phenotype measurements, and supporting metadata. Access to the detailed data is typically controlled, while some aggregated or de-identified information may be more openly available. The system is designed to connect data with the consent and governance framework that accompanies each study.

Access and governance

Access to controlled data is governed by Data Access Committees (DACs) that review Data Access Requests (DARs). Applicants provide a research plan and institutional assurances, and sign a Data Use Certification. Approved users gain access to the data through the dbGaP interface and must comply with Data Use Limitations encoded by the Data Use Ontology (DUO). DUO terms express permissible uses (such as Health/Medical Research) and restrictions (for example, prohibitions on re-identification or commercial use).

Impact and purpose

dbGaP supports secondary analyses and reproducible research in human genetics by enabling controlled sharing of genotype and phenotype data, while protecting participant privacy through consent-based restrictions and access controls. It serves as a key resource within the NIH data-sharing ecosystem and the broader genomics research community.
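
As an illustration of the access model described under Access and governance, the following is a minimal sketch of how a request's stated purpose and planned activities could be compared against a dataset's Data Use Limitations. It is not dbGaP's implementation and does not call any dbGaP or DUO API. The class names, the check_request function, and the string labels are all hypothetical; real DUO terms are ontology entries with their own identifiers and definitions, and an actual access decision is made by a Data Access Committee rather than by code.

```python
from dataclasses import dataclass

# Hypothetical, simplified labels for illustration only. Real DUO terms are
# ontology entries with their own identifiers and formal definitions.
HEALTH_MEDICAL = "Health/Medical Research"
COMMERCIAL_USE = "commercial use"
REIDENTIFICATION = "re-identification"


@dataclass(frozen=True)
class DataUseLimitation:
    """Permitted purposes and prohibited activities attached to a dataset."""
    permitted_purposes: frozenset
    prohibited_activities: frozenset = frozenset()


@dataclass(frozen=True)
class AccessRequest:
    """A simplified access request: one stated purpose plus planned activities."""
    purpose: str
    activities: frozenset = frozenset()


def check_request(request, limitation):
    """Return a list of conflicts; an empty list means none were found."""
    conflicts = []
    # The stated purpose must be one of the permitted uses.
    if request.purpose not in limitation.permitted_purposes:
        conflicts.append(f"purpose not permitted: {request.purpose}")
    # Any planned activity that is explicitly prohibited is a conflict.
    overlap = request.activities & limitation.prohibited_activities
    for activity in sorted(overlap):
        conflicts.append(f"prohibited activity: {activity}")
    return conflicts


if __name__ == "__main__":
    limitation = DataUseLimitation(
        permitted_purposes=frozenset({HEALTH_MEDICAL}),
        prohibited_activities=frozenset({COMMERCIAL_USE, REIDENTIFICATION}),
    )
    request = AccessRequest(HEALTH_MEDICAL, frozenset({COMMERCIAL_USE}))
    for conflict in check_request(request, limitation):
        print(conflict)
```

Returning a list of conflicts instead of a single yes/no value keeps the sketch closer to a review aid: each mismatch between the request and the limitations is reported separately, and the decision itself stays with the reviewer.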