GEOquery

GEOquery is an R package from Bioconductor that provides programmatic access to the NCBI Gene Expression Omnibus (GEO). It enables researchers to retrieve GEO series (GSE), platforms (GPL), and samples (GSM), as well as associated metadata and supplementary files, directly from within R. The package is designed to support reproducible data analysis workflows by converting downloaded data into R data structures suitable for downstream analysis.
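
As a minimal sketch of the retrieval workflow described above, the following R code downloads one series and pulls out its expression matrix and metadata. It assumes that GEOquery and Biobase are installed from Bioconductor and that GEO is reachable over the network. The accession GSE2553 and the cache directory name are illustrative choices only, not requirements of the package.

```r
# Assumption: GEOquery and Biobase are installed and network access is available.
library(GEOquery)
library(Biobase)

# Illustrative accession; substitute the series of interest.
accession <- "GSE2553"

# Illustrative local directory for downloads, so that files already present
# can be reused on later runs instead of being fetched again.
cache_dir <- file.path(tempdir(), "geo_cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# For a GSE accession with GSEMatrix = TRUE, getGEO() returns a list of
# ExpressionSet objects, one per Series Matrix file.
gse_list <- getGEO(accession, GSEMatrix = TRUE, destdir = cache_dir)

# Take the first ExpressionSet and extract its main components.
eset     <- gse_list[[1]]
expr     <- exprs(eset)   # processed expression matrix (features x samples)
samples  <- pData(eset)   # sample-level metadata
features <- fData(eset)   # platform (feature) annotation

dim(expr)
head(colnames(samples))
```

Because a series yields a list even when it contains a single element, indexing with `[[1]]` (or iterating over the list) is needed before calling accessors such as `exprs()`.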

Core functionality includes the getGEO function, which downloads GEO data and typically returns an ExpressionSet or a list of ExpressionSet objects (for a series, one per Series Matrix file, as when the series spans multiple platforms), optionally including processed data matrices. It can fetch data in several formats, such as Series Matrix, SOFT, and MINiML, and it can also retrieve supplementary files for a series via getGEOSuppFiles. The package handles metadata extraction, including platform annotations and sample characteristics, allowing mapping of probe IDs to gene identifiers.

Output and interoperability: GEOquery objects integrate with Bioconductor data classes, including ExpressionSet and SummarizedExperiment, enabling downstream analysis with limma, edgeR, or other packages. The package can be used alongside GEOmetadb for fast local access to GEO metadata, and it follows Bioconductor conventions for package maintenance and documentation.

Usage note: Retrieval can be bandwidth- and time-intensive for large series, and some GEO data may require multiple requests or extra parsing steps. Users should ensure compatible R and Bioconductor versions and consider caching results to reduce network load.

See also: Gene Expression Omnibus, GEOmetadb, Bioconductor, ExpressionSet, SummarizedExperiment.