obsm

obsm is a field of the AnnData data structure used in single-cell omics analysis (notably with Scanpy). It stores per-observation multidimensional representations, such as coordinate matrices for cells in low-dimensional embeddings or other per-observation projections.

Technically, obsm is a dictionary-like container that maps string keys to NumPy arrays with shape (n_obs, n_dims). Here, n_obs is the number of observations (for example, cells) and n_dims is the dimensionality of the embedding (commonly 2 or 3). Each key identifies a distinct embedding or projection. Typical keys include 'X_pca', 'X_umap', 'X_tsne', and 'X_diffmap', though the names can vary.

Access and usage are straightforward. For example, coordinates = adata.obsm['X_umap'] yields the UMAP coordinates for all observations. The rows of obsm arrays align with the rows of adata.obs, meaning the i-th row in an obsm array corresponds to the i-th observation described in adata.obs.

obsm complements other AnnData fields. While adata.X stores the primary expression data and adata.varm can hold per-variable embeddings, obsm stores per-observation embeddings. This organization facilitates downstream visualization and analysis, such as plotting 2D or 3D embeddings, clustering, or trajectory inference results that are tied to individual observations.

Memory considerations apply, as large embeddings increase the in-memory footprint. Users typically populate obsm with embeddings produced from analyses like PCA or manifold learning and then reuse or share those coordinates for plotting and further analysis.