DimRp

DimRp, short for dimensionality reduction by random projections, refers to techniques that map high-dimensional data into a lower-dimensional space by multiplying it by a random projection matrix. The goal is to approximately preserve pairwise distances between data points while enabling faster computation and lower memory usage, drawing on ideas from the Johnson-Lindenstrauss lemma.

The typical workflow starts with a dataset X in R^{n×d}. A random matrix R in R^{d×k} is generated, with k < d, and the projected data X' = XR lies in R^{n×k}. Elements of R are drawn from distributions such as Gaussian N(0,1) or sparse sign distributions to reduce multiplications. In many cases, centering is unnecessary, and the projection can be applied in streaming or memory-limited environments.
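This workflow can be sketched in a few lines of NumPy; the dataset and the sizes n, d, and k below are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, k = 500, 1000, 50          # n samples, original dimension d, target k < d
X = rng.standard_normal((n, d))  # stand-in dataset

# Random projection matrix; scaling by 1/sqrt(k) makes squared distances
# preserved in expectation.
R = rng.standard_normal((d, k)) / np.sqrt(k)

X_proj = X @ R                   # projected data, now in R^{n×k}
print(X_proj.shape)              # (500, 50)
```

Note that no statistics of X are used to build R, which is what makes the map data-independent and streamable.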

Variants include dense random projections and sparse schemes (for example, with entries drawn from {−1, 0, +1}) that reduce computation and storage. More recent implementations use structured or fast transforms to accelerate multiplication.

Key properties include probabilistic guarantees on distance preservation: with an appropriate choice of k, pairwise distances are preserved within a factor of (1 ± ε) with high probability for all pairs. The method is data-independent and fast, but it may not capture covariance structure as PCA does, and distortions can occur for small sample sizes or in sensitive tasks.
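The distance-preservation guarantee is easy to check empirically; this is a sketch with invented sizes, where a larger k tightens the observed distortion:

```python
import numpy as np

rng = np.random.default_rng(2)

n, d, k = 100, 2000, 400
X = rng.standard_normal((n, d))
R = rng.standard_normal((d, k)) / np.sqrt(k)
Y = X @ R

def sq_dists(A):
    # All pairwise squared Euclidean distances via the Gram matrix.
    G = A @ A.T
    s = np.diag(G)
    return s[:, None] + s[None, :] - 2 * G

# Ratio of projected to original squared distance for every pair;
# the values concentrate around 1 as k grows.
i, j = np.triu_indices(n, k=1)
ratio = sq_dists(Y)[i, j] / sq_dists(X)[i, j]
print(round(ratio.min(), 2), round(ratio.max(), 2))
```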

Applications span large-scale machine learning pipelines, text mining, clustering, approximate nearest-neighbor search, and information retrieval, where speed and scalability are important.
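As a toy illustration of the nearest-neighbor use case (the corpus sizes and the planted near-duplicate are invented for the example), searching in the projected space finds the same neighbor at a fraction of the cost:

```python
import numpy as np

rng = np.random.default_rng(3)

n, d, k = 1000, 512, 64
corpus = rng.standard_normal((n, d))
query = rng.standard_normal(d)
corpus[42] = query + 0.1 * rng.standard_normal(d)  # plant a true near neighbor

# Project corpus and query with the same random matrix.
R = rng.standard_normal((d, k)) / np.sqrt(k)
corpus_p, query_p = corpus @ R, query @ R

# Search in the cheap k-dimensional space; in practice the candidate
# would then be verified or re-ranked in the original space.
candidate = int(np.argmin(np.linalg.norm(corpus_p - query_p, axis=1)))
exact = int(np.argmin(np.linalg.norm(corpus - query, axis=1)))
print(candidate, exact)          # both recover the planted neighbor, index 42
```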

See also: Johnson-Lindenstrauss lemma; random projection; dimensionality reduction; PCA.

References: Johnson and Lindenstrauss (1984); Achlioptas (2003); Bingham and Mannila (2001).
