CDH

CDH stands for Cloudera Distribution Including Apache Hadoop. It is a legacy distribution of Apache Hadoop and related big data software packaged and supported by Cloudera for enterprise use. CDH combined core Hadoop components with additional ecosystem projects to provide a ready-to-deploy data platform.

The distribution includes the foundational Hadoop stack (HDFS for storage and YARN for resource management) along with processing engines such as MapReduce and Apache Spark. It also bundles data warehousing and SQL tools like Apache Hive and Impala, NoSQL support with Apache HBase, data movement and workflow tools such as Apache Sqoop, Flume, and Oozie, and user interfaces through Hue. Cloudera Manager provides deployment, configuration, monitoring, and life-cycle management, while Parcels or packages distribute software across clusters. Security features typically include Kerberos-based authentication and authorization controls via Sentry or Ranger, with encryption options available for data in transit and at rest.

CDH emphasizes ease of administration in on-premises and private cloud environments, supporting multi-node architectures with centralized governance and auditing. It integrates with the broader Cloudera ecosystem for governance, data lineage, and metadata management.

Licensing and ecosystem context: CDH is built on Apache Hadoop and other Apache projects, with additional proprietary management and support layers from Cloudera. Since the late 2010s, CDH has been largely superseded by Cloudera Data Platform (CDP) following Cloudera's 2019 merger with Hortonworks. While CDH remains in use in some environments, CDP is now the primary development and deployment platform, offering unified data services across on-premises and the cloud.