Data-intensive

Data-intensive is a term used in information technology to describe systems, applications, or processes that rely heavily on data collection, storage, processing, and analysis. It emphasizes data as the dominant resource in the workload and often implies the need for scalable storage, fast data movement, and parallel processing. The term is commonly used to distinguish data-intensive workloads from compute-intensive ones.

Key characteristics include large data volumes (volume), high-velocity data streams (velocity), diverse data formats and sources (variety), and concerns about data quality and accuracy (veracity). Achieving acceptable performance typically requires distributed storage and compute, such as clustered file systems, data lakes or data warehouses, and parallel processing frameworks (for example Hadoop and Spark), as well as streaming platforms (Kafka, Flink).

Common domains include e-commerce personalized recommendations, real-time analytics on social networks, scientific simulations, genomics, climate modeling, the Internet of Things, and large-scale log analysis. These workloads often employ data pipelines, ETL/ELT processes, and architectures that decouple storage from compute to scale cost-effectively (data lake, lakehouse, or warehouse patterns).

Key architectural patterns include lambda and kappa architectures, data-centric design, metadata management, data lineage, governance, and security. Challenges include ensuring data quality and governance, managing storage and compute costs, meeting latency requirements, and complying with privacy regulations.

Data-intensive computing is closely linked to the broader fields of big data, data engineering, and cloud-native architectures. The term underscores the importance of scalable data infrastructure and governance in extracting timely insights from large and diverse data assets.
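As a minimal illustration of the ETL pattern common in these workloads (extract, transform, load), the following sketch uses plain Python with made-up record fields and an in-memory store standing in for a warehouse; a real pipeline would typically run on a framework such as Spark or under an orchestrator:

```python
# Minimal ETL sketch. The record fields ("user_id", "amount") and the
# dict-based "warehouse" are illustrative assumptions, not a real schema.

def extract(raw_lines):
    """Extract: parse raw CSV-style lines into dict records."""
    for line in raw_lines:
        user_id, amount = line.strip().split(",")
        yield {"user_id": user_id, "amount": float(amount)}

def transform(records):
    """Transform: drop invalid rows (a veracity check) and normalize units."""
    for rec in records:
        if rec["amount"] >= 0:
            rec["amount_cents"] = int(round(rec["amount"] * 100))
            yield rec

def load(records, store):
    """Load: aggregate per-user totals into the target store."""
    for rec in records:
        store[rec["user_id"]] = store.get(rec["user_id"], 0) + rec["amount_cents"]
    return store

raw = ["u1,10.50", "u2,-3.00", "u1,2.25"]
warehouse = load(transform(extract(raw)), {})
print(warehouse)  # {'u1': 1275}
```

Because each stage is a generator, records stream through one at a time rather than being materialized in full, which mirrors how pipeline stages scale independently in larger systems.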
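The lambda architecture named among the patterns above pairs a batch layer (complete but high-latency views) with a speed layer (incremental, low-latency updates), merged at query time by a serving layer. A toy sketch of that merge step, with illustrative view names and counts, might look like:

```python
# Toy lambda-architecture serving-layer merge. The batch view is assumed to be
# recomputed periodically and authoritative; the speed view holds only events
# that arrived since the last batch run. All names and numbers are made up.

def merge_views(batch_view, speed_view):
    """Serving layer: combine batch counts with real-time deltas per key."""
    merged = dict(batch_view)
    for key, delta in speed_view.items():
        merged[key] = merged.get(key, 0) + delta
    return merged

batch_view = {"page_a": 100, "page_b": 40}   # from the last nightly batch job
speed_view = {"page_a": 3, "page_c": 1}      # events since that job ran
print(merge_views(batch_view, speed_view))   # {'page_a': 103, 'page_b': 40, 'page_c': 1}
```

The kappa architecture simplifies this by dropping the batch layer and treating everything as a stream, trading the periodic full recomputation for a single processing path.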