checkswhere

Checkswhere is a software library and rule engine for conditional data validation in data processing pipelines. It lets developers express validation rules that apply only to data items meeting specific conditions, using a where-like predicate language.
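
The idea of conditional validation can be illustrated with a minimal Python sketch. This is not Checkswhere's actual API: every name here (`Rule`, `Result`, `non_null`, `in_range`, `evaluate`) is hypothetical, and plain Python callables stand in for the where-like predicate language. It only shows checks running when a predicate matches an item, with each outcome recorded as a pass or a failure with a message.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Item = dict[str, Any]
Predicate = Callable[[Item], bool]
# A check returns None on success or an error message on failure.
Check = Callable[[Item], Optional[str]]


@dataclass
class Rule:
    """Hypothetical rule: a predicate paired with one or more checks."""
    name: str
    where: Predicate
    checks: list[Check]


@dataclass
class Result:
    """Hypothetical result record: pass/fail plus an error message."""
    rule: str
    check: str
    passed: bool
    message: str = ""


def non_null(field_name: str) -> Check:
    """Build a check that fails when the field is missing or None."""
    def check(item: Item) -> Optional[str]:
        if item.get(field_name) is not None:
            return None
        return f"{field_name} must not be null"
    check.__name__ = f"non_null({field_name})"
    return check


def in_range(field_name: str, low: float, high: float) -> Check:
    """Build a check that fails unless the field is a number in [low, high]."""
    def check(item: Item) -> Optional[str]:
        value = item.get(field_name)
        if isinstance(value, (int, float)) and low <= value <= high:
            return None
        return f"{field_name} must be between {low} and {high}"
    check.__name__ = f"in_range({field_name})"
    return check


def evaluate(item: Item, rules: list[Rule]) -> list[Result]:
    """Run the checks of every rule whose predicate is true for the item."""
    results: list[Result] = []
    for rule in rules:
        if not rule.where(item):
            continue  # predicate is false: this rule's checks are skipped
        for check in rule.checks:
            error = check(item)
            results.append(Result(rule.name, check.__name__, error is None, error or ""))
    return results


rules = [
    # Applies only to items whose region is "EU".
    Rule("eu_orders", where=lambda it: it.get("region") == "EU", checks=[non_null("vat_id")]),
    # Applies to every item.
    Rule("all_orders", where=lambda it: True, checks=[in_range("amount", 0, 10_000)]),
]

eu_results = evaluate({"region": "EU", "amount": 250, "vat_id": None}, rules)
us_results = evaluate({"region": "US", "amount": 250}, rules)
```

In this sketch the EU item triggers both rules and fails the `vat_id` check, while the US item is only subject to the amount check. Because the rule list is ordinary data, it could be changed without editing `evaluate`.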

At runtime, each data item passes through a set of checks that are activated by its context. A rule is defined as a pair of a predicate and one or more checks; if the predicate evaluates to true for a given item, its associated checks are executed. This design enables separating business conditions from validation logic and supports dynamic rule sets that can be modified without touching core code.

Core components include a lightweight engine that schedules and evaluates rules, a library of common checks (for example, non-null, type, range, pattern matching, and cross-field validations), and a context binder that attaches data items to a working environment and captures results. Results can be reported as passes or failures with error messages and metadata.

Typical use cases include data intake validation in ETL pipelines, API payload validation before processing, and data quality monitoring where checks should run only for records meeting certain risk criteria.

Origin and reception: Checkswhere originated as an open-source project in the early 2020s and has been adopted by data engineering teams seeking modular validation with minimal boilerplate. Critics note that complex rule sets can become difficult to trace and that performance requires careful indexing of predicates in large datasets.

See also: data validation, rule engine, assertion library, ETL, data quality.