bbox

A bbox, short for bounding box, is a rectangular region that encloses a set of points or an area in a two-dimensional space. It is a basic, widely used construct in fields such as computer vision, image processing, geographic information systems (GIS), and document analysis. A bounding box typically encodes the location and size of a region of interest.

In computer vision, the axis-aligned bounding box (AABB) is the most common variant. It is defined by minimum and maximum coordinates along each axis, often stored as (xmin, ymin, xmax, ymax) or as (left, top, width, height). Bounding boxes are used to localize objects, crop image regions, and serve as predictions in object detection tasks. Evaluation frequently relies on the Intersection over Union (IoU) metric, which compares predicted boxes to ground-truth boxes. Techniques such as non-maximum suppression (NMS) operate on sets of bounding boxes to reduce duplicates.

In GIS, a bounding box (or extent) describes the geographic territory covering a region, defined by minimum and maximum longitudes and latitudes (minx, miny, maxx, maxy). Bounding boxes are used for spatial queries, map rendering extents, and clipping of raster and vector data. They must be interpreted within a coordinate reference system.

In document analysis and OCR, bounding boxes delineate regions like words, lines, or figures, enabling segmentation, layout understanding, and downstream recognition tasks.

Common representations include (xmin, ymin, xmax, ymax) and (left, top, width, height), with conventions varying by library or domain.

Bounding boxes provide a simple, portable abstraction for locating and manipulating regions of interest.
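
The two representations, the IoU metric, and non-maximum suppression described above can be illustrated with a minimal Python sketch. It is not taken from any particular library: the function names (`ltwh_to_xyxy`, `xyxy_to_ltwh`, `iou`, `nms`), the `Box` alias, and the default threshold of 0.5 are assumptions made for illustration, and the NMS shown is the simple greedy variant.

```python
from typing import List, Sequence, Tuple

# Hypothetical alias: a box stored as (xmin, ymin, xmax, ymax).
Box = Tuple[float, float, float, float]


def ltwh_to_xyxy(left: float, top: float, width: float, height: float) -> Box:
    """Convert (left, top, width, height) to (xmin, ymin, xmax, ymax)."""
    return (left, top, left + width, top + height)


def xyxy_to_ltwh(box: Box) -> Box:
    """Convert (xmin, ymin, xmax, ymax) to (left, top, width, height)."""
    xmin, ymin, xmax, ymax = box
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, `nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` keeps the first and third boxes, because the second overlaps the higher-scoring first box with an IoU of about 0.68. Production libraries usually provide vectorized versions of these operations, and their coordinate conventions should be checked before mixing formats.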