recordssaid

Recordssaid is a web-based archive and search platform for spoken language captured in audio recordings. It provides a centralized index of phrases with timestamps and speaker metadata, alongside machine-generated and human-curated transcripts. The project aims to support quote verification, linguistic research, and media accountability by enabling precise retrieval of spoken phrases.

Recordssaid originated in 2015 as a collaboration between linguists and digital archivists within the Global Speech Archives Initiative. The public beta launched in 2016, and an official release followed in 2017. The software is open source, with core components including the recordssaid engine and an API for programmatic access.

Content and metadata: The platform aggregates audio from public-domain sources and licensed collections, prioritizing materials with clear licensing. Each entry includes the spoken phrase text, the corresponding timestamp or time range, speaker information when identifiable, recording date and location, source collection, license, and a transcription quality flag.

Features: Users can perform full-text searches across transcripts, retrieve time-aligned segments for individual phrases, view speaker and context metadata, and contribute crowdsourced corrections. The platform supports export of data and integration via a REST API, and adopts interoperable metadata schemas.

Usage and impact: Researchers, journalists, educators, and archivists use recordssaid to verify quotes, study speech patterns, and track the dissemination of phrases across media. The project has influenced best practices for quote attribution and transparency in media archives.

Governance and challenges: As an open platform, it relies on community governance and clear licensing. Challenges include handling contemporary content, privacy concerns, and ensuring accurate attribution.
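
As a rough illustration of the entry fields listed under Content and metadata, the following Python sketch models a single phrase entry and a simple case-insensitive phrase lookup. The class name, field names, quality flag values, and the find_phrase helper are all assumptions made for illustration; they are not taken from the recordssaid engine or its API, and the sample values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PhraseEntry:
    """One indexed phrase. Field names are hypothetical, not the project's schema."""
    text: str
    start_seconds: float
    end_seconds: Optional[float]  # None when only a single timestamp is known
    speaker: Optional[str]        # None when the speaker is not identifiable
    recording_date: str           # assumed to be an ISO 8601 date string
    location: str
    source_collection: str
    license: str
    quality_flag: str             # hypothetical values: "machine" or "curated"


def find_phrase(entries: List[PhraseEntry], query: str) -> List[PhraseEntry]:
    """Return the entries whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [entry for entry in entries if needle in entry.text.casefold()]


# Placeholder data, not real archive content.
sample = [
    PhraseEntry("Hello world", 12.5, 15.0, None, "2000-01-01",
                "unknown", "demo collection", "CC0", "machine"),
    PhraseEntry("A second phrase", 3.0, None, "Speaker A", "2000-01-02",
                "unknown", "demo collection", "CC0", "curated"),
]
matches = find_phrase(sample, "HELLO")
```

Keeping the end time and speaker optional mirrors the description above, where an entry may carry a single timestamp rather than a range and speaker information only when identifiable.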