GlossSyntax
GlossSyntax is a lightweight, human-readable data-interchange format for lexical data used in linguistic annotation and natural language processing workflows. It models lexical entries with lemmas, parts of speech, senses and glosses, example sentences, and grammatical features, and it supports cross-entry references, inflection data, and language metadata. The format aims to balance human readability with machine readability for dictionary-building and annotation pipelines.
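The data model described above can be pictured as a small set of record types. The following Python sketch is illustrative only: the class names `Sense` and `Entry`, the field names `pos`, `language`, and `see_also`, and the values `"noun"` and `"en"` are assumptions made for this example and are not defined by GlossSyntax itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sense:
    """One meaning of a lexical entry (hypothetical model, not part of the format)."""
    id: str
    gloss: str
    examples: List[str] = field(default_factory=list)
    features: Dict[str, str] = field(default_factory=dict)


@dataclass
class Entry:
    """A lexical entry: lemma, part of speech, senses, and optional extras."""
    lemma: str
    pos: str                      # part of speech; the label set is assumed
    language: str                 # language metadata; the code style is assumed
    senses: List[Sense] = field(default_factory=list)
    inflection: Dict[str, str] = field(default_factory=dict)
    see_also: List[str] = field(default_factory=list)  # cross-entry references by id


# Build an in-memory entry for the English word "cat".
cat = Entry(
    lemma="cat",
    pos="noun",        # assumed value
    language="en",     # assumed value
    senses=[
        Sense(
            id="cat#01",
            gloss="a small domesticated carnivorous mammal",
            examples=["The cat slept on the mat."],
        )
    ],
    inflection={"plural": "cats"},
)
```

Keeping senses as separate records with their own identifiers is one way to let other entries point at a specific sense rather than at the whole entry.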
The syntax centers on entry blocks that declare a word and its linguistic information. Each entry includes
An example entry for the English word cat might appear as:
senses: [
{ id: "cat#01", gloss: "a small domesticated carnivorous mammal", examples: ["The cat slept on the mat."], features:
];
inflection: { plural: "cats" }
}
This example illustrates core elements: lemma, part of speech, a sense with gloss and example, and
GlossSyntax was proposed in academic contexts in the mid-2010s as an open standard for compact lexical
See also Lexical markup language, TEI Lexical Entries, and lexical databases.