UDcorpora
UDcorpora is the collection of annotated corpora produced under the Universal Dependencies (UD) project. It provides multilingual corpora annotated with universal part-of-speech tags, lemmas, and dependency relations, all following the UD annotation guidelines. The aim is to enable cross-linguistic comparison and support research in natural language processing, corpus linguistics, and language typology.
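As an illustration of the annotation layers named above, the following minimal sketch reads a sentence in CoNLL-U, the ten-column tab-separated format in which UD treebanks are distributed, and extracts the form, lemma, universal part-of-speech tag, head, and dependency relation of each token. The sample sentence, the `Token` type, and the `parse_conllu` function are assumptions made up for this example and are not taken from any UD treebank or tool.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """Subset of the ten CoNLL-U columns (hypothetical helper type)."""
    id: str      # kept as a string: may be "3", a range "3-4", or "3.1"
    form: str
    lemma: str
    upos: str
    head: str
    deprel: str


# Made-up sample sentence for illustration only.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def parse_conllu(text: str) -> List[List[Token]]:
    """Split CoNLL-U text into sentences, each a list of Token records."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        current.append(Token(cols[0], cols[1], cols[2], cols[3], cols[6], cols[7]))
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        for tok in sentence:
            print(tok.id, tok.form, tok.lemma, tok.upos, tok.head, tok.deprel)
```

The sketch keeps token IDs as strings so that multiword-token ranges and empty nodes pass through unchanged. A real pipeline would more likely rely on an established CoNLL-U reader than on hand-written parsing.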
Content and format: UDcorpora encompasses a wide range of treebanks across many languages and dialects, including
Access and licensing: UDcorpora is released under an open license and freely accessible through the UD website
Usage and significance: The corpora are used to train and evaluate dependency parsers, perform cross-linguistic analyses,
Relation to UD resources: UDcorpora is integral to the UD project, designed to be compatible with UD