UniParc - Infinite Lexicon

UniParc

UniParc, short for Universal Protein Archive, is a comprehensive, non-redundant repository of protein sequences maintained by the UniProt Consortium. It aggregates protein sequences from multiple public data sources and stores each unique protein sequence once, providing a complete historical archive of deposited sequences and their provenance. The primary goal is to enable cross-database integration, traceability of sequence changes, and non-redundant data resources for large-scale analyses.
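The "store each unique sequence once, keep every source" idea can be illustrated with a small in-memory sketch. This is not UniParc's actual implementation: the class names, the `REC000001`-style identifiers, and the source labels `SourceA` and `SourceB` are all hypothetical, and matching is done on the exact (case- and whitespace-normalised) sequence string.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveRecord:
    """One record per unique sequence, with every source that reported it."""
    record_id: str
    sequence: str
    sources: set = field(default_factory=set)


class SequenceArchive:
    """Toy non-redundant archive keyed by the exact sequence string."""

    def __init__(self):
        self._by_sequence = {}

    def deposit(self, sequence, source_db, source_id):
        # Normalise case and whitespace so trivially different spellings match.
        key = "".join(sequence.split()).upper()
        record = self._by_sequence.get(key)
        if record is None:
            # Hypothetical identifier scheme, not a real accession format.
            record_id = f"REC{len(self._by_sequence) + 1:06d}"
            record = ArchiveRecord(record_id, key)
            self._by_sequence[key] = record
        # Provenance: remember which source reported this sequence.
        record.sources.add((source_db, source_id))
        return record

    def __len__(self):
        return len(self._by_sequence)


archive = SequenceArchive()
first = archive.deposit("MKTAYIAKQR", "SourceA", "A0001")
second = archive.deposit("mktay iakqr", "SourceB", "B0042")  # same sequence
third = archive.deposit("MKTAYIAKQL", "SourceA", "A0002")    # differs by one residue
```

Here the first two deposits land in the same record, which then lists both sources, while the third creates a new record because a single differing residue makes it a different sequence.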

In UniParc, each unique amino acid sequence has a single record. Sequences reported by different sources that

Access and interoperability: UniParc is accessible through the UniProt website and related data services, including FTP.
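As a rough illustration of programmatic access, the sketch below only builds request URLs and performs no network calls. It rests on several assumptions not stated above: that the REST service lives at `rest.uniprot.org/uniparc`, that identifiers have the form `UPI` followed by ten hexadecimal characters, and that the `query`, `format`, and `size` parameters are accepted. Check the current UniProt documentation before relying on any of these; the helper names `entry_url` and `search_url` are hypothetical.

```python
import re
from urllib.parse import urlencode

# Assumed base URL of the UniProt REST service; verify against the UniProt site.
BASE_URL = "https://rest.uniprot.org/uniparc"

# Assumed identifier shape: "UPI" plus ten hexadecimal characters.
UPI_PATTERN = re.compile(r"UPI[0-9A-F]{10}")


def entry_url(upi, fmt="fasta"):
    """Build the URL for a single record in the given format."""
    if not UPI_PATTERN.fullmatch(upi):
        raise ValueError(f"not a well-formed identifier: {upi!r}")
    return f"{BASE_URL}/{upi}.{fmt}"


def search_url(query, fmt="json", size=25):
    """Build a search URL; the parameter names are assumptions."""
    params = urlencode({"query": query, "format": fmt, "size": size})
    return f"{BASE_URL}/search?{params}"
```

The resulting strings can be passed to any HTTP client. Validating the identifier before building the URL keeps malformed input from reaching the service.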

Relationship to UniProt: UniParc is part of the UniProt ecosystem and underpins other UniProt resources by

History and maintenance: UniParc was developed to consolidate public protein sequences and is regularly updated to
