idcols
Idcols, short for identifier columns, is a term used in data processing to denote the subset of columns that uniquely identify a row within a dataset. In practice, idcols may correspond to a database primary key, a composite key formed from multiple columns, or a set of attributes chosen to serve as stable identifiers across data transformations.
Idcols are used in tasks such as joining, merging, deduplicating, or upserting records, where matching is performed on the identifier columns.
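As an illustration of matching on identifier columns, the following is a minimal sketch in plain Python, not a reference implementation. The helper names (key_of, deduplicate, upsert) and the example columns region, order_id, and status are hypothetical and chosen only for this example.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def key_of(row: Row, idcols: Tuple[str, ...]) -> Tuple[Any, ...]:
    """Build the matching key for a row from its identifier columns."""
    return tuple(row[col] for col in idcols)


def deduplicate(rows: Iterable[Row], idcols: Tuple[str, ...]) -> List[Row]:
    """Keep the last row seen for each key, in first-seen key order."""
    latest: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        latest[key_of(row, idcols)] = row
    return list(latest.values())


def upsert(target: List[Row], incoming: Iterable[Row],
           idcols: Tuple[str, ...]) -> List[Row]:
    """Update rows whose key already exists in target; append the rest."""
    merged = {key_of(row, idcols): dict(row) for row in target}
    for row in incoming:
        merged.setdefault(key_of(row, idcols), {}).update(row)
    return list(merged.values())


# Hypothetical data: (region, order_id) together identify a row.
IDCOLS = ("region", "order_id")
rows = [
    {"region": "eu", "order_id": 1, "status": "new"},
    {"region": "eu", "order_id": 1, "status": "paid"},
    {"region": "us", "order_id": 1, "status": "new"},
]
deduped = deduplicate(rows, IDCOLS)
```

Here deduped contains two rows: the two "eu" records share a key and collapse into the later one, while the "us" record is kept because order_id alone does not identify a row in this example.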
Implementation varies by system. In SQL databases, for example, the role of idcols is played by primary keys or unique constraints.
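The SQL case can be sketched with Python's built-in sqlite3 module. The table and column names (orders, region, order_id, status) are assumptions made for this example. The composite primary key plays the role of the idcols, so the database itself rejects a second row with the same key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        region   TEXT    NOT NULL,
        order_id INTEGER NOT NULL,
        status   TEXT,
        PRIMARY KEY (region, order_id)  -- the identifier columns
    )
    """
)
conn.execute("INSERT INTO orders VALUES ('eu', 1, 'new')")

# A plain INSERT with an existing key violates the primary key constraint.
try:
    conn.execute("INSERT INTO orders VALUES ('eu', 1, 'paid')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# SQLite's INSERT OR REPLACE replaces the row that has the same key.
conn.execute("INSERT OR REPLACE INTO orders VALUES ('eu', 1, 'paid')")
rows = conn.execute("SELECT region, order_id, status FROM orders").fetchall()
```

INSERT OR REPLACE is SQLite-specific syntax. Other database systems expose comparable upsert behavior through their own statements.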
Considerations include changes to the identifier columns, which can affect data lineage, and missing values, which complicate matching.
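One way to surface such problems before matching is to validate the chosen identifier columns. The sketch below assumes rows are plain dictionaries and treats None or an absent column as a missing value. The function name check_idcols and the example data are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def check_idcols(
    rows: List[Row], idcols: Tuple[str, ...]
) -> Tuple[List[Row], List[Tuple[Any, ...]]]:
    """Return (rows with a missing id value, keys that occur more than once)."""
    missing: List[Row] = []
    counts: Counter = Counter()
    for row in rows:
        key = tuple(row.get(col) for col in idcols)
        if any(value is None for value in key):
            missing.append(row)  # cannot be matched reliably
        else:
            counts[key] += 1
    duplicates = [key for key, n in counts.items() if n > 1]
    return missing, duplicates


rows = [
    {"region": "eu", "order_id": 1},
    {"region": "eu", "order_id": 1},   # duplicate key
    {"region": None, "order_id": 2},   # missing value
    {"order_id": 3},                   # missing column
]
missing, duplicates = check_idcols(rows, ("region", "order_id"))
```

In this example, missing holds the last two rows and duplicates holds the single key ("eu", 1). How such rows are then handled (rejected, repaired, or matched on a fallback key) is a design decision that this sketch leaves open.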
See also: primary key, composite key, surrogate key, deduplication.