märkdata

Märkdata is a term used in Swedish-language data science for labeled data: datasets in which each example is paired with one or more annotations or target labels. Märkdata underpins supervised machine learning, in which models learn mappings from inputs to outputs. It spans a range of data types, including image datasets with class labels or bounding boxes, text datasets with sentiment or topic labels, audio with transcriptions or labels, and structured tabular data with a designated target column.
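As a minimal sketch of what "each example is paired with a label" can look like in practice, the Python snippet below stores examples as input/label pairs. The `LabeledExample` class, the `from_table` helper, and the toy rows are hypothetical illustrations, not part of any established märkdata format.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LabeledExample:
    """One input paired with its annotation (hypothetical structure)."""
    features: Any
    label: Any


# Text examples with sentiment labels (made-up toy data).
text_data = [
    LabeledExample("Great service", "positive"),
    LabeledExample("Never again", "negative"),
]

# Tabular rows in which the "churned" column is the designated target (toy data).
rows = [
    {"age": 34, "plan": "basic", "churned": False},
    {"age": 51, "plan": "premium", "churned": True},
]


def from_table(table, target_column):
    """Split each row into input features and the value of the target column."""
    return [
        LabeledExample(
            features={k: v for k, v in row.items() if k != target_column},
            label=row[target_column],
        )
        for row in table
    ]


tabular_data = from_table(rows, "churned")
```

The same pairing idea would carry over to images with bounding boxes or audio with transcriptions; only the types of `features` and `label` change.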

Labeling processes are typically performed by human annotators, sometimes assisted by automated or semi-automatic tools. Clear

Quality and bias are central concerns in märkdata. Label quality directly affects model performance and fairness,

Using and evaluating märkdata typically involves splitting it into training, validation, and test sets and evaluating models with task-appropriate metrics.
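The split into training, validation, and test sets can be sketched in plain Python as follows. The function names, the 80/10/10 proportions, and the choice of accuracy as the metric are assumptions made for illustration; the right proportions and metric depend on the task.

```python
import random


def split_labeled_data(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle (input, label) pairs and split them into train/val/test lists.

    The fractions are illustrative defaults; whatever remains after the
    training and validation portions becomes the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels (one possible metric)."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data: ten (input, label) pairs.
data = [(i, i % 2) for i in range(10)]
train, val, test = split_labeled_data(data)
```

Keeping the test set untouched until the final evaluation is what makes the reported metric a fair estimate; the validation set is the one used for model selection and tuning.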

Relation to other data types: märkdata sits alongside unlabeled data used in unsupervised learning, as well as the combination of labeled and unlabeled data used in semi-supervised learning.
