Märkdata
Märkdata is a term used in Swedish-language data science for labeled data: datasets in which each example is paired with one or more annotations or target labels. Märkdata underpins supervised machine learning, because the labels let a model learn a mapping from inputs to outputs. It covers a range of data types, including image datasets with class labels or bounding boxes, text datasets with sentiment or topic labels, audio with transcriptions or labels, and structured tabular data with a designated target column.
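As a rough illustration of the tabular case described above, the following Python sketch separates a designated target column from the input features. The rows, the column names, and the helper name split_features_and_target are hypothetical and chosen only for this example; they are not taken from any particular dataset or library.

```python
# Hypothetical toy rows: each one pairs input features with a target label.
rows = [
    {"text": "Great service", "length": 13, "sentiment": "positive"},
    {"text": "Too slow", "length": 8, "sentiment": "negative"},
    {"text": "It was fine", "length": 11, "sentiment": "neutral"},
]


def split_features_and_target(rows, target):
    """Return (features, labels), where labels come from the target column."""
    # Keep every column except the target as an input feature.
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    # The target column holds the label for each example.
    labels = [row[target] for row in rows]
    return features, labels


features, labels = split_features_and_target(rows, target="sentiment")
```

Here features holds the inputs and labels holds the annotations, so the i-th entries of the two lists together form one labeled example.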
Labeling processes are typically performed by human annotators, sometimes assisted by automated or semi-automatic tools. Clear
Quality and bias are central concerns in märkdata. Label quality directly affects model performance and fairness,
Use and evaluation involve splitting märkdata into training, validation, and test sets, and evaluating models with
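The split into training, validation, and test sets can be sketched in a few lines of Python. This is a minimal example using only the standard library; the function name split_labeled_data, the default fractions, and the fixed seed are assumptions made for illustration, not a prescribed procedure.

```python
import random


def split_labeled_data(examples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle (input, label) pairs and split them into train/val/test lists."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    # Copy before shuffling so the caller's list is left untouched.
    shuffled = list(examples)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test
```

A call such as split_labeled_data(pairs, val_fraction=0.2, test_fraction=0.2) returns three disjoint lists that together contain every original example. A purely random split like this one assumes the examples are independent; grouped or time-ordered data would usually need a different strategy.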
Relation to other data types: märkdata sits alongside unlabeled data used in unsupervised learning, as well