dokumentnivåvektorer
Dokumentnivåvektorer (literally "document-level vectors"), often shortened to document vectors, are numerical representations of documents in a multi-dimensional space. These vectors are designed to capture the semantic meaning and thematic content of a document. Creating them typically involves analyzing the words and phrases in a document and mapping the document to a vector space. Common techniques include Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and, more recently, embedding models such as Word2Vec (whose word-level vectors are usually averaged or otherwise pooled to represent a whole document), Doc2Vec (an extension of Word2Vec that learns a vector for each document directly), and Universal Sentence Encoder.
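To make the mapping step concrete, here is a minimal sketch that turns documents into plain term-count vectors and compares them with cosine similarity. It is a deliberately simple stand-in, not LSA, LDA, or Doc2Vec: count vectors only reflect word overlap, whereas those methods aim to capture meaning beyond shared vocabulary. The helper names (`tokenize`, `build_vocabulary`, `document_vector`, `cosine_similarity`) and the three sample sentences are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Return a sorted list of every distinct token across all documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def document_vector(text, vocabulary):
    """Map a document to a term-count vector, one dimension per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample corpus, chosen only for illustration.
documents = [
    "The cat sat on the mat.",
    "A cat lay on the mat.",
    "Stock prices fell sharply today.",
]
vocabulary = build_vocabulary(documents)
vectors = [document_vector(doc, vocabulary) for doc in documents]

# The two cat sentences share several words, so their similarity is high.
print(cosine_similarity(vectors[0], vectors[1]))
# The first and third sentences share no words, so their similarity is 0.0.
print(cosine_similarity(vectors[0], vectors[2]))
```

Every document ends up with a vector of the same length (the vocabulary size), which is what makes the documents directly comparable. In practice the raw counts would usually be reweighted (for example with TF-IDF) or replaced by learned embeddings.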
The core idea is that documents with similar meanings or topics will have vectors that are close to each other in the vector space.