Word2Vec CBOW
Word2Vec CBOW (Continuous Bag-of-Words) is a neural-network-based method for learning dense vector representations of words from large text corpora. It is one of the two architectures in the Word2Vec family, the other being Skip-gram.
CBOW is trained to predict a target word from the words surrounding it within a fixed context window. Given the context words, the model averages their input embeddings and uses the resulting vector to score every word in the vocabulary, typically through a softmax layer.
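The averaging-and-scoring step described above can be sketched as follows. This is a minimal illustration with a hypothetical toy vocabulary, randomly initialized weights, and illustrative sizes, not a full training implementation:

```python
import numpy as np

# Toy setup: names, vocabulary, and dimensions here are illustrative assumptions.
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
V, D = len(vocab), 8                        # vocabulary size, embedding dimension
W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embedding matrix
W_out = rng.normal(scale=0.1, size=(V, D))  # output (target) embedding matrix

def cbow_predict(context_ids):
    """Average the context embeddings, then score every vocabulary word."""
    h = W_in[context_ids].mean(axis=0)       # hidden vector: mean of context rows
    logits = W_out @ h                       # one score per vocabulary word
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()

# Predict the center word of "the cat ___ on" from its surrounding words.
probs = cbow_predict([0, 1, 3])
print(vocab[int(probs.argmax())])
```

The key architectural point is that the context is reduced to a single averaged vector (the "bag of words"), so word order within the window is ignored.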
Training uses either hierarchical softmax or negative sampling for efficiency. In practice, due to the large vocabulary sizes involved, computing a full softmax over every word is prohibitively expensive, so one of these two approximations is almost always used.
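A single negative-sampling update can be sketched as below. This is a simplified illustration under stated assumptions: uniform negative sampling stands in for the unigram-power distribution used in practice, and all names and sizes are hypothetical:

```python
import numpy as np

# Illustrative sizes and learning rate; not tuned values.
rng = np.random.default_rng(1)
V, D, lr = 100, 16, 0.05
W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # output (target) embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(context_ids, target_id, num_neg=5):
    """One SGD step: raise the true target's score, lower sampled negatives'."""
    h = W_in[context_ids].mean(axis=0)
    # Uniform sampling here is a simplification of the unigram^0.75 sampler.
    neg_ids = rng.integers(0, V, size=num_neg)
    grad_h = np.zeros(D)
    for wid, label in [(target_id, 1.0)] + [(int(n), 0.0) for n in neg_ids]:
        score = sigmoid(W_out[wid] @ h)
        g = score - label                    # gradient of logistic loss w.r.t. score
        grad_h += g * W_out[wid]
        W_out[wid] -= lr * g * h
    W_in[context_ids] -= lr * grad_h / len(context_ids)

train_step([1, 2, 4], target_id=3)
```

Each step touches only the target row, a handful of negative rows, and the context rows, which is what makes the method scale to large vocabularies.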
Compared with the Skip-gram model, CBOW tends to perform better with larger datasets and frequent words, while Skip-gram generally works better with smaller corpora and represents rare words more effectively, at the cost of slower training.
Applications include providing feature representations for downstream NLP tasks, initializing embedding layers of neural networks, similarity search, clustering, and word-analogy tasks.
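Similarity search over learned embeddings typically uses cosine similarity. The sketch below, with a hypothetical vocabulary and random vectors standing in for trained embeddings, shows the mechanics:

```python
import numpy as np

# Random vectors stand in for trained embeddings; vocabulary is illustrative.
rng = np.random.default_rng(2)
vocab = ["king", "queen", "man", "woman", "apple"]
emb = rng.normal(size=(len(vocab), 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize each row

def most_similar(word, topn=2):
    """Return the topn nearest words by cosine similarity, excluding the query."""
    q = emb[vocab.index(word)]
    sims = emb @ q                                  # dot product of unit vectors = cosine
    order = [i for i in np.argsort(-sims) if vocab[i] != word]
    return [(vocab[i], float(sims[i])) for i in order[:topn]]

print(most_similar("king"))
```

Normalizing the rows once up front turns every query into a single matrix-vector product, which is why this pattern is standard for nearest-neighbor lookups over embedding tables.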
Word2Vec CBOW was introduced by Mikolov et al. in 2013 as part of the Word2Vec framework; many pretrained embedding sets trained with this architecture have since been released publicly.