mBERT

mBERT, or multilingual BERT, is a pretrained language representation model based on the BERT architecture that covers many languages with a single model. It was released to enable cross-lingual natural language processing by pairing a shared multilingual vocabulary with joint pretraining on text from many languages.

Architecture and training: mBERT uses the BERT-base configuration, consisting of 12 transformer layers, a hidden size of 768, and 12 attention heads. It is trained with a single WordPiece vocabulary that covers 104 languages and contains about 119,000 tokens. The training objective is masked language modeling, in which a subset of input tokens is replaced with a mask and the model learns to predict the original tokens. The model is trained on multilingual Wikipedia text, which yields a shared representation space across languages.
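
As a quick check on those numbers, the following sketch inspects the configuration and shared WordPiece vocabulary of the publicly released checkpoint and runs a single masked-token prediction. It assumes the Hugging Face transformers library and the bert-base-multilingual-cased model id, neither of which is named in the text above.

```python
# Minimal sketch (assumption: Hugging Face `transformers` is installed and the
# public `bert-base-multilingual-cased` checkpoint is the mBERT in question).
from transformers import AutoConfig, AutoTokenizer, pipeline

config = AutoConfig.from_pretrained("bert-base-multilingual-cased")
print(config.num_hidden_layers)    # 12 transformer layers
print(config.hidden_size)          # hidden size of 768
print(config.num_attention_heads)  # 12 attention heads
print(config.vocab_size)           # roughly 119,000 shared WordPiece tokens

# One tokenizer and one vocabulary serve every covered language.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
for sentence in ["The cat sleeps.", "Die Katze schläft.", "Кошка спит."]:
    print(tokenizer.tokenize(sentence))

# The masked-language-modeling objective, seen at inference time:
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")
print(fill("Paris is the [MASK] of France.")[0]["token_str"])
```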

Cross-lingual capabilities: A key feature of mBERT is its potential for zero-shot cross-lingual transfer. Because of the shared vocabulary and joint multilingual pretraining, a model fine-tuned on a task in one language can often perform the same task in other languages without language-specific training data. It has been applied to tasks such as named entity recognition, part-of-speech tagging, sentiment analysis, and question answering.
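
The paragraph above describes the typical recipe: fine-tune on labeled data in one language, then evaluate directly in another. The sketch below illustrates that flow for sentiment classification with a tiny inline English training set and Spanish test sentences; the data, hyperparameters, and checkpoint id are illustrative assumptions, and a real experiment would use proper datasets and many more training steps.

```python
# Zero-shot cross-lingual transfer, sketched with toy data (assumptions:
# Hugging Face `transformers`, PyTorch, and the `bert-base-multilingual-cased`
# checkpoint; the sentences below are placeholders, not a benchmark).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# English training examples only (1 = positive, 0 = negative).
train_texts = ["Great movie, I loved it.", "Terrible film, a waste of time."]
train_labels = torch.tensor([1, 0])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps; real fine-tuning uses far more data
    batch = tokenizer(train_texts, padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch, labels=train_labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Evaluate on Spanish with no Spanish labels: the fine-tuned weights are reused as-is.
model.eval()
test_texts = ["Una película excelente.", "Una película horrible."]
with torch.no_grad():
    batch = tokenizer(test_texts, padding=True, truncation=True, return_tensors="pt")
    predictions = model(**batch).logits.argmax(dim=-1)
print(predictions.tolist())  # ideally [1, 0]; a toy run this small is not reliable
```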

Limitations and considerations: While mBERT enables cross-lingual transfer, performance varies by language and script. Low-resource languages and non-Latin scripts may be underrepresented in the training data, which reduces accuracy for those languages. The approach relies on a large shared vocabulary without explicit cross-lingual alignment objectives, and the model requires substantial computational resources for training and inference.
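
One way this underrepresentation tends to show up in practice is subword fragmentation: text in lower-resource languages is often split into more WordPiece pieces per word, leaving the model with less coherent units. The sketch below measures that ratio for two sentences; the language pair and sentences are assumptions chosen for the example, not measurements from the original text.

```python
# Sketch: compare WordPiece fragmentation (subword tokens per whitespace word)
# across languages. Assumes Hugging Face `transformers` and the public
# `bert-base-multilingual-cased` checkpoint; the sample sentences are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

samples = {
    "English": "The children are playing in the garden.",
    "Swahili": "Watoto wadogo wanacheza mpira kwenye bustani.",
}
for language, sentence in samples.items():
    words = sentence.split()
    tokens = tokenizer.tokenize(sentence)
    print(f"{language}: {len(tokens) / len(words):.2f} subword tokens per word")
```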

Impact and use: mBERT has become a widely used baseline for multilingual NLP and has influenced subsequent multilingual models and research into cross-language transfer, serving as a practical starting point for many multilingual tasks.
