PhoB

PhoB is a family of pre-trained language models for Vietnamese developed by VinAI Research. Built on the Transformer encoder architecture, PhoB models aim to provide language-specific representations that improve natural language understanding for Vietnamese compared with multilingual models trained on mixed languages.
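
As context for the architecture mentioned above: a Transformer encoder is built around scaled dot-product self-attention, in which every token's representation is recomputed as a weighted mix of all other tokens'. A minimal single-head sketch in NumPy follows; the dimensions and random weights are arbitrary illustrations, not PhoB's actual configuration.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a sequence.

    x: (n, d) token representations; wq/wk/wv: (d, d) projections.
    Each output row is a weighted combination of all value vectors,
    which is how every token's representation becomes context-dependent.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (n, n) similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
n, d = 4, 8                                          # toy sequence length / width
x = rng.normal(size=(n, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)                  # shape (4, 8)
```

In a full encoder this operation is repeated across multiple heads and layers, interleaved with feed-forward sublayers.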

The core of the PhoB family is PhoBERT, which includes base and large configurations. These models are pre-trained on large-scale Vietnamese corpora collected from diverse sources, using a subword tokenization scheme and a masked language modeling objective to learn contextual representations. They are released with pre-trained weights and tooling compatible with common deep learning frameworks, making them accessible for research and development. They have been evaluated on a range of tasks, including sentiment analysis, named entity recognition, part-of-speech tagging, and text classification, where they typically achieve strong performance relative to baseline models and multilingual alternatives. The approach has contributed to improvements in both academic research and practical deployments for Vietnamese language processing.

PhoB models are intended to serve as strong starting points for fine-tuning on downstream Vietnamese NLP tasks.

The project is maintained by VinAI Research and collaborators, with code and pre-trained weights widely used in the Vietnamese NLP community.
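
The masked language modeling objective described above can be sketched in a few lines: some tokens (in practice, subword units produced by the tokenizer) are randomly hidden, and the model is trained to recover them from the surrounding context. The `<mask>` token name and 15% masking rate below follow the common BERT-style recipe and are illustrative assumptions, not details confirmed by this page.

```python
import random

def mask_for_mlm(tokens, mask_token="<mask>", mask_prob=0.15, seed=1):
    """Corrupt a token sequence for masked language modeling.

    Returns (corrupted, labels): labels hold the original token at
    masked positions and None elsewhere, so the training loss is
    computed only where the model must reconstruct hidden input.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            corrupted.append(mask_token)   # hide this token
            labels.append(tok)             # model must predict it
        else:
            corrupted.append(tok)
            labels.append(None)            # no prediction needed here
    return corrupted, labels

tokens = "phở là món ăn truyền thống của Việt Nam".split()
corrupted, labels = mask_for_mlm(tokens)
```

During pre-training, the model's predictions at the masked positions are scored against the labels, which pushes it to learn the contextual representations the text above describes.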
PhoB models are commonly adopted to accelerate development of domain-specific applications, enabling researchers and developers to build higher-quality Vietnamese NLP systems. See also related Vietnamese NLP resources and pre-trained models in the broader language-model ecosystem.