PhoGPT 4B Chat

PhoGPT-4B-Chat is VinAI's 3.7B-parameter Vietnamese chat model, fine-tuned from a base model pretrained on 102B Vietnamese tokens. It handles contexts of up to 8192 tokens and was instruction-tuned on 360K examples (70K instructional prompts plus 290K conversations, per the model card). It is released under the BSD-3-Clause license, which permits commercial use.

License: bsd-3-clause · Context: 8,192 tokens

Our verdict

OP · Fredoline Eruo | Verified May 28, 2026

9.3/10

If you need a commercially usable Vietnamese chat model and have modest hardware, PhoGPT-4B-Chat is one of the few purpose-built options at this size. The 102B-token pretraining corpus and 360K fine-tuning examples are solid foundations for a 3.7B model. That said, the low download count means you're mostly on your own if something breaks, and the parameter ceiling will show quickly on anything requiring real reasoning. Our recommendation is hedged: it is worth a test run for simple Vietnamese chat tasks, but don't expect GPT-4-level coherence.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.25/10. License is explicitly bsd-3-clause on the card and commercial use is correctly flagged. Params (3.7B), context (8192), and vendor (VinAI) all match the card. Minor nit: the row says '360K conversational examples' but the card splits it as 70K instructional + 290K conversations — the sum is correct but the framing is slightly loose. Editorial voice is honest, weaknesses call out low traction and missing GGUF, and the use case (Vietnamese chat/FAQ) is sharp and well-targeted. Solid operator-grade row for a narrow but legitimate niche.

Flags:
- Minor: '360K conversational examples' conflates the 70K instruction and 290K conversation splits described in the card

Overview

Strengths

Pretrained on 102B Vietnamese tokens — one of the largest Vietnamese-specific training sets publicly documented
Instruction-tuned on 360K examples, giving it reasonable chat behavior out of the box
8192-token context window is generous for a sub-4B model
BSD-3-Clause license allows commercial deployment without royalty concerns
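
To make the 8192-token context figure concrete, here is a minimal sketch of how a chat application might trim old turns so the prompt plus the reply still fits. It assumes token counts per turn have already been measured with the model's tokenizer; the function name `trim_history` and the reply reserve of 512 tokens are hypothetical choices for illustration, not part of the model or its card.

```python
CONTEXT_WINDOW = 8192  # context length stated in this listing


def trim_history(turn_token_counts, reserve_for_reply, context_window=CONTEXT_WINDOW):
    """Pick the most recent turns that fit in the context window.

    turn_token_counts: token count per chat turn, oldest first (assumed to be
        measured beforehand with the model's own tokenizer).
    reserve_for_reply: tokens held back for the generated answer.
    Returns (index of first kept turn, total prompt tokens kept).
    """
    if reserve_for_reply >= context_window:
        raise ValueError("reply reserve leaves no room for the prompt")
    budget = context_window - reserve_for_reply
    total = 0
    start = len(turn_token_counts)  # keep nothing until a turn fits
    # Walk from the newest turn to the oldest, keeping turns while they fit.
    for i in range(len(turn_token_counts) - 1, -1, -1):
        if total + turn_token_counts[i] > budget:
            break
        total += turn_token_counts[i]
        start = i
    return start, total


if __name__ == "__main__":
    # Four turns, oldest first; hold back 512 tokens for the reply.
    first_kept, used = trim_history([4000, 3000, 2000, 1000], reserve_for_reply=512)
    print(f"keep turns from index {first_kept}, using {used} prompt tokens")
```

Dropping whole turns from the oldest end is only one possible policy; summarizing older turns is another, and it may suit longer FAQ-style sessions better.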

Weaknesses

Vietnamese only — no meaningful capability in other languages
3.7B parameters puts a hard ceiling on complex reasoning and multi-step tasks
Low community traction (1.6K downloads, 43 likes) means limited third-party testing and few reported real-world benchmarks
No GGUF or other quantized builds have been confirmed as publicly available, so check compatibility with your inference stack before committing
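
For a rough sense of what "modest hardware" means at 3.7B parameters, the following back-of-the-envelope sketch estimates memory for the weights alone at several precisions. It is an approximation under stated assumptions: it ignores the KV cache, activations, and framework overhead, and the int8 and int4 rows are hypothetical, since no quantized builds are confirmed for this model.

```python
PARAMS = 3.7e9  # parameter count stated in this listing

# Bytes needed to store one parameter at each precision.
# The int8 and int4 rows are hypothetical: no quantized build is confirmed.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params, bytes_per_param):
    """Memory for the weights only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate_table(params=PARAMS):
    """Map each precision name to its approximate weight footprint in GB."""
    return {
        name: round(weight_memory_gb(params, size), 2)
        for name, size in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for name, gb in estimate_table().items():
        print(f"{name:>10}: ~{gb} GB for weights")
```

Actual usage will be higher than these figures, particularly with long prompts near the 8192-token limit, so treat them as a lower bound when sizing a GPU.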