What this page answers
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi
- Xiaomi · xiaomi/mimo-v2-omni
- text+image+audio+video->text · Chinese model route
- 262,144 context · US$0.40 input
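
The context and price figures above can be combined into a rough input-cost estimate. The sketch below is illustrative only: the listing does not state the pricing unit, so it assumes the US$0.40 figure is quoted per 1 million input tokens. The names `estimate_input_cost` and `TOKENS_PER_PRICE_UNIT` are hypothetical and not part of any provider API.

```python
# Figures taken from the listing above.
CONTEXT_WINDOW_TOKENS = 262_144
INPUT_PRICE_USD = 0.40

# ASSUMPTION: the listed price is per 1,000,000 input tokens.
# The listing does not state the unit; adjust this if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_input_cost(tokens: int) -> float:
    """Return the estimated input cost in USD for a prompt of `tokens` tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    if tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("prompt exceeds the listed context window")
    return tokens * INPUT_PRICE_USD / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Cost of filling the entire listed context window once.
    full_window = estimate_input_cost(CONTEXT_WINDOW_TOKENS)
    print(f"Full context window: US${full_window:.4f}")
```

Under that per-million assumption, a prompt that fills the whole 262,144-token window would cost roughly US$0.10 in input charges. Output tokens are not covered, since the listing shown here gives only the input price.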