这页解决什么
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interle
- 阿里云 · 通义 · qwen/qwen3-vl-8b-instruct
- text+image->text · 中国模型路线
- 131,072 context · US$0.08 input