The Chinese are doing what they have been doing to the manufacturing industry as well. Take the core technology and just optimize, optimize, optimize for 10x the cost efficiency. As simple as that. Super impressive. These models might be benchmaxxed, but as another comment said, I see so many benchmarks that it might as well be the most impressive benchmaxxing today, if not just a genuinely SOTA open-source model. They even released a closed-source 1-trillion-parameter model today as well that is sitting at No. 3(!) on LM Arena. Even their 80B model is 17th; gpt-oss-120b is 52nd.
https://qwen.ai/blog?id=241398b9cd6353de490b0f82806c7848c5d2...
The biggest takeaway is that they claim SOTA for multi-modal tasks, even ahead of proprietary models, and still released it as open weights. My first tests suggest this might actually be true; will continue testing. Wow
Most multi-modal input implementations suck, and a lot of them suck big time.
Doesn't seem to be far ahead of existing proprietary implementations. But it's still good that someone's willing to push that far and release the results. Getting multimodal input to work even this well is not at all easy.
If you're in SF, you don't want to miss this.
The Qwen team is making their first public appearance in the United States, with the VP of Qwen Lab speaking at the meetup below during SF Tech Week.
https://partiful.com/e/P7E418jd6Ti6hA40H6Qm
Rare opportunity to directly engage with the Qwen team members.
Their A3B Omni paper mentions that the Omni at that size outperformed the (unreleased I guess) VL. Edit: I see now that there is no Omni-235B-A22B; disregard the following. ~~Which is interesting - I'd have expected the larger model to have more weights to "waste" on additional modalities and thus for the opposite to be true (or for the VL to outperform in both cases, or for both to benefit from knowledge transfer).~~
I can see how it would be in China's interest to make sure there was an LLM that produced cutting edge performance in Chinese-language conversations.
And some uses of LLMs are intensely political; think of a student using an LLM to learn about the causes of the civil war. I can understand a country wanting their own LLMs for the same reason they write their own history textbooks.
By releasing the weights they can get free volunteer help, win hearts and minds with their open approach, weaken foreign corporations, give their citizens robust performance in their native language, and exercise narrative control - all at the same time.
They might have dozens of reasons, but they already did what they did.
Some of the reasons could be:
- mitigation of US AI supremacy
- commoditize AI to push innovation forward and sell platforms to run it, e.g. if the iPhone wins local intelligence, it benefits China, because China manufactures those phones
- talent war inside China
- soften the sentiment against China in the US
- they're just awesome people
- and many more
They also released Qwen3-VL Plus [1] today, alongside Qwen3-VL 235B [2], and they don't tell us which one is better.
Also, qwen-plus-2025-09-11 [3] vs qwen3-235b-a22b-instruct-2507 [4]. What's the difference? Who knows.
[1] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...
[2] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...
[3] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...
[4] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...
https://openrouter.ai/qwen/qwen3-235b-a22b-thinking-2507
Now I will use this to identify and caption meal pictures and user pictures for other workflows. Very cool!
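For anyone wanting to try a captioning workflow like that, here's a minimal sketch using the standard OpenAI-compatible chat completions format that OpenRouter exposes. The model slug, API key, and image URL below are placeholders/assumptions - check OpenRouter for the exact Qwen3-VL slug:

```python
# Minimal sketch: caption an image via an OpenAI-compatible endpoint.
# Assumptions: OpenRouter as the endpoint and the model slug below;
# verify the exact Qwen3-VL slug at https://openrouter.ai before using.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_API_KEY",  # placeholder
)

resp = client.chat.completions.create(
    model="qwen/qwen3-vl-235b-a22b-instruct",  # assumed slug
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Caption this meal photo in one sentence."},
            {"type": "image_url", "image_url": {"url": "https://example.com/meal.jpg"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```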
- https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Thinking
- https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct
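If you'd rather run the weights locally, here's a sketch of the usual Qwen-VL transformers pattern. It assumes Qwen3-VL keeps the same processor/chat-template convention as earlier Qwen-VL releases; the model cards above are the authoritative reference for the exact classes:

```python
# Minimal local-inference sketch using the generic transformers auto-classes.
# Assumption: Qwen3-VL follows the same chat-template convention as Qwen2.5-VL;
# check the Hugging Face model card for the recommended class and dtype.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-235B-A22B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "https://example.com/meal.jpg"},  # placeholder image
        {"type": "text", "text": "Describe this image briefly."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```

Note that a 235B-A22B MoE won't fit on a single consumer GPU; device_map="auto" will shard across whatever hardware is available, and quantized variants are the more realistic path for most setups.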
Relevant comparison is on page 15: https://arxiv.org/abs/2509.17765