Can you please add Qwen3-235B-A22B-Instruct-2507 (256K context)?

https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507

Qwen3-235B-A22B-Instruct-2507 is the latest flagship Mixture-of-Experts (MoE) large language model from Qwen (Alibaba), released in July 2025. With 235 billion parameters (22B activated per inference), it is engineered for superior performance in instruction following, logical reasoning, mathematics, science, coding, tool usage, and multilingual understanding. The model natively supports a massive 256K (262,144) token context window, making it highly effective for long-context applications and complex tasks.

Key highlights:

  • Outstanding performance in instruction following, reasoning, comprehension, math, science, programming, and tool use

  • Substantial gains in multilingual long-tail knowledge coverage

  • Enhanced alignment with user preferences for subjective and open-ended tasks

  • Non-thinking mode only (does not generate <think></think> blocks)
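For reference, here is a minimal sketch of what a chat request to this model could look like through an OpenAI-compatible endpoint. The helper name and parameter choices are illustrative assumptions, not part of any official integration; note that because this checkpoint is non-thinking only, no thinking-mode flag is involved:

```python
import json

# Assumed serving ID (the model's Hugging Face repo name).
MODEL_ID = "Qwen/Qwen3-235B-A22B-Instruct-2507"
MAX_CONTEXT_TOKENS = 262_144  # native 256K context window

def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat-completions payload (hypothetical helper).

    No thinking-mode parameter is included, since this checkpoint does not
    generate <think></think> blocks at all.
    """
    if max_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("max_tokens exceeds the model's 256K context window")
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize this document.")
print(json.dumps(payload, indent=2))
```

The 262,144-token window is what makes this request shape interesting in practice: very long documents can go directly into a single user message instead of being chunked.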


Status: In Review

Board: 💡 Feature request

Date: 9 months ago

Author: Zbynek
