Can you please add Qwen3-235B-A22B-Instruct-2507 (256K context)?

https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507

Qwen3-235B-A22B-Instruct-2507 is the latest flagship Mixture-of-Experts (MoE) large language model from Qwen (Alibaba), released in July 2025. With 235 billion parameters (22B activated per inference), it is engineered for superior performance in instruction following, logical reasoning, mathematics, science, coding, tool usage, and multilingual understanding. The model natively supports a massive 256K (262,144) token context window, making it highly effective for long-context applications and complex tasks.

Key highlights:

  • Outstanding performance in instruction following, reasoning, comprehension, math, science, programming, and tool use

  • Substantial gains in multilingual long-tail knowledge coverage

  • Enhanced alignment with user preferences for subjective and open-ended tasks

  • Non-thinking mode only (does not generate <think></think> blocks)
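For reference, here is a minimal sketch of what a chat request to this model could look like through an OpenAI-compatible endpoint. The helper name and parameter choices are illustrative assumptions, not part of any official integration; note that because this checkpoint is non-thinking only, no thinking-mode flag is involved:

```python
import json

# Assumed serving ID (the model's Hugging Face repo name).
MODEL_ID = "Qwen/Qwen3-235B-A22B-Instruct-2507"
MAX_CONTEXT_TOKENS = 262_144  # native 256K context window

def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat-completions payload (hypothetical helper).

    No thinking-mode parameter is included, since this checkpoint does not
    generate <think></think> blocks at all.
    """
    if max_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("max_tokens exceeds the model's 256K context window")
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize this document.")
print(json.dumps(payload, indent=2))
```

The 262,144-token window is what makes this request shape interesting in practice: very long documents can go directly into a single user message instead of being chunked.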


Status: In Review

Board: 💡 Feature request

Date: 9 months ago

Author: Zbynek
