Support Implicit/Explicit Prompt Caching
Many other providers offer prompt-caching with discounted pricing for cache hits, either implicitly (e.g., OpenAI, DeepSeek, Google, DeepInfra, NovitaAI, Fireworks) or explicitly (most notably Anthropic). This capability can significantly reduce costs in agentic workflows, where a single session often re-sends the same context repeatedly (for example, when the model performs multiple tool calls in sequence and the shared conversation/context is included each time). Today, Nebius Token Factory is at a cost disadvantage in these repeated, input-token-heavy scenarios compared to providers that support prompt caching and pass the savings through to customers. Please add support for prompt caching (implicit or explicit), including discounted pricing for cached prompt tokens, to improve cost-efficiency for agentic and tool-using applications.
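To illustrate the cost impact, here is a rough back-of-the-envelope sketch in Python. The per-token price and the cached-token discount below are made-up placeholders, not Nebius or any provider's actual pricing; real discounts vary by provider.

```python
# Rough sketch of why cached input tokens matter in an agentic loop.
# All prices below are hypothetical placeholders, not real pricing.
PRICE_PER_1M_INPUT = 2.00  # $ per 1M uncached input tokens (assumed)
CACHED_DISCOUNT = 0.10     # cached tokens billed at 10% of full price (assumed)

def session_cost(context_tokens: int, turns: int, cached: bool = False) -> float:
    """Cost of re-sending the same shared context once per tool-call turn."""
    if not cached:
        return context_tokens * turns * PRICE_PER_1M_INPUT / 1_000_000
    # First turn pays full price; subsequent turns hit the cache.
    first = context_tokens * PRICE_PER_1M_INPUT / 1_000_000
    rest = context_tokens * (turns - 1) * PRICE_PER_1M_INPUT * CACHED_DISCOUNT / 1_000_000
    return first + rest

# 50k-token shared context, 20 tool-call turns:
print(session_cost(50_000, 20))                        # → 2.0  (no caching)
print(round(session_cost(50_000, 20, cached=True), 2)) # → 0.29 (with caching)
```

Under these assumed numbers the cached session costs roughly 7x less, which is the gap the request is about.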

Lukas Kreussel about 1 month ago
Billing
💡 Feature request

Support Latest Top 5 LLM Models
Out of the top 5 models (based on llm-stats and artificialanalysis), Token Factory still lacks:
- MiniMax M2.5
- GLM 5
- Qwen3.5-397B-A17B
- Step 3.5 Flash
Additionally, MiMo-V2-Flash would be welcome, as Step 3.5 Flash only supports a 64k token window. I’m quite confident that providing these models in the Token Factory catalog would give users frontier, proprietary-level models, which would greatly help adoption. Additional note: on the main website (nebius.com), when someone hovers over the Token Factory menu, the models that appear in the popup are all relatively old. K2.5 is already provided in Token Factory, but only K2 is listed.
(1) https://llm-stats.com/leaderboards/open-llm-leaderboard (SWE-bench Verified)
(2) https://artificialanalysis.ai/models/open-source (intelligence ranking)

davidhidvegi about 1 month ago
💡 Feature request

/v1/responses OpenAI compatible endpoint
The /v1/responses endpoint from OpenAI is used by tools like Kilo Code and Opencode, and it is particularly useful for agent development and handling large files. I noticed that Nebius does not currently support this, but I believe it would be a valuable addition. Maintaining compatibility with the official OpenAI API would enable Nebius to serve as a backend for tools like Opencode and similar applications.
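For context, a Python sketch of how the two request shapes differ. The field names follow OpenAI's published API (Chat Completions vs. the Responses API); the model name is just an example, and this is illustrative, not a Nebius implementation.

```python
# Illustrative payload shapes, per OpenAI's public API docs.
# The model ID is an example, not a statement about availability.
chat_completions_payload = {  # POST /v1/chat/completions
    "model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "messages": [{"role": "user", "content": "Summarize this repo."}],
}

responses_payload = {  # POST /v1/responses
    "model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "input": "Summarize this repo.",
    # Agent-oriented extras the Responses API adds, e.g. chaining
    # turns server-side via previous_response_id:
    "previous_response_id": None,  # or the id of an earlier response
}
```

Tools like Opencode and Kilo Code build the second shape, so an endpoint that accepts it would let them target Nebius without an adapter layer.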

Roberto Sánchez about 2 months ago
💡 Feature request

Larger FAST models with SO.
Currently, structured output (SO) via constrained decoding is supported only by large models (>80B), at under 50 tokens/s. This is insanely slow for agentic architectures based on SGR, and you keep slowing down inference. Please provide at least one fast, large, good model with structured output.
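For reference, "structured output via constrained decoding" here means the decoder is restricted to emit JSON matching a schema. A minimal Python sketch of the OpenAI-compatible request shape; the model name and schema are placeholders, not a specific Token Factory model.

```python
# Example JSON Schema the decoder would be constrained to.
schema = {
    "type": "object",
    "properties": {
        "verdict": {"type": "string", "enum": ["approve", "reject"]},
        "confidence": {"type": "number"},
    },
    "required": ["verdict", "confidence"],
}

# OpenAI-compatible structured-output request shape; the model name
# is a hypothetical placeholder.
payload = {
    "model": "some-large-fast-model",
    "messages": [{"role": "user", "content": "Review this diff."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "review", "schema": schema, "strict": True},
    },
}
```

In an SGR-style agent, a round-trip like this happens at every step of the loop, which is why low tokens/s on the only schema-capable models compounds so badly.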

Ivan Matveev 2 months ago
💡 Feature request

Support of PostGIS
Hello, I noticed that your PostgreSQL managed service supports many extensions, but PostGIS isn’t listed (https://docs.nebius.com/postgresql/databases/extensions). Would it be possible to add support for PostGIS (https://postgis.net/)? Thanks!

Baptiste Morel-Lab 2 months ago
Managed Database
💡 Feature request

Be nimble with new open source models that lead the benchmarks: Support Kimi K2.5 in Token factory
If a new open-weights / open-source model drops and it leads the benchmarks (see https://artificialanalysis.ai/models, in the open section), be nimble and quick to offer it in Token Factory.

micu voilà 2 months ago
💡 Feature request

Token Factory API: Lowercase model IDs
In the Token Factory API, the model selector should accept lowercase IDs. For example:
Required today: {"model":"Qwen/Qwen3-Coder-480B-A35B-Instruct", … }
Wanted: {"model":"qwen/qwen3-coder-480b-a35b-instruct", … }
Why? Currently, OpenCode does not work with any Nebius model, even though OpenCode has a built-in integration with Nebius. Nothing works: every request returns “Not found”, and that error comes from your API, because OpenCode (mistakenly) forces all model IDs to lowercase. If I edit the OpenCode config file and Pascal-case the Nebius model IDs, your API accepts them. This is also easy to test with curl. This is really an OpenCode issue, but maybe you could also change your API to allow lowercased IDs, for us developers’ (customers’) sake 🙏 Regards, Christoffer
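One low-risk way to support this server-side is a case-insensitive lookup that resolves any casing to the canonical catalog ID. A sketch in Python; the catalog contents are illustrative, not the real model list.

```python
# Case-insensitive model lookup: accept any casing from the client,
# resolve it to the canonical catalog ID. The catalog is illustrative.
CATALOG = [
    "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "deepseek-ai/DeepSeek-V3",
]
_BY_LOWER = {m.lower(): m for m in CATALOG}

def resolve_model(requested: str) -> str:
    """Return the canonical model ID, or raise if the model is unknown."""
    try:
        return _BY_LOWER[requested.lower()]
    except KeyError:
        raise ValueError(f"Not found: {requested}") from None

print(resolve_model("qwen/qwen3-coder-480b-a35b-instruct"))
# → Qwen/Qwen3-Coder-480B-A35B-Instruct
```

Because canonical IDs differ only in casing, the lowercased keys stay unique, so this change cannot make an existing request ambiguous.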

uninjured2875 3 months ago
Network
💡 Feature request

Cloud DNS
Hello! I would like to be able to use Cloud DNS, primarily internal DNS for internal resources and for the team that accesses internal resources via VPN (i.e. there should be a way to specify DNS servers).

Sergei Turguzov 4 months ago
Other
💡 Feature request

Standalone applications support in OpenTofu/Terraform
Currently, only Applications for Managed Service for Kubernetes has OpenTofu/Terraform support: https://docs.nebius.com/terraform-provider/reference/resources/applications_v1alpha1_k8s_release. It would be great to have OpenTofu/Terraform support for standalone applications such as vLLM as well. Thanks! 😃

hongbo-miao 8 months ago
Other
💡 Feature request

Cilium Gateway API in Managed Kubernetes
Managed Kubernetes currently uses Cilium. I hope it will allow us to use the Cilium Gateway API.

Note, based on the Cilium documentation:

One of the biggest differences between Cilium’s Ingress and Gateway API support and other Ingress controllers is how closely tied the implementation is to the CNI. For Cilium, Ingress and Gateway API are part of the networking stack, and so behave in a different way to other Ingress or Gateway API controllers (even other Ingress or Gateway API controllers running in a Cilium cluster). Other Ingress or Gateway API controllers are generally installed as a Deployment or DaemonSet in the cluster, and exposed via a LoadBalancer Service or similar (which Cilium can, of course, enable). Cilium’s Ingress and Gateway API config is exposed with a LoadBalancer or NodePort Service, or optionally can be exposed on the host network. But in all of these cases, when traffic arrives at the Service’s port, eBPF code intercepts it and transparently forwards it to Envoy (using the TPROXY kernel facility). This affects things like client IP visibility, which works differently for Cilium’s Ingress and Gateway API support than for other Ingress controllers. It also allows Cilium’s Network Policy engine to apply CiliumNetworkPolicy to traffic bound for, and coming from, an Ingress.

Nebius support informed me that customizing add-ons is not currently supported, but they plan to enable it in the future. For now, I can modify the cilium-config ConfigMap as described in https://docs.nebius.com/kubernetes/networking/add-ons#cilium and try enabling the Gateway API, with the understanding that these changes might be reverted during Nebius upgrades.

It would be great to officially support customization of add-ons, exposed as a parameter in OpenTofu/Terraform (https://docs.nebius.com/terraform-provider/reference/resources/mk8s_v1_cluster), so we have both choice and control in code. Thanks! 😃

hongbo-miao 8 months ago
Managed Service for Kubernetes®
💡 Feature request

Manage Nebius custom groups via OpenTofu/Terraform
Currently, creating and managing custom groups in Nebius is only possible via the CLI. It would be great to have OpenTofu/Terraform resources to create custom groups, thanks! ☺️

hongbo-miao 8 months ago
IAM
💡 Feature request

List Terraform module at OpenTofu Registry
It would be great to list the Terraform module at the OpenTofu Registry: https://search.opentofu.org/ Currently it is hard to, for example, find the latest version; I have to run:

VERSIONS_URL=https://storage.eu-north1.nebius.cloud/terraform-provider/provider/nebius/nebius/versions
curl -XGET -L -g "$VERSIONS_URL" | jq -r '.versions[].version' | sort -V | tail -1

Thanks! 😃

hongbo-miao 8 months ago
Documentation
💡 Feature request

List Terraform module at Terraform Registry
It would be great to list the Terraform module at the Terraform Registry: https://registry.terraform.io/ Currently it is hard to, for example, find the latest version; I have to run:

VERSIONS_URL=https://storage.eu-north1.nebius.cloud/terraform-provider/provider/nebius/nebius/versions
curl -XGET -L -g "$VERSIONS_URL" | jq -r '.versions[].version' | sort -V | tail -1

Thanks! 😃

hongbo-miao 8 months ago
Documentation
💡 Feature request

Multi-regional projects
Hello! I’d like to see, in the future, the ability to do geo-distribution within a single project, so we can build geo-distributed clusters for mk8s, mpg, and so on. Anything can happen to data centers (in recent cases: a full-day power outage or even a fire), so to prevent business losses we need points of presence in different geolocations.

Sergei Turguzov 8 months ago
Other
💡 Feature request
