Offer separate short-term and long-term support LLM endpoints

As I understand it, users of the Token Factory have different needs:

  • build stable applications with reliable LLM performance → requires long-term support (LTS) for the model endpoint (even if the model is somewhat outdated)

  • use the latest and greatest model as soon as it has been released, because it offers the best performance per dollar/euro → only needs short-term support (STS) for the model endpoint

Public model endpoints should be clearly labeled “STS” or “LTS”. This way, users can pick whichever fits their needs.

On Nebius’ side, this would allow STS model endpoints to be superseded without further notice as soon as a superior version of that model is released. That would ultimately offer a better service and keep the “zoo” of model endpoints manageable in the long run.

LTS model endpoints, on the other hand, could have a predetermined end-of-life (EOL) date, so developers can plan for necessary model updates.
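
As a rough illustration of the idea, a client could filter endpoints by support label and remaining time until EOL. This is only a sketch: the `ModelEndpoint` type, the `pick_endpoints` helper, the endpoint names, and the EOL dates are made up for the example and are not part of any existing Nebius API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ModelEndpoint:
    name: str                    # hypothetical endpoint identifier
    tier: str                    # "STS" or "LTS"
    eol: Optional[date] = None   # announced end-of-life date, if any


def days_until_eol(endpoint: ModelEndpoint, today: date) -> Optional[int]:
    """Days left before the endpoint's EOL, or None if no EOL is announced."""
    if endpoint.eol is None:
        return None
    return (endpoint.eol - today).days


def pick_endpoints(catalog: List[ModelEndpoint], tier: str, today: date,
                   min_days_left: int = 0) -> List[ModelEndpoint]:
    """Return endpoints of the given tier whose EOL (if set) is far enough away."""
    selected = []
    for endpoint in catalog:
        if endpoint.tier != tier:
            continue
        remaining = days_until_eol(endpoint, today)
        if remaining is not None and remaining < min_days_left:
            continue
        selected.append(endpoint)
    return selected


# Made-up catalog: two LTS endpoints with fixed EOL dates, one STS endpoint.
CATALOG = [
    ModelEndpoint("example-model-v1", "LTS", date(2026, 6, 30)),
    ModelEndpoint("example-model-v2", "LTS", date(2027, 6, 30)),
    ModelEndpoint("example-model-latest", "STS"),
]
```

A stable application would call something like `pick_endpoints(CATALOG, "LTS", date.today(), min_days_left=365)` to find endpoints it can rely on for at least a year, while a user chasing the newest model would simply ask for the "STS" tier.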

Status

In Review

Board

💡 Feature request

Date

19 days ago

Author

Jan Schaller