Good day.
I'm trying to fine-tune the small model meta-llama/Llama-3.2-3B with LoRA on a small dataset of 50 questions. I started the job on Mar 23, 2025, at 9:55:24 AM (job ID: ftjob-90e653a21fd34ab2a2bab6be00557764), and it is still running after 30+ minutes!
That is far too long.
I suspect this is because you place all fine-tuning jobs in a single queue.
That seems like the wrong approach. You could split jobs into at least two queues: one for small models and one for big models (based on the base model's parameter count in billions).
And to be honest, you could dedicate more GPUs to fine-tuning jobs (reserved only for fine-tuning). One server set aside ONLY for fine-tuning jobs on small models could handle all of them in seconds!
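To illustrate the idea, here is a minimal routing sketch. The function names and the 8B cutoff are my own assumptions, not anything Nebius has published:

```python
import re

# Hypothetical threshold (billions of parameters) separating the
# "small" and "big" fine-tuning queues -- my assumption, not Nebius policy.
SMALL_MODEL_MAX_B = 8.0

def parse_size_b(model_name: str) -> float:
    """Extract the parameter count in billions from a model name,
    e.g. 'meta-llama/Llama-3.2-3B' -> 3.0."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*[bB]\b", model_name)
    if not match:
        raise ValueError(f"No size suffix found in {model_name!r}")
    return float(match.group(1))

def pick_queue(model_name: str) -> str:
    """Route a fine-tuning job to the small- or big-model queue."""
    return "small" if parse_size_b(model_name) <= SMALL_MODEL_MAX_B else "big"
```

With this rule, my job on meta-llama/Llama-3.2-3B would go to the small-model queue instead of waiting behind large jobs.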
Training small models on the GPUs that Nebius has MUST be lightning fast; it could be your main advantage.
For us: the faster we train → the faster we run our experiments → the faster we make money.
In Review
🖋️ Nebius AI Studio
12 months ago

Vladimir Osipov