Background
I'm working on fine-tuning language models for a specialized classification task and noticed that the project has results for distilling DeepSeek-R1 into Qwen-2.5 base models, but not into specialized variants like Qwen-2.5-coder-instruct.
Questions
1. Specialized model variants for task-specific fine-tuning
For my task, which involves code analysis, I'm considering using Qwen-2.5-coder-instruct instead of the standard Qwen-2.5-instruct; a rough sketch of the swap I mean is below.
Has anyone tested distillation or fine-tuning with specialized models like Qwen-2.5-coder?
Do specialized base models (like coder variants) show better performance for domain-specific tasks after distillation?
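For concreteness, this is roughly the swap I have in mind. It is only an illustrative sketch using the public Hugging Face checkpoints (the 7B size is an arbitrary choice of mine), not the project's actual recipe, and it assumes the rest of the distillation SFT pipeline stays unchanged:

```python
# Illustrative only: pointing the distillation student at the coder variant.
from transformers import AutoModelForCausalLM, AutoTokenizer

# "Qwen/Qwen2.5-Coder-7B-Instruct" instead of "Qwen/Qwen2.5-7B-Instruct";
# the 7B size is just an example.
base_id = "Qwen/Qwen2.5-Coder-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

# From here the SFT on R1 reasoning traces would proceed as with the standard
# Instruct checkpoint; only the starting weights differ.
```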
2. LoRA vs full fine-tuning
I see the project uses full fine-tuning for the distilled models. For comparison, the kind of LoRA setup I have in mind is sketched below.
Has anyone compared LoRA approaches against full fine-tuning in this context?
What performance differences might we expect?
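To make the comparison concrete, this is the kind of LoRA setup I would put up against the full fine-tuning runs. The rank, alpha, dropout, and target modules are placeholder values I picked, not anything taken from the project's configs:

```python
# Illustrative LoRA wrapper via peft, as an alternative to full fine-tuning.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype="auto"
)

lora_config = LoraConfig(
    r=16,              # placeholder rank
    lora_alpha=32,     # placeholder scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections only
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the weights are trainable
```

My intuition is that the adapters would be far cheaper to train, but I don't know how much of the distilled reasoning quality survives, which is exactly what I'm hoping someone has measured.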
3. System prompts in different training strategies
I've noticed differences in how system prompts are handled across the training strategies; the sketch below shows the two layouts I'm comparing.
Why aren't system prompts needed for SFT distillation?
Is this a general principle or specific to this implementation?
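Here is a minimal sketch of the two layouts. The prompt text and the system message are made up by me purely for illustration; the only point is the presence or absence of the system turn:

```python
# Illustrative only: the same user turn rendered with and without a system prompt.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Layout A: an SFT distillation sample with no system message; the reasoning
# format appears only in the target completion.
sft_messages = [
    {"role": "user", "content": "Classify this code snippet: ..."},
    {"role": "assistant", "content": "<think> ... reasoning trace ... </think> label"},
]

# Layout B: a prompt with an explicit system message asking for the reasoning
# format (my assumption about what the other training strategies rely on).
prompted_messages = [
    {"role": "system", "content": "Reason step by step inside <think> tags, then give the answer."},
    {"role": "user", "content": "Classify this code snippet: ..."},
]

print(tokenizer.apply_chat_template(sft_messages, tokenize=False))
print(tokenizer.apply_chat_template(prompted_messages, tokenize=False, add_generation_prompt=True))
```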