Training Small Language Models with Knowledge Distillation
Official pre-trained models and baselines in:

- MiniLLM: Knowledge distillation of LLMs during instruction tuning.
- MiniPLM: Knowledge distillation of LLMs during pre-training.
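Both projects share the same underlying idea: training a small student model to match a larger teacher's output distribution. The sketch below shows a generic token-level distillation loss in PyTorch for illustration only; it is not the specific objective used by either MiniLLM or MiniPLM, and the function name and temperature value are placeholders.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic token-level knowledge distillation loss (illustrative sketch,
    not the objective used in MiniLLM or MiniPLM)."""
    # Soften both next-token distributions with a temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student); scale by T^2 to keep gradient magnitudes comparable.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```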