Design a Student Model with <10,000 Parameters for Knowledge Distillation (Accuracy >98%) #3

@dong679

Background

To enable efficient deployment, we need to design a student model for knowledge distillation from the current teacher model. The student model must meet the following constraints:

  • Parameter count < 10,000
  • Target accuracy >98% (on the same evaluation metric as the teacher)

Requirements

  • Propose and implement a student model architecture (CNN, MLP, LTC, or a hybrid) with strictly fewer than 10,000 total parameters.
  • Integrate the full knowledge distillation pipeline: teacher inference, soft target KD loss, optional feature/attention distillation, and hard target loss.
  • Provide a parameter-count calculation and a script that verifies it.
  • Experiment with various student model designs (different channel/hidden sizes, number of layers, etc.) to achieve the best trade-off between size and accuracy.
  • Document all architectural choices, hyperparameters, and training tricks used to reach >98% accuracy.
  • Provide training logs, learning curves, and test accuracy.
  • Discuss any limitations and suggest further improvements.
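
As a rough illustration of the soft-target and hard-target terms named in the requirements, here is a minimal, framework-free sketch of a per-sample distillation loss. It assumes the commonly used formulation: KL divergence between temperature-softened teacher and student distributions, scaled by T², blended with cross-entropy on the true label. The function names and the default values `temperature=4.0` and `alpha=0.7` are hypothetical placeholders, not settings taken from this project; feature/attention distillation is not covered.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]

def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.7):
    """Per-sample KD loss: alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE.

    temperature and alpha are illustrative defaults (assumptions), to be tuned.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # Soft-target term: KL divergence from the softened teacher to the student.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))
    # Hard-target term: cross-entropy against the ground-truth label at T = 1.
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    print(kd_loss([0.2, 1.5, -0.3], [0.1, 3.0, -1.0], label=1))
```

The T² factor keeps the gradient magnitude of the soft term roughly comparable across temperatures; in a real training script the same expression would be computed batch-wise with the framework's own log-softmax and KL primitives.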

Acceptance Criteria

  • Student model Python code (<10,000 parameters, with calculation)
  • Training script with knowledge distillation (can reuse teacher code as needed)
  • Achieved >98% accuracy on the target dataset
  • Documentation: architecture, parameter count, training details, and results
  • (Optional) Visualization: confusion matrix, feature t-SNE, etc.
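
The parameter-count calculation asked for above could be sketched as follows. The layer configuration is purely hypothetical (a three-layer CNN on a single-channel input with 10 output classes, global average pooling before the classifier); the issue does not specify the dataset or input shape, so these sizes are assumptions used only to show the arithmetic. Normalization layers, if added, contribute extra parameters that this sketch does not count.

```python
PARAM_BUDGET = 10_000  # the issue requires strictly fewer than this

def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + (out_ch if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Weights (in * out) plus one bias per output unit."""
    return in_features * out_features + (out_features if bias else 0)

# Hypothetical student: conv(1->8) -> conv(8->16) -> conv(16->32) -> GAP -> fc(32->10)
STUDENT_LAYERS = [
    ("conv1", conv2d_params(1, 8, 3)),     # 8*1*9   + 8  = 80
    ("conv2", conv2d_params(8, 16, 3)),    # 16*8*9  + 16 = 1168
    ("conv3", conv2d_params(16, 32, 3)),   # 32*16*9 + 32 = 4640
    ("fc",    linear_params(32, 10)),      # 32*10   + 10 = 330
]

def check_budget(layers, budget=PARAM_BUDGET):
    """Return the total parameter count, raising if it is not strictly below budget."""
    total = sum(count for _, count in layers)
    if total >= budget:
        raise ValueError(f"{total} parameters exceeds the budget of < {budget}")
    return total

if __name__ == "__main__":
    for name, count in STUDENT_LAYERS:
        print(f"{name:6s} {count:6d}")
    print("total ", check_budget(STUDENT_LAYERS))
```

The hand calculation should be cross-checked against the instantiated model; in PyTorch, for example, `sum(p.numel() for p in model.parameters())` gives the actual count. Whether a network of this size reaches the accuracy target is an open question that the experiments need to answer.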

Notes

  • If necessary, use a teacher assistant (two-stage distillation) or FitNet-style intermediate feature distillation to gain additional accuracy.
  • Use strong regularization and data augmentation to help small models generalize.
  • If the goal cannot be reached, document the best result achieved and the obstacles encountered.

Labels: enhancement, question
