Balancing Knowledge Distillation for Imbalance Learning with Bilevel Optimization
Signal: 72
Hype: 15
In three lines
BiKD introduces a bilevel framework to dynamically balance hard and soft losses in knowledge distillation on imbalanced data. A weight generation network produces adaptive per-sample weights guided by a small balanced validation set. Experiments on long-tailed CIFAR-10/100 show improvements over recent balanced distillation methods.
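To make "balancing hard and soft losses with per-sample weights" concrete, here is a minimal sketch of the inner objective only. It assumes the common distillation form (cross-entropy on the label plus a temperature-scaled KL term against the teacher), which the summary does not confirm as BiKD's exact loss. The function names and the hand-supplied `weights` are hypothetical; in BiKD the weights would come from the weight generation network, trained in an outer loop on the balanced validation set, which is not shown here.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def hard_loss(student_logits, label):
    """Cross-entropy of the student against the ground-truth label."""
    return -math.log(softmax(student_logits)[label])

def soft_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: the standard distillation form, not confirmed by the summary.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def weighted_kd_loss(batch, weights, temperature=2.0):
    """Mean of per-sample weighted hard and soft losses.

    batch:   list of (student_logits, teacher_logits, label)
    weights: list of (w_hard, w_soft), one pair per sample. These are
             placeholders for what a weight generation network would emit.
    """
    total = 0.0
    for (s, t, y), (w_hard, w_soft) in zip(batch, weights):
        total += w_hard * hard_loss(s, y) + w_soft * soft_loss(s, t, temperature)
    return total / len(batch)

if __name__ == "__main__":
    batch = [
        ([2.0, 0.5, -1.0], [3.0, 0.0, -2.0], 0),  # head-class sample
        ([0.1, 0.2, 0.0], [-1.0, 0.5, 2.5], 2),   # tail-class sample
    ]
    # Hypothetical weights: lean on the label for the tail-class sample.
    weights = [(0.3, 0.7), (0.8, 0.2)]
    print(weighted_kd_loss(batch, weights))
```

With fixed weights this reduces to ordinary weighted distillation. The bilevel part of the method lies in learning those weights from validation performance rather than setting them by hand.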
Summary generated by Claude — human-verified