Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap
Signal
75
Hype
15
In three linesSix modern tabular foundation models form a highly redundant ensemble (mean Q-statistic 0.961). On 153 OpenML classification tasks, the best ensemble (two-level cascade stacking) gains +0.18% accuracy at 253× compute cost. Friedman-Nemenyi analysis places three ensembles and the best single model in the same equivalence group. Greedy selection is recommended as practical default.Read source
Your take?
Summary generated by Claude — human-verified