arXiv cs.LG · 27 May 2026

Online Learning on Hidden-Convex Losses via Algorithmic Equivalence: Optimal Regret, Geometric Barrier, and Bandit Feedback


In three lines: A study of adversarial online learning on hidden-convex losses (nonconvex losses that become convex after a reparameterization). The authors prove that online gradient descent (OGD) achieves the optimal Θ(√T) regret, improving on the prior O(T^(2/3)) result. They characterize a necessary-and-sufficient Hessian compatibility condition and extend the analysis to bandit feedback, with O(T^(3/4)) regret.
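To make "hidden-convex" and "regret" concrete, here is a minimal sketch of textbook projected OGD on a toy one-dimensional loss. Everything in it is an illustrative assumption rather than the paper's construction: the loss f_t(x) = (exp(x) - a_t)^2 (nonconvex in x on part of [-1, 1], but convex in u = exp(x)), the random rather than adversarial targets a_t, the step size 1/√T, and the function names.

```python
import math
import random


def loss(x, a):
    """Toy hidden-convex loss: (exp(x) - a)^2 is convex in u = exp(x), not in x."""
    return (math.exp(x) - a) ** 2


def grad(x, a):
    """Derivative of the toy loss with respect to x (chain rule)."""
    return 2.0 * (math.exp(x) - a) * math.exp(x)


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return min(hi, max(lo, x))


def run_ogd(targets, x0=0.0):
    """Projected OGD in the original x coordinates; returns iterates and total loss."""
    eta = 1.0 / math.sqrt(len(targets))  # assumed step size, not tuned
    x = x0
    iterates, total = [], 0.0
    for a in targets:
        iterates.append(x)          # play x_t
        total += loss(x, a)         # suffer f_t(x_t)
        x = project(x - eta * grad(x, a))  # gradient step, then project
    return iterates, total


def best_fixed_loss(targets, lo=-1.0, hi=1.0):
    """Loss of the best fixed point in hindsight.

    In u = exp(x) the cumulative loss is a convex quadratic, so its minimizer
    over [exp(lo), exp(hi)] is the mean target clipped to that interval.
    """
    u = sum(targets) / len(targets)
    u = min(math.exp(hi), max(math.exp(lo), u))
    return sum((u - a) ** 2 for a in targets)


rng = random.Random(0)
targets = [rng.uniform(0.5, 2.5) for _ in range(2000)]  # stochastic, not adversarial
iterates, ogd_loss = run_ogd(targets)
regret = ogd_loss - best_fixed_loss(targets)
print(f"average regret per round: {regret / len(targets):.4f}")
```

Regret here is the learner's cumulative loss minus that of the best fixed decision in hindsight; a Θ(√T) bound means the per-round average shrinks as T grows. The sketch only illustrates the setting and says nothing about the paper's proof technique or its Hessian compatibility condition.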



Summary generated by Claude — human-verified
