Testing robustness against unforeseen adversaries
OpenAI introduces a method to assess neural network classifier robustness against adversarial attacks unseen during training. The UAR (Unforeseen Attack Robustness) metric measures a single model's ability to withstand unanticipated attacks and emphasizes the need for performance evaluation across diverse unforeseen attack scenarios.