Reddit r/MachineLearning · 20 May 2026

NOML-NOML: hierarchical TD3 + anchor policy for flight control [P]

In three lines: NOML is a custom RL algorithm for continuous 6-DoF flight control. It combines TD3 with an anchor policy (a fixed safe action), a hierarchical actor (three independent MLPs ordered pitch → roll → rest), and mirror learning (exploiting left-right symmetry). It is presented as solving the oscillation collapse seen with vanilla TD3. Open-sourced under Apache 2.0.
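To make the three ingredients concrete, here is a minimal stdlib-only sketch. It is not the NOML code: the summary does not say how the heads are wired, how the anchor is used, or which components are mirrored. The sketch assumes each later head also sees the earlier heads' outputs, that the anchor is mixed in with a hypothetical weight `beta`, and that mirroring flips the sign of a hypothetical set of indices `flip_idx`. All names are illustrative, and the weights are random rather than trained.

```python
import math
import random


def make_mlp(n_in, n_hidden, n_out, rng):
    """Random weights for a tiny two-layer tanh MLP (biases omitted for brevity)."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def mlp_forward(params, x):
    """Forward pass; tanh keeps every output in [-1, 1]."""
    w1, w2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in w2]


class HierarchicalActor:
    """Three separate MLPs evaluated in the order pitch -> roll -> rest.

    Assumption: each later head receives the earlier heads' outputs as
    extra inputs. The source only states the ordering.
    """

    def __init__(self, obs_dim, rest_dim, seed=0):
        rng = random.Random(seed)
        self.pitch = make_mlp(obs_dim, 8, 1, rng)
        self.roll = make_mlp(obs_dim + 1, 8, 1, rng)
        self.rest = make_mlp(obs_dim + 2, 8, rest_dim, rng)

    def act(self, obs):
        pitch = mlp_forward(self.pitch, obs)
        roll = mlp_forward(self.roll, obs + pitch)
        rest = mlp_forward(self.rest, obs + pitch + roll)
        return pitch + roll + rest  # list concatenation: [pitch, roll, *rest]


def blend_with_anchor(action, anchor, beta):
    """Mix the learned action with a fixed safe action.

    Assumption: a convex combination; beta=0 returns the anchor,
    beta=1 returns the learned action unchanged.
    """
    return [beta * a + (1.0 - beta) * s for a, s in zip(action, anchor)]


def mirror(vec, flip_idx):
    """Left-right mirror: negate the components listed in flip_idx (hypothetical)."""
    return [-v if i in flip_idx else v for i, v in enumerate(vec)]


actor = HierarchicalActor(obs_dim=4, rest_dim=2)
obs = [0.1, -0.2, 0.3, 0.0]
action = actor.act(obs)                       # 4 values: pitch, roll, 2 x rest
safe = blend_with_anchor(action, [0.0] * 4, beta=0.5)
mirrored_obs = mirror(obs, flip_idx={1})      # e.g. a lateral state component
mirrored_action = mirror(action, flip_idx={1})  # roll flips sign under mirroring
```

In a TD3 training loop, a mirrored pair such as `(mirrored_obs, mirrored_action)` could be stored alongside the original transition so the critic sees both sides of the symmetry. Whether NOML does this, or enforces symmetry some other way, is not stated in the summary.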

Reinforcement learning · Code generation · Robotics · Open source

Summary generated by Claude — human-verified
