NOML-NOML: hierarchical TD3 + anchor policy for flight control [P]
Signal: 72
Hype: 25
In three lines: NOML is a custom RL algorithm for continuous 6-DoF flight control. It combines TD3 with an anchor policy (a fixed safe action), a hierarchical actor (three independent MLPs chained pitch→roll→rest), and mirror learning (left-right symmetry). It solves the oscillation collapse of vanilla TD3 and is open-sourced under Apache 2.0.
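As a rough illustration of two of the ideas named above, here is a minimal sketch of a chained pitch→roll→rest actor and a left-right mirror transform. It is not the NOML code. The class `HierarchicalActor`, the helper `mirror`, the layer sizes, the random weights, and the choice of which state indices count as "lateral" are all assumptions made for this example; the summary does not say how the three MLPs are wired or how mirroring is applied during training.

```python
import math
import random

def make_mlp(n_in, n_hidden, n_out, rng):
    """Random weights for a one-hidden-layer MLP (illustrative, untrained)."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2

def mlp_forward(params, x):
    """tanh hidden layer and tanh output, so each action lies in [-1, 1]."""
    w1, w2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [math.tanh(sum(w * hi for w, hi in zip(row, h))) for row in w2]

class HierarchicalActor:
    """Hypothetical actor: three MLPs with separate weights, evaluated in order.

    Assumption: each later head also sees the outputs of the earlier heads.
    """

    def __init__(self, state_dim, rest_dim, hidden=16, seed=0):
        rng = random.Random(seed)
        self.pitch_net = make_mlp(state_dim, hidden, 1, rng)
        self.roll_net = make_mlp(state_dim + 1, hidden, 1, rng)
        self.rest_net = make_mlp(state_dim + 2, hidden, rest_dim, rng)

    def act(self, state):
        pitch = mlp_forward(self.pitch_net, state)               # pitch from state
        roll = mlp_forward(self.roll_net, state + pitch)         # roll from state + pitch
        rest = mlp_forward(self.rest_net, state + pitch + roll)  # remaining channels
        return pitch + roll + rest

def mirror(vec, lateral_idx):
    """Flip the sign of left-right (lateral) components; leave the rest unchanged."""
    return [-v if i in lateral_idx else v for i, v in enumerate(vec)]

actor = HierarchicalActor(state_dim=6, rest_dim=2)
state = [0.1, -0.2, 0.05, 0.0, 0.3, -0.1]
action = actor.act(state)                       # [pitch, roll, rest_0, rest_1]
mirrored_state = mirror(state, lateral_idx={1, 3})
```

One plausible use of `mirror` is data augmentation: store each transition together with its mirrored copy (lateral state components and lateral action channels negated), so the policy sees both left and right variants of every maneuver. Whether NOML does this or enforces symmetry another way is not stated in the summary.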
Summary generated by Claude — human-verified