Convergence in Gradient Flows
This note studies continuous-time approximations of adaptive optimization schemes. Under smoothness assumptions and bounded preconditioners, we show that the energy decays monotonically within a restricted parameter regime.
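To make the decay claim concrete, here is a minimal worked derivation under assumed notation that does not appear in the note itself: an objective f serving as the energy, a symmetric preconditioner P(t), and bounds 0 < m <= M. It is a sketch of the standard argument, not necessarily the exact setting analyzed here.

```latex
% Assumed (hypothetical) setup: preconditioned gradient flow with a
% symmetric preconditioner bounded in the Loewner order.
\[
  \dot{x}(t) = -P(t)\,\nabla f(x(t)),
  \qquad
  m I \preceq P(t) \preceq M I,
  \qquad
  0 < m \le M .
\]
% Chain rule along the trajectory, then the lower bound on P(t):
\[
  \frac{d}{dt} f(x(t))
  = \nabla f(x(t))^{\top} \dot{x}(t)
  = -\nabla f(x(t))^{\top} P(t)\, \nabla f(x(t))
  \le -m \,\lVert \nabla f(x(t)) \rVert^{2}
  \le 0 .
\]
```

In this sketch only the lower bound m is used, so the energy is non-increasing along the flow. The upper bound M typically becomes relevant once the flow is discretized with a finite step size.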
The key question is when adaptivity changes only the constants in the convergence rate and when it changes the qualitative convergence profile.
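One way to probe this question numerically is to compare a plain gradient flow with a clipped adaptive one on a toy problem. The sketch below is illustrative only: the diagonal quadratic energy, the forward-Euler discretization, the clipped inverse-gradient preconditioner, and all names and parameter values (`simulate`, `p_min`, `p_max`, `h`) are assumptions chosen for the example, not the scheme analyzed in this note.

```python
# Toy comparison of plain vs. clipped-adaptive gradient flow (forward Euler).
# Everything here is a hypothetical illustration, not the note's method.

def energy(x, a):
    """Quadratic energy f(x) = 0.5 * sum_i a_i * x_i^2."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def grad(x, a):
    """Gradient of the quadratic energy, coordinate by coordinate."""
    return [ai * xi for ai, xi in zip(a, x)]


def clipped_preconditioner(g, p_min, p_max, eps=1e-8):
    """Diagonal preconditioner 1 / (|g_i| + eps), clipped to [p_min, p_max]."""
    return [min(p_max, max(p_min, 1.0 / (abs(gi) + eps))) for gi in g]


def simulate(x0, a, h=0.01, steps=2000, p_min=0.1, p_max=2.0, adaptive=True):
    """Euler steps x <- x - h * P * grad f(x); returns the energy trajectory."""
    x = list(x0)
    energies = [energy(x, a)]
    for _ in range(steps):
        g = grad(x, a)
        if adaptive:
            p = clipped_preconditioner(g, p_min, p_max)
        else:
            p = [1.0] * len(x)  # identity preconditioner: plain gradient flow
        x = [xi - h * pi * gi for xi, pi, gi in zip(x, p, g)]
        energies.append(energy(x, a))
    return energies


if __name__ == "__main__":
    a = [1.0, 10.0]    # curvatures of the toy quadratic
    x0 = [3.0, -2.0]   # arbitrary starting point
    plain = simulate(x0, a, adaptive=False)
    adapt = simulate(x0, a, adaptive=True)
    for k in (0, 100, 500, 2000):
        print(k, plain[k], adapt[k])
```

With these values, h * p_max * max(a) = 0.2 < 1, so every coordinate contracts at each step and both energy trajectories are non-increasing. Comparing the two printed columns gives a rough sense of whether the clipped preconditioner merely rescales the decay rate or also alters the shape of the transient.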