A note about this: much of the optimal control theory work surrounding Pontryagin's Principle consists of theoretical building blocks whose primary utility is mathematical analysis.
In practice, these conditions lead to boundary-value problems that are impractical to solve for systems of non-trivial size, so they are rarely implemented directly. Occasionally they are used to construct parameterized solutions for extremum control (e.g. NCO tracking) of very small systems, but such cases are rare; one runs into dimensionality issues very quickly.
In industrial control systems, optimal control models are almost always discretized and the optimization is done on algebraic systems of equations. Linear algebra dominates there.
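To make the "discretize, then do linear algebra" point concrete, here is a minimal sketch of a finite-horizon discrete LQR problem solved by a backward Riccati recursion. The system (a double integrator), step size, weights, and horizon are all toy values of my own choosing, not from any particular industrial application:

```python
import numpy as np

# Toy system (assumed): a double integrator discretized with step dt,
# steered toward the origin by finite-horizon LQR. The continuous-time
# optimal control problem reduces to repeated linear algebra: a backward
# Riccati recursion for the gains, then a forward simulation.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])   # discretized dynamics: x+ = A x + B u
B = np.array([[0.5 * dt**2], [dt]])
Q = np.eye(2)                            # state cost weight
R = np.array([[0.1]])                    # input cost weight
N = 50                                   # horizon length

# Backward Riccati recursion for the cost-to-go matrices and gains.
P = Q.copy()
gains = []
for _ in range(N):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # feedback gain
    P = Q + A.T @ P @ (A - B @ K)
    gains.append(K)
gains.reverse()  # gains[k] is the gain applied at step k

# Forward rollout from an initial condition; the state decays to zero.
x = np.array([[1.0], [0.0]])
for K in gains:
    u = -K @ x           # time-varying state feedback
    x = A @ x + B @ u
```

Every step here is a matrix solve or multiply — no boundary-value problem in sight.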
> ... are theoretical building blocks whose primary utility is for mathematical analysis.
When you say 'mathematical analysis' here, is that just a broader category that encompasses e.g. real and complex analysis? Or something else?
Yes, that is what I meant [1]. For example, you can use ideas from differential equations/analysis to determine, say, the existence and uniqueness of solutions of continuous ODEs. Pontryagin's Principle, and the calculus of variations in general, give you theoretical machinery for working with models in analytic form. Some problems, such as minimum-time optimization, are more tractable in continuous-time form than in discrete time. In some simple (unconstrained) cases you can also derive the set of closed-form optimal solution trajectories and analyze it directly.
Once the models are transformed into discrete form for numerical solution, the tools lie more in the realm of linear algebra (positive definiteness of Hessians, etc.).
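As a small illustration of that kind of linear-algebra check (toy numbers of my own, not from any real problem): stack the stage costs of a discretized problem into one block-diagonal Hessian and verify positive definiteness via a Cholesky factorization, which succeeds exactly when the matrix is symmetric positive definite:

```python
import numpy as np

# Assumed per-stage quadratic weights on (x1, x2, u); the stacked
# Hessian of the discretized cost 0.5 * z^T H z is block diagonal.
N = 4                                     # horizon length (assumed)
stage = np.diag([1.0, 0.5, 0.1])          # per-stage weights
H = np.kron(np.eye(N), stage)             # stacked Hessian

def is_positive_definite(M):
    """Cholesky succeeds iff M is symmetric positive definite."""
    try:
        np.linalg.cholesky(M)
        return True
    except np.linalg.LinAlgError:
        return False
```

If the check passes, the discretized problem is strictly convex and any stationary point is the unique minimizer.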
[1] https://en.wikipedia.org/wiki/Mathematical_analysis
Industry usually means "using math to analyze" when they say "mathematical analysis."
Craig Evans (the author) is the most selfless mathematician I've ever studied under, and hands down a life-changing teacher. These notes, like all his teaching notes, are magnificent.
>Those comments explain how to reformulate the Pontryagin Maximum Principle for abnormal problems.
Please tell? I have yet to see anyone give a satisfactory approach on how to deal with the abnormal case.
For context: there are sometimes optimal solutions that are not characterized by Pontryagin's Maximum Principle (PMP). An analogous situation can occur with Lagrange multipliers: in the abnormal case, the multiplier on the objective is zero, so the necessary conditions carry no information about the objective functional. I suspect the situation is worse with the PMP because you are now in a continuous setting. I think [0] offers some good discussion of the abnormal case for Lagrange multipliers.
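A minimal textbook-style illustration of the abnormal case for Lagrange multipliers (my own example, not taken from [0]):

```latex
\min_{x \in \mathbb{R}} \; f(x) = x
\quad \text{subject to} \quad g(x) = x^2 = 0.
```

The only feasible point is $x^* = 0$, so it is trivially the minimizer, yet $f'(0) = 1$ while $g'(0) = 0$: no $\lambda$ satisfies $f'(0) = \lambda\, g'(0)$. The Fritz John condition $\lambda_0 f'(x^*) + \lambda_1 g'(x^*) = 0$ holds only with $\lambda_0 = 0$, i.e. the multiplier on the objective vanishes and the stationarity condition says nothing at all about $f$.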
I would be interested if anyone has made any recent progress in dealing with the abnormal case for the PMP.
[0] Optimality Conditions: Abnormal and Degenerate Problems, by A.V. Arutyunov
Trying to dig into Optimal Control Theory a bit, after realizing that - in many ways - OCT and (certain aspects of) Machine Learning are just opposite sides of the same coin. Reinforcement Learning in particular shares a lot of concepts with OCT.
For more on that subject, check out this recent survey on RL and OCT by Ben Recht, also at UC Berkeley: https://arxiv.org/abs/1806.09460
Ben Recht also has an excellent series of blog posts (very related to this survey on arXiv, but broader) on the intersection between reinforcement learning and optimal control. An index is available here: http://www.argmin.net/2018/06/25/outsider-rl/
I was just reading those last night. Definitely good stuff.
Ben Recht also gave a 2-hour tutorial on "Optimization Perspectives on Learning to Control" at ICML on 10 July. It was a great talk, loosely based on his blog posts, and very popular, with every seat filled.
His slides, references, and FB livestream video are here:
https://people.eecs.berkeley.edu/~brecht/l2c-icml2018/
But isn't OCT more rigorous, with proofs/guarantees and such, and ML more experimental?
Depends on what you mean by rigorous and who you are talking to.
Both fields are attempting to solve the same problem: choose the optimal action to take at the current time for a given process. Control theorists normally start out with a model, or a family of potential models, that describes the behavior of the process and work from there to determine the optimal action. This is very much an area of applied mathematics, and academics take rigorous approaches, but in industry many engineers just use a PID or LQR controller and call it a day, regardless of how applicable it is to the actual system theoretically.
Meanwhile, the reinforcement learning folk typically work on problems where the models are too complicated to work with computationally, or often even to write down, so a more tractable approach is to learn a model and control policy from data. There are plenty of people who analyze properties of learning algorithms within this framework, and others who don't really care beyond whether or not the system works.
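For a concrete taste of the "learn a policy from data" side, here is a minimal tabular Q-learning sketch on a toy problem of my own invention (a 5-state corridor with a reward at the rightmost state). The learner never sees the transition model; it improves purely from sampled (s, a, r, s') experience:

```python
import random

N_STATES, ACTIONS = 5, (-1, +1)          # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1        # step size, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(500):
    s = random.randrange(N_STATES - 1)   # exploring starts
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)   # walls clip movement
        r = 1.0 if s_next == N_STATES - 1 else 0.0  # reward only at the goal
        # Q-learning update: bootstrap from the best next action
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned greedy policy moves right in every interior state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)])
          for s in range(N_STATES - 1)}
```

No dynamics model appears anywhere — only sampled transitions.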
Control theorists normally start out with a model, or a family of potential models that describe the behavior of the process and work from there to determine the optimal action.
This is the main distinction I've been exposed to, between Optimal Control and Reinforcement Learning. I've heard it summarized as "Optimal Control uses models, Reinforcement Learning tries very hard to stay away from using models". That's probably simplifying things a little bit too much, but it seems like a reasonable starting point to see where the two fields diverge.
Yeah, that's the gist of it. There are things like adaptive control, where aspects of the model are adjusted on the fly in real time to improve performance based on data from the system, and robust control, which tries to account for modeling error.
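A minimal sketch of the "adjust the model on the fly" idea (all numbers assumed, my own toy example): recursive least squares (RLS) updates a parameter estimate of an unknown linear-in-parameters plant as each new measurement arrives:

```python
import numpy as np

# Assumed plant: y_k = theta^T phi_k + noise, with theta unknown.
# RLS refines theta_hat one measurement at a time, which is the core
# mechanism behind many adaptive control schemes.
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.7])       # unknown plant parameters
theta_hat = np.zeros(2)                  # running estimate
P = 1000.0 * np.eye(2)                   # estimate covariance

for k in range(200):
    phi = rng.standard_normal(2)         # regressor (e.g. past y's and u's)
    y = theta_true @ phi + 0.01 * rng.standard_normal()
    # RLS update equations
    K = P @ phi / (1.0 + phi @ P @ phi)  # gain vector
    theta_hat = theta_hat + K * (y - theta_hat @ phi)
    P = P - np.outer(K, phi @ P)
```

A real adaptive controller would then recompute its control law from `theta_hat` each step; this sketch shows only the estimation half.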
Reinforcement learning is direct adaptive optimal control
https://ieeexplore.ieee.org/document/126844/?reload=true
Non-paywalled link:
http://www.ieeecss.org/CSM/library/1992/april1992/w01-Reinfo...
Is this a very crude summary of Pontryagin's principle? Basically, you use Lagrange multipliers to solve a constrained optimization.
In contrast, dynamic programming is based on stitching together optimal sub-solutions.
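The "stitching together optimal sub-solutions" idea can be sketched in a few lines with value iteration on a toy deterministic shortest-path graph (my own example): the optimal cost-to-go of each node is built from the already-optimal costs of its successors, per Bellman's principle:

```python
# node -> {successor: edge cost}; "D" is the goal
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"C": 1.0, "D": 5.0},
    "C": {"D": 1.0},
    "D": {},
}
V = {n: float("inf") for n in graph}
V["D"] = 0.0  # zero cost-to-go at the goal

# Value iteration: repeatedly apply the Bellman backup until stable.
for _ in range(len(graph)):
    for n, succ in graph.items():
        if succ:
            V[n] = min(cost + V[m] for m, cost in succ.items())
```

Here the optimal cost from "A" (via B and C) emerges purely by composing the optimal costs of the sub-problems, with no global trajectory ever treated as a single variational object.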