My research lies at the intersection of optimization and control, two fields that play a vital role in cyber-physical systems (CPS) applications but that tend to have very different goals and tools. My research thrust is to leverage the strengths of both fields, with the ultimate goal of enabling safe and efficient complex systems. I have participated in several projects that build toward this goal; a brief summary of each is provided below, followed by an expanded description.
The next generation of complex engineered systems will see an unprecedented integration of electromechanical components, communication, and embedded computation. Examples include:
- self-driving cars: cameras and other sensors are being developed to enable autonomous driving. Paired with a communication network that allows cars to receive information from control stations or from other cars, this advance promises to reduce accidents and ease traffic congestion.
- smart grid: when a home equipped with solar panels can act as both an energy consumer and an energy generator depending on time of day, communication and computing enhancements will be needed to ensure efficiency and reliability of the overall system.
There are many other emerging applications as well, such as net-zero energy buildings, personalized medicine, robotic or remote surgery, sensor networks for monitoring earthquakes or animal migration, and autonomous robotics for scientific exploration of space, the deep sea, or other harsh environments. These breakthroughs are now a real possibility thanks to the increasing accessibility of cheap and efficient computation, storage, sensing, and wireless communication. One need only reach for one’s smartphone to find a perfect example of these technological advancements!
Systems that integrate electromechanical components, a communication network, and embedded computation are nowadays called cyber-physical systems (CPS). A major challenge in CPS development is that many areas of engineering are involved: mechanical, electrical, software, communications, networking, controls, machine learning, and computing. There are also many objectives: robustness, resilience, safety, performance, efficiency, scalability, security, and privacy.
Control theory is the natural choice for addressing requirements such as robust performance and safety, because CPSs are fundamentally control systems: dynamical systems equipped with sensing, actuation, and feedback. In contrast, large-scale optimization is well suited to handle speed and scalability needs, because CPSs are also distributed computing platforms: large-scale systems with multiple computing nodes, a communication network, and access to vast amounts of data.
Working at the intersection of these fields means that control theory must be reexamined in a context where multiple decision makers operate in a data-rich environment; scalability and modularity are a must. Likewise, large-scale optimization theory must be enhanced to accommodate feedback, dynamics, and an appropriate notion of robustness to uncertainty.
Optimization as robust control
(with B. Recht, A. Packard, and others)
The problem of selecting a suitable algorithm for use in large-scale optimization is currently more of an art than a science. In this work, we show that algorithm selection (and design!) can be cast as a robust control problem. The resulting framework unifies many existing results in optimization theory and provides a principled way to analyze or design algorithms based on practical characteristics such as robustness to noise.
Large-scale optimization is the main engine behind machine learning, computer vision, and other data-intensive applications. Because of the high dimensionality of the data and the emphasis on speed, it is typical to use simple first-order or proximal point methods. Our key observation was that such iterative algorithms may be viewed as discrete-time controllers, and the function to be optimized may be viewed as an uncertain plant; in essence, large-scale optimization is a Lur’e problem. Combined with a suitably modified integral quadratic constraint (IQC) theory, this observation led to an efficient and principled method for algorithm analysis.
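To make this viewpoint concrete, below is a minimal sketch for the plain gradient method, x_{k+1} = x_k − α∇f(x_k), viewed as a linear system in feedback with the gradient nonlinearity. It bisects over a candidate convergence rate ρ and checks a small Lyapunov/IQC-style matrix inequality built from the pointwise sector constraint satisfied by ∇f when f is m-strongly convex with L-Lipschitz gradient. Our framework poses this as a small semidefinite feasibility problem; here the Lyapunov variable is normalized to a scalar and the multiplier λ is simply gridded, so the sketch needs only NumPy. It is an illustration under these simplifications, not the full method.

```python
import numpy as np

def certified_rate(alpha, m, L):
    """Bisect for the smallest rho such that the small IQC-style matrix
    inequality certifies that x_{k+1} = x_k - alpha*grad f(x_k) contracts at
    rate rho for every f that is m-strongly convex with L-Lipschitz gradient.
    The Lyapunov variable is normalized to 1 and the multiplier is gridded
    (a brute-force simplification; the analysis itself is dimension-free)."""
    A, B, C = 1.0, -alpha, 1.0          # gradient descent as a feedback system
    # Pointwise sector constraint on u = grad f(y):  [y u] M [y u]^T >= 0
    M = np.array([[-2.0 * m * L, m + L],
                  [m + L,        -2.0 ]])
    Pi = np.array([[C, 0.0],
                   [0.0, 1.0]])
    lambdas = np.linspace(0.0, 10.0, 2001)

    def feasible(rho):
        base = np.array([[A * A - rho**2, A * B],
                         [A * B,          B * B]])
        for lam in lambdas:
            if np.all(np.linalg.eigvalsh(base + lam * Pi.T @ M @ Pi) <= 1e-9):
                return True
        return False

    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if feasible(mid) else (mid, hi)
    return hi

# With alpha = 1/L the certified rate recovers the classical bound 1 - m/L.
print(certified_rate(alpha=0.1, m=1.0, L=10.0))   # ~0.9
```

The same recipe applies to other first-order methods (heavy ball, Nesterov's method) by swapping in their (A, B, C) matrices, which is what makes the approach a unified analysis tool.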
Our method enables the next generation of large-scale optimization algorithms by automating their analysis and design. For example, the gradient method is typically slow to converge but very robust to noise, whereas Nesterov’s accelerated method converges quickly but is fragile to noise. Not only can our method make these empirical observations precise, but it also lets us design new algorithms that explore the full trade-off between peak performance and noise robustness. We plan to extend our analysis to operator-splitting methods such as ADMM, to analyze variable step-size methods such as conjugate gradient using tools from LPV control, and to extend the theory to average-case performance so that randomized algorithms such as stochastic gradient descent can also be analyzed. Our approach is particularly well suited to analyzing interconnections of simpler building blocks, paving the way for a systematic approach to the design of high-performance algorithmic pipelines that are provably safe.
Safety validation for interconnected systems
(with C. Meissen, M. Arcak, and A. Packard)
Many control techniques exist for certifying the safety or performance of a well-modeled system, but these techniques become computationally intractable if the system is anything but modest in size. Large systems, however, are often built by interconnecting smaller subsystems. We developed a scalable distributed optimization methodology that can efficiently certify such interconnected systems.
Large-scale dynamical systems are often interconnections of smaller subsystems. While the behavior of each component may be well characterized, it is typically difficult to certify the performance or safety of the interconnected system. Roughly, our approach was to seek Lyapunov-like certificates for the individual components in such a way that, when combined, they certify the interconnected system. Our novel formulation splits the problem into local certifications that can be parallelized, and a global coordination step that is very efficient. We found that the alternating direction method of multipliers (ADMM), a popular large-scale optimization technique, solved the problem very well on randomly generated interconnected systems. We recently extended the theory to handle polynomial dynamics using sum-of-squares (SOS) as well.
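To illustrate the local/global split, here is a minimal consensus-ADMM skeleton in the same spirit: local updates run in parallel, a cheap averaging step provides global coordination, and a dual update enforces agreement. In our method the local step solves a subsystem's dissipativity certification (an SOS/LMI problem); in this self-contained sketch it is replaced by a toy quadratic proximal update so the code runs with NumPy alone, and the toy objective and variable names are purely illustrative.

```python
import numpy as np

def local_update(z, u_i, rho, A_i, b_i):
    # Toy local step: prox of f_i(x) = 0.5*||A_i x - b_i||^2 at (z - u_i),
    # i.e. argmin_x f_i(x) + (rho/2)*||x - z + u_i||^2.
    n = z.size
    return np.linalg.solve(A_i.T @ A_i + rho * np.eye(n),
                           A_i.T @ b_i + rho * (z - u_i))

def consensus_admm(A, b, rho=1.0, iters=100):
    N, _, n = A.shape
    x = np.zeros((N, n))       # local copies (one per subsystem)
    z = np.zeros(n)            # global (coordinated) variable
    u = np.zeros((N, n))       # scaled dual variables
    for _ in range(iters):
        # local, embarrassingly parallel certification step
        for i in range(N):
            x[i] = local_update(z, u[i], rho, A[i], b[i])
        # cheap global coordination step (averaging for the consensus constraint)
        z = (x + u).mean(axis=0)
        # dual update pushes local and global variables into agreement
        u += x - z
    return z

# Toy data: 4 "subsystems", each with its own local objective.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6, 3))
b = rng.standard_normal((4, 6))
print(consensus_admm(A, b))
# Reference: the centralized solution the consensus iterates approach.
print(np.linalg.lstsq(A.reshape(-1, 3), b.reshape(-1), rcond=None)[0])
```

The appeal of this structure is that only the local steps are expensive, and they never need to see the rest of the interconnection.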
The key feature of our work is that it enables a non-scalable control technique for system verification (such as SOS) to be used in a large-scale fashion via a hierarchical approach. SOS is only used locally, while the global optimizer (in our case, ADMM) coordinates the various agents to design a global safety certificate. Moving forward, we plan to augment our dissipativity-based framework to handle IQC-based system characterizations. This will allow us to treat more realistic system specifications such as variable time delays and frequency-dependent constraints.
Control under information constraints
(with S. Lall, A. Nayyar, and A. Lamperski)
A defining characteristic of large-scale systems is that control decisions must be made in multiple parts of the system, but the decision-makers have access to different information. This could be due to delays in the network or other communication constraints. In this work, we found that certain classes of information-sharing architectures lead to tractable control problems, and we can often solve them explicitly.
Robust and optimal control are well-established tools for industrial controller design. However, no comparable theory exists when there are multiple decision makers. Certain information architectures are tractable and approximate solutions can be found numerically, yet we lack a deep understanding of the structure of optimal controllers or of how to compute them efficiently. In recent work, we solved a fundamental decentralized control problem, which we call the two-player problem, and showed that the optimal controller can be found using state-space techniques reminiscent of the celebrated centralized theory. The solution is elegant, computable, and provides great structural insight. Furthermore, it is the first piece of the solution to a problem that has been open for at least 40 years: a state-space theory for decentralized control. This work received the 2013 O. Hugo Schuck Best Paper Award.
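To make the two-player structure concrete, here is a representative state-feedback instance (an illustration, not the most general setting we treated): the dynamics are block lower-triangular, the controllers face the matching information constraint, and the objective is to minimize a standard quadratic (LQG) cost over all such policies.

```latex
% Two-player structure (state-feedback illustration): triangular dynamics
% with a matching triangular information constraint on the controllers.
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix}
  &= \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix}
     \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
   + \begin{bmatrix} B_{11} & 0 \\ B_{21} & B_{22} \end{bmatrix}
     \begin{bmatrix} u_1(t) \\ u_2(t) \end{bmatrix}
   + w(t), \\[2pt]
u_1(t) &= \gamma_1\bigl(x_1(0),\dots,x_1(t)\bigr), \qquad
u_2(t) = \gamma_2\bigl(x_1(0),\dots,x_1(t),\,x_2(0),\dots,x_2(t)\bigr).
\end{aligned}
```

Controller 1 only sees the history of its own state, while controller 2 sees both histories; the difficulty is that the optimal policies are not simply two copies of the centralized solution.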
Recent work with collaborators includes generalizations to information structures with multiple agents and time delays. The results were also extended to finding optimally robust decentralized controllers by using a risk-averse cost function. As information constraints are a key feature of large-scale systems, understanding them from a theoretical perspective is essential. Now that we have a better understanding of the role of information structures in optimal control, the next step is to explore how these ideas can be scaled to large systems. In particular, we want to design modular controllers that are locally implementable.
Control of ground-based telescopes
(with D. MacMynowski, M. West, S. Lall, and others)
Large ground-based telescopes use adaptive optics (AO) technology to mitigate atmospheric distortion. This works surprisingly well but requires hundreds or thousands of sensors and mirror actuators. AO control is computationally demanding and does not scale well to future, larger telescopes. We designed algorithms for AO that scale much better than conventional techniques without compromising image quality.
We developed efficient and scalable algorithms for adaptive optics, a technology used in all major ground-based telescopes. Roughly speaking, a deformable mirror changes shape to provide real-time corrections that cancel atmospheric distortion. Feedback is provided by a sensor array that measures local gradients of the incoming wavefront. A key feature of our estimation algorithm was the use of multigrid, a scalable technique common in computational fluid dynamics but used here in a feedback setting. We tested our algorithms at the Palomar Observatory in Southern California, whose 5.1-meter telescope has an AO system with 241 actuators and 512 sensors operating in the 1–2 kHz range. Our estimators delivered performance indistinguishable from that of Palomar’s default estimator while requiring a factor of 17 less computation.
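To give a flavor of the multigrid idea, the sketch below is a generic 1-D Poisson V-cycle (not the 2-D wavefront reconstructor we tested at Palomar): smooth the error on the fine grid, restrict the residual to a coarser grid, recursively correct there, and interpolate the correction back. Least-squares reconstruction of a wavefront from slope measurements leads to an analogous Poisson-type problem on the sensor/actuator grid, which is why multigrid scales so well in this setting.

```python
import numpy as np

def apply_A(u, h):
    """Discrete -u'' with zero Dirichlet boundary conditions."""
    Au = 2.0 * u
    Au[1:] -= u[:-1]
    Au[:-1] -= u[1:]
    return Au / h**2

def jacobi(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted-Jacobi smoother (damps high-frequency error)."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 2.0) * (f - apply_A(u, h))
    return u

def restrict(r):
    """Full-weighting restriction to the next coarser grid."""
    return 0.25 * (r[:-2:2] + 2.0 * r[1:-1:2] + r[2::2])

def prolong(ec):
    """Linear interpolation of a coarse-grid correction back to the fine grid."""
    e = np.zeros(2 * ec.size + 1)
    e[1::2] = ec
    e[2:-1:2] = 0.5 * (ec[:-1] + ec[1:])
    e[0], e[-1] = 0.5 * ec[0], 0.5 * ec[-1]
    return e

def v_cycle(u, f, h, pre=2, post=2):
    if u.size <= 3:                       # coarsest grid: solve directly
        A = (2.0 * np.eye(u.size)
             - np.eye(u.size, k=1) - np.eye(u.size, k=-1)) / h**2
        return np.linalg.solve(A, f)
    u = jacobi(u, f, h, pre)              # pre-smoothing
    r = f - apply_A(u, h)                 # fine-grid residual
    ec = v_cycle(np.zeros((r.size - 1) // 2), restrict(r), 2.0 * h, pre, post)
    u = u + prolong(ec)                   # coarse-grid correction
    return jacobi(u, f, h, post)          # post-smoothing

# Solve -u'' = pi^2 sin(pi x) on (0,1); the true solution is sin(pi x).
n = 2**7 - 1
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
for _ in range(10):
    u = v_cycle(u, f, h)
print(np.max(np.abs(u - np.sin(np.pi * x))))   # near discretization-level error
```

Each V-cycle costs only a constant number of operations per grid point, which is the property that translates into the large computational savings quoted above.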
The Palomar Observatory recently installed a new 3000-actuator deformable mirror, and we look forward to testing our algorithms on this more complex system. Based on our estimates, the computational savings of our method on this system will be a factor of roughly 220, and for future large telescopes the projected savings are even greater. The multigrid-based architecture we developed for wavefront estimation could in principle be adapted to other large-scale systems, such as distributed sensor networks and other systems with a mesh-like topology. The recurring challenge, which ties in with my main research thrust, is to build a bridge between highly scalable optimization algorithms and control systems theory.