orcid.org/0000-0001-5389-9361.
For citations, see Google Scholar.
I also have profiles on LinkedIn, ResearchGate, and Mendeley, but these profiles are not frequently updated. All the references below are also available as a single BibTeX file. [Sort by year] [Sort by topic] [Sort by type]
Forthcoming
- Near-optimal constrained feedback control of nonlinear systems via approximate HJB and control barrier functions (arXiv)
M.A. Shahraki and L. Lessard. Preprint.abstract
This paper presents a two-stage framework for constrained near-optimal feedback control of input-affine nonlinear systems. An approximate value function for the unconstrained control problem is computed offline by solving the Hamilton–Jacobi–Bellman equation. Online, a quadratic program is solved that minimizes the associated approximate Hamiltonian subject to safety constraints imposed via control barrier functions. Our proposed architecture decouples performance from constraint enforcement, allowing constraints to be modified online without recomputing the value function. Validation on a linear 2-state 1D hovercraft and a nonlinear 9-state spacecraft attitude control problem demonstrates near-optimal performance relative to open-loop optimal control benchmarks and superior performance compared to control Lyapunov function-based controllers. - Algebraic characterization of equivalence between optimization algorithms (arXiv)
L. Lessard and M. Udell. Preprint.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize equivalence between algorithms that use different oracles that are related via a linear fractional transformation. Prominent examples include linear transformations and function conjugation. - Stochastic LQR design with disturbance preview (arXiv)
J. Liu, L. Lessard, and P. Seiler. Preprint.abstract
This paper considers the discrete-time, stochastic LQR problem with p steps of disturbance preview information where p is finite. We first derive the solution for this problem on a finite horizon with linear, time-varying dynamics and time-varying costs. Next, we derive the solution on the infinite horizon with linear, time-invariant dynamics and time-invariant costs. Our proofs rely on the well-known principle of optimality. We provide an independent proof for the principle of optimality that relies only on nested information structure. Finally, we show that the finite preview controller converges to the optimal noncausal controller as the preview horizon p tends to infinity. We also provide a simple example to illustrate both the finite and infinite horizon results. - The speed-robustness trade-off for first-order methods with additive gradient noise (arXiv)
B. Van Scoy and L. Lessard. Preprint.abstract
We study the trade-off between convergence rate and sensitivity to stochastic additive gradient noise for first-order optimization methods. Ordinary Gradient Descent (GD) can be made fast-and-sensitive or slow-and-robust by increasing or decreasing the stepsize, respectively. However, it is not clear how such a trade-off can be navigated when working with accelerated methods such as Polyak's Heavy Ball (HB) or Nesterov's Fast Gradient (FG) methods, or whether any of these methods can achieve an optimal trade-off. We consider three classes of functions: (1) strongly convex quadratics, (2) smooth strongly convex functions, and (3) nonconvex functions that satisfy a weak notion of strong convexity. For each function class, we present a tractable way to compute convergence rate and sensitivity to additive gradient noise for a broad family of first-order methods, and we present near-Pareto-optimal algorithm designs. Each design consists of a simple analytic update rule with two states of memory, similar to HB and FG. Moreover, each design has a scalar tuning parameter that explicitly trades off convergence rate and sensitivity to additive gradient noise. When tuned as aggressively as possible, our proposed algorithms recover the algorithms with fastest-known convergence rates for each function class. When tuned to be more robust, our algorithms are novel and provide a practical way to control noise sensitivity while maintaining the fastest possible convergence rate. We validate the performance and near-optimality of our designs through numerous numerical simulations.
2025
- State estimation for linear systems with non-Gaussian measurement noise via dynamic programming (doi, arXiv)
M.H. Yoosefian Nooshabadi and L. Lessard. IEEE Conference on Decision and Control (CDC), 5612–5617, Dec 2025. (Rio de Janeiro, Brazil)abstract
We propose a new recursive estimator for linear dynamical systems under Gaussian process noise and non-Gaussian measurement noise. Specifically, we develop an approximate maximum a posteriori (MAP) estimator using dynamic programming and tools from convex analysis. Our approach does not rely on restrictive noise assumptions and employs a Bellman-like update instead of a Bayesian update. Our proposed estimator is computationally efficient, with only modest overhead compared to a standard Kalman filter. Simulations demonstrate that our estimator achieves lower root mean squared error (RMSE) than the Kalman filter and has comparable performance to state-of-the-art estimators, while requiring significantly less computational power. - Adaptive acceleration without strong convexity priors or restarts (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. OPT2025 workshop at the Conference on Neural Information Processing Systems (NeurIPS), Dec 2025. (San Diego, CA)abstract
In this paper, we propose a parameter-free algorithm for smooth and strongly convex objective problems called NAG-free. To our knowledge, NAG-free is the first adaptive algorithm capable of directly estimating the strong convexity parameter without priors or resorting to restart schemes. We prove that NAG-free converges globally at least as fast gradient descent, and achieves accelerated convergence locally around the minimum if the Hessian is locally smooth at the minimum and other mild additional assumptions hold. We present real-world experiments in which NAG-free is competitive with restart schemes and adapts to better local curvature conditions. - Inclusion of concentrated solar thermal power in Northeastern University's mechanical engineering curriculum (doi)
B. Lynch, U. Coskun, G. Kowalski, L. Lessard, Y. Levendis, B. Maheswaran, H. Metghalchi, H. Noorian, R. Sipahi, Y. Tjiptowidjojo, and Y. Yazicioglu. ASME Open Journal of Engineering, 4, Sep 2025.abstract
Global energy consumption continues to surge, demanding a transition from fossil fuels to cleaner and more sustainable alternatives. A variety of renewable energy sources—solar, wind, hydro, and geothermal—are critical to this transformation, with each offering diverse and regionally adaptive solutions. Among these sources, solar energy has become a dominant force through both photovoltaic (PV) and solar thermal technologies. While PV systems remain the leading force in regard to rapid deployment and decentralized applications, concentrated solar thermal power (CSTP) systems offer a unique advantage of thermal energy storage. Thermal energy storage offers an affordable and efficient form of dispatchable electricity generation and industrial process heat. Despite its benefits, CSTP remains a niche and is vastly underrepresented in engineering curricula across the United States. This article presents a comprehensive initiative at Northeastern University to address this educational gap by systematically institutionalizing CSTP content across nine mechanical engineering courses from the first year through the graduate level. Through hands-on projects, advanced simulations, and heliostat-focused design challenges, engineering students gain practical and theoretical exposure to CSTP technologies. By aligning curriculum development with the goals of the Department of Energy (DOE) and Heliostat Consortium (HelioCon), Northeastern University establishes a replicable model for integrating CSTP education and preparing a new generation of engineers to meet the growing demands of the clean energy transition. - Spacecraft attitude control under reaction wheel constraints using control Lyapunov and control barrier functions
(doi, arXiv)
M.A. Shahraki and L. Lessard. American Control Conference (ACC), 940–945, Jul 2025. (Denver, CO)abstract
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control. - The fastest known first-order method for minimizing twice continuously differentiable smooth strongly convex functions (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 9, 655–660, Jun 2025.abstract
We consider iterative gradient-based optimization algorithms applied to functions that are smooth and strongly convex. The fastest globally convergent algorithm for this class of functions is the Triple Momentum (TM) method. We show that if the objective function is also twice continuously differentiable, a new, faster algorithm emerges, which we call C²-Momentum (C2M). We prove that C2M is globally convergent and that its worst-case convergence rate is strictly faster than that of TM, with no additional computational cost. We validate our theoretical findings with numerical examples, demonstrating that C2M outperforms TM when the objective function is twice continuously differentiable. - Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning (url, arXiv)
A. Bose, L. Lessard, M. Fazel, and K. Dvijotham. International Conference on Artificial Intelligence and Statistics (AISTATS), May 2025. (Mai Khao, Thailand)abstract
The rise of foundation models fine-tuned on human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. Existing research on provable certified robustness against data poisoning attacks primarily focuses on certifying robustness for static adversaries who modify a fraction of the dataset used to train the model before the training algorithm is applied. In practice, particularly when learning from human feedback in an online sense, adversaries can observe and react to the learning process and inject poisoned samples that optimize adversarial objectives better than when they are restricted to poisoning a static dataset once, before the learning algorithm is applied. Indeed, it has been shown in prior work that online dynamic adversaries can be significantly more powerful than static ones. We present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean estimation problem and binary classification problems and outline directions for extending this in further work. The code to implement our certificates and replicate our results is available at https://github.com/Avinandan22/Certified-Robustness - Adaptive backtracking line search (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. International Conference on Learning Representations (ICLR), Apr 2025. (Singapore)abstract
Backtracking line search is foundational in numerical optimization. The basic idea is to adjust the step size of an algorithm by a constant factor until some chosen criterion (e.g. Armijo, Goldstein, Descent Lemma) is satisfied. We propose a new way for adjusting step sizes, replacing the constant factor used in regular backtracking with one that takes into account the degree to which the chosen criterion is violated, without additional computational burden. For convex problems, we prove adaptive backtracking requires fewer adjustments to produce a feasible step size than regular backtracking does for two popular line search criteria: the Armijo condition and the descent lemma. For nonconvex smooth problems, we additionally prove adaptive backtracking enjoys the same guarantees of regular backtracking. Finally, we perform a variety of experiments on over fifteen real world datasets, all of which confirm that adaptive backtracking often leads to significantly faster optimization.
2024
- Model predictive planning: Trajectory planning in obstruction-dense environments for low-agility aircraft (doi, arXiv)
M.T. Wallace, B. Streetman, and L. Lessard. IEEE Conference on Decision and Control (CDC), 8288–8293, Dec 2024. (Milan, Italy)abstract
We present Model Predictive Planning (MPP), a trajectory planner for low-agility vehicles such as a fixed-wing aircraft to navigate obstacle-laden environments. MPP consists of (1) a multi-path planning procedure that identifies candidate paths, (2) a raytracing procedure that generates linear constraints around these paths to enforce obstacle avoidance, and (3) a convex quadratic program that finds a feasible trajectory within these constraints if one exists. Low-agility aircraft cannot track arbitrary paths, so refining a given path into a trajectory that respects the vehicle's limited maneuverability and avoids obstacles often leads to an infeasible optimization problem. The critical feature of MPP is that it efficiently considers multiple candidate paths during the refinement process, thereby greatly increasing the chance of finding a feasible and trackable trajectory. We demonstrate the effectiveness of MPP on a longitudinal aircraft model. - Stealthy optimal range-sensor placement for target localization (doi, arXiv)
M.H. Yoosefian Nooshabadi, R. Sipahi, and L. Lessard. IEEE Control Systems Letters, 8, 2763–2768, Dec 2024.abstract
We study a stealthy range-sensor placement problem where a set of range sensors are to be placed with respect to targets to effectively localize them while maintaining a degree of stealthiness from the targets. This is an open and challenging problem since two competing objectives must be balanced: (a) optimally placing the sensors to maximize their ability to localize the targets and (b) minimizing the information the targets gather regarding the sensors. We provide analytical solutions in 2D for the case of any number of sensors that localize two targets. - Optimal control of multi-agent systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Transactions on Automatic Control, 69(8), 5141–5153, Aug 2024.abstract
In this article, we consider a cooperative control problem involving dynamically decoupled linear plants. The (output-feedback) controllers for each plant communicate with each other according to a fixed and known network topology, and each transmission incurs a fixed continuous-time processing delay. We provide an explicit closed-form expression for the optimal decentralized controller and its associated cost under these communication constraints and standard linear quadratic Gaussian (LQG) assumptions for the plants and cost function. We find the exact solution without discretizing or otherwise approximating the delays. We also present an implementation of each sub-controller that is efficiently computable, and is composed of standard finite-dimensional linear time-invariant (LTI) and finite impulse response (FIR) components, and has an intuitive observer-regulator architecture reminiscent of the classical separation principle. - Certifying robustness to adaptive data poisoning (url)
A. Bose, M. Udell, L. Lessard, M. Fazel, and K. Dvijotham. FoRLaC workshop at the International Conference on Machine Learning (ICML), Jul 2024. (Vienna, Austria)abstract
The rise of foundational models fine-tuned with human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. While existing research focuses on certifying robustness for static adversaries acting on offline datasets, dynamic attack algorithms have shown to be more effective. Relevant for models with periodic updates where an adversary can adapt based on the algorithm's behavior, such as those in RLHF, we present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean-estimation problem. - Information-theoretic multi-time-scale partially observable systems with inspiration from leukemia treatment (doi, arXiv)
M.P. Chapman, E. Jensen, S.M. Chan, and L. Lessard. Automatica, 163, 111546, May 2024.abstract
We study a partially observable nonlinear stochastic system with unknown parameters, where the given time scales of the states and measurements may be distinct. The proposed setting is inspired by disease management, particularly leukemia. - Orthonormal polynomial bases for airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–9, Mar 2024. (Big Sky, MT)abstract
Shape parameterization plays an important role in the process of Aerodynamic Shape Optimization (ASO). A good parameterization allows the optimizer to easily explore the design space at multiple scales, while not adversely affecting the convergence or accuracy of the solution. Over the last several decades, a wide variety of methods have been proposed and characterized for representing aerodynamic shapes, including various polynomial representations, classes of splines, and radial basis functions. Our research into the application of convex optimization techniques to aerospace design has motivated us to revisit and expand these techniques, particularly for those with convex representations. For such parameterizations, when coupled with convex objective functions and constraints, convergence is guaranteed to globally optimal solutions in polynomial time. This guarantee allows for comparison of the results obtained with different shape representations without potentially adverse complications related to selection of initial conditions, local optima, or noisy gradients. In the course of this work we make several contributions. First, we provide orthonormal extensions to several common polynomial bases. Second, we demonstrate efficient objective function representation for supersonic drag minimization problems when employing an integral form of such orthonormal bases. Finally, we develop an analytic solution to a sample benchmark problem and compare the solutions offered by a finite linear combination of design variables with basis functions, to the underlying continuous result.
2023
- Automated Lyapunov analysis of primal-dual optimization algorithms: An interpolation approach (doi, arXiv)
B. Van Scoy, J.W. Simpson-Porco, and L. Lessard. IEEE Conference on Decision and Control (CDC), 1306–1311, Dec 2023. (Singapore)abstract
Primal-dual algorithms are frequently used for iteratively solving large-scale convex optimization problems. The analysis of such algorithms is usually done on a case-by-case basis, and the resulting guaranteed rates of convergence can be conservative. Here we consider a class of first-order algorithms for linearly constrained convex optimization problems, and provide a linear matrix inequality (LMI) analysis framework for certifying worst-case exponential convergence rates. Our approach builds on recent results for interpolation of convex functions and linear operators, and our LMI directly constructs a Lyapunov function certifying the guaranteed convergence rate. By comparing to rates established in the literature, we show that our approach can certify significantly faster convergence for this family of algorithms. - A tutorial on a Lyapunov-based approach to the analysis of iterative optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3003–3008, Dec 2023. (Singapore)abstract
Iterative gradient-based optimization algorithms are widely used to solve difficult or large-scale optimization problems. There are many algorithms to choose from, such as gradient descent and its accelerated variants such as Polyak's Heavy Ball method or Nesterov's Fast Gradient method. It has long been observed that iterative algorithms can be viewed as dynamical systems, and more recently, as robust controllers. Here, the "uncertainty" in the dynamics is the gradient of the function being optimized. Therefore, worst-case or average-case performance can be analyzed using tools from robust control theory, such as integral quadratic constraints (IQCs). In this tutorial paper, we show how such an analysis can be carried out using an alternative Lyapunov-based approach. This approach recovers the same performance bounds as with IQCs, but with the added benefit of constructing a Lyapunov function. - A tutorial on the structure of distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3009–3014, Dec 2023. (Singapore)abstract
We consider the distributed optimization problem for a multi-agent system. Here, multiple agents cooperatively optimize an objective by sharing information through a communication network and performing computations. In this tutorial, we provide an overview of the problem, describe the structure of its algorithms, and use simulations to illustrate some algorithmic properties based on this structure. - Generalized necessary and sufficient robust boundedness results for feedback systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Transactions on Automatic Control, 68(9), 5693–5698, Sep 2023.abstract
Classical conditions for ensuring the robust stability of a system in feedback with a nonlinearity include passivity, small gain, circle, and conicity theorems. We present a generalized and unified version of these results in an arbitrary semi-inner product space, which avoids many of the technicalities that arise when working in traditional extended spaces. Our general formulation clarifies when the sufficient conditions for robust stability are also necessary, and we show how to construct worst-case scenarios when the sufficient conditions fail to hold. Finally, we show how our general result can be specialized to recover a wide variety of existing results, and explain how properties such as boundedness, causality, linearity, and time-invariance emerge as a natural consequence. - Shape optimization of hypersonic lifting vehicles via convex relaxation (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–12, Jun 2023. (San Diego, CA)abstract
In this paper, we extend our previous approach to the conceptual design of hypersonic aerospace vehicles, using convex optimization, to include lift as a performance indicator and constraint. Our approach employs the transformation of the lift coefficient formula to a concave function, by change of variables. This results in design problems featuring a convex objection function with a single nonconvex constraint, which we address using convex relaxation. Doing so allows us to approach a richer space of design problems which now include minimum lift constraints and treatment of lift to drag ratio within the objective function. We also apply numerical integration directly to the continuous differentials of both lift and drag, versus the panelized approach of our previous work. We demonstrate this approach on problems incorporating lift and drag, and also demonstrate a bilevel optimization problem to obtain a globally optimal maximum lift to drag coefficient vehicle. - Guaranteed stability margins for decentralized linear quadratic regulators (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Control Systems Letters, 7, 1778–1782, May 2023.abstract
It is well-known that linear quadratic regulators (LQR) enjoy guaranteed stability margins, whereas linear quadratic Gaussian regulators (LQG) do not. In this letter, we consider systems and compensators defined over directed acyclic graphs. In particular, there are multiple decision-makers, each with access to a different part of the global state. In this setting, the optimal LQR compensator is dynamic and is similar to a classical LQG compensator; its state can be interpreted as a filter that predicts unobserved parts of the plant state. We show that when sub-controller input costs are decoupled (but there is possible coupling between sub-controller state costs), the decentralized LQR compensator enjoys similar guaranteed stability margins to classical LQR. However, these guarantees disappear when cost coupling is introduced. - Convex optimization of low observability hypersonic vehicles (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–11, Mar 2023. (Big Sky, MT)abstract
In this paper, we propose an approach to the conceptual design of high speed aerospace vehicles that addresses the coupled behavior of hypersonic aerodynamics and radar cross section. Our approach employs convex optimization, a branch of optimization theory that guarantees global optima for problems expressed with convex objective functions and constraints, combined with cubic splines as cross sectional representations. We demonstrate the process of creating convex surrogates using piecewise linear functions and apply these objective functions to useful test cases, employing a mixture of convex constraints on geometry. We also provide comparisons on the ability to converge to global optima between this type of convex optimization problem to a nonconvex, sequential quadratic programming solver. - Convex optimization of thin airfoils using cubic splines (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA SciTech Forum (SCITECH), 1–15, Jan 2023. (National Harbor, MD)abstract
Convex optimization offers an attractive approach to the physical design of aerospace vehicles due to its ability to quickly identify global optima to relatively complex problems. Building on previous work, we show a significant improvement in optimal supersonic drag coefficient for airfoils parameterized by cubic splines, versus cubic polynomials. In particular we derive convex expressions for the supersonic lift and drag coefficients of thin airfoils expressed as cubic splines, as well as subsonic lift and moment coefficients for the same. We compare the results of globally optimal designs parameterized by cubic polynomials and cubic splines, showing improvements in drag performance of approximately 20-30%. These designs are computed typically in single digit seconds using commodity hardware and open source software tools. - Unifying effects of direct and relational associations for visual communication (doi, arXiv)
M.A. Schoenlein, J. Campos, K.J. Lande, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 29(1), 385–395, Jan 2023.abstract
People have expectations about how colors map to concepts in visualizations, and they are better at interpreting visualizations that match their expectations. Traditionally, studies on these expectations (inferred mappings) distinguished distinct factors relevant for visualizations of categorical vs. continuous information. Studies on categorical information focused on direct associations (e.g., mangos are associated with yellows) whereas studies on continuous information focused on relational associations (e.g., darker colors map to larger quantities; dark-is-more bias). We unite these two areas within a single framework of assignment inference. Assignment inference is the process by which people infer mappings between perceptual features and concepts represented in encoding systems. Observers infer globally optimal assignments by maximizing the "merit," or "goodness," of each possible assignment. Previous work on assignment inference focused on visualizations of categorical information. We extend this approach to visualizations of continuous data by (a) broadening the notion of merit to include relational associations and (b) developing a method for combining multiple (sometimes conflicting) sources of merit to predict people's inferred mappings. We developed and tested our model on data from experiments in which participants interpreted colormap data visualizations, representing fictitious data about environmental concepts (sunshine, shade, wild fire, ocean water, glacial ice). We found both direct and relational associations contribute independently to inferred mappings. These results can be used to optimize visualization design to facilitate visual communication.
2022
- Absolute stability via lifting and interpolation (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 6217–6223, Dec 2022. (Cancún, Mexico)abstract
We revisit the classical problem of absolute stability; assessing the robust stability of a given linear time-invariant (LTI) plant in feedback with a nonlinearity belonging to some given function class. Standard results typically take the form of sufficient conditions on the LTI plant, the least conservative of which are based on O'Shea--Zames--Falb multipliers. We present an alternative analysis based on lifting and interpolation that directly constructs a Lyapunov function that certifies absolute stability without resorting to frequency-domain inequalities or integral quadratic constraints. In particular, we use linear matrix inequalities to search over Lyapunov functions that are quadratic in the iterates and linear in the corresponding function values of the system in a lifted space. We show that our approach recovers state-of-the-art results on a set of benchmark problems. - Information structures of the Kalman filter for the elastic wave equation (doi)
J. Arbelaiz, E. Jensen, B. Bamieh, A.E. Hosoi, A. Jadbabaie, and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 1–6, Jul 2022. (Zürich, Switzerland)abstract
We study the Kalman Filter for the linear elastic wave equation over the real line with spatially distributed partial state measurements. The dynamics of the filter are described by a spatial convolution operator with asymptotic exponential spatial decay rate. This decay rate dictates how measurements from different spatial locations must be exchanged to implement the filter: faster spatial decay implies local measurements are more relevant and the filter is more "decentralized"; slower decay implies farther measurements also become relevant and the filter is more "centralized". Using dimensional analysis, we demonstrate that this decay rate is a function of one dimensionless group defined from system parameters, such as wave speed and noise variances. We find a critical value of such dimensionless group for which the Kalman Filter is completely decentralized. - A convex optimization approach to thin airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–15, Jun 2022. (Chicago, IL)abstract
Convex optimization techniques can find global optima in polynomial time for problems with convex objective functions and constraints. In this paper, we demonstrate that objective functions involving lift, drag, and moment coefficients may be represented as convex functions when modeled using thin-airfoil theory. Additionally, we describe a rich set of constraints that may be formulated using convex representations. Finally, we apply these methods to the conceptual design of airfoils defined by cubic polynomials. In doing so, we demonstrate the ability to rapidly obtain globally-optimal, up to the limits of parameterization and physical model fidelity, airfoil shapes that maximize supersonic lift-to-drag ratio, given constraints on subsonic lift and moment coefficients, as well as various geometric constraints. Furthermore, we show the ability to expand this method to a bi-level optimization problem, with a single nonconvex variable, identifying a global solution for problems involving the placement of an internal payload. These problems, with up to one thousand constraints, typically run in single-digit seconds on commodity hardware. - A universal decomposition for distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 6, 3044–3049, Jun 2022.abstract
In the distributed optimization problem for a multi-agent system, each agent knows a local function and must find a minimizer of the sum of all agents' local functions by performing a combination of local gradient evaluations and communicating information with neighboring agents. We prove that every distributed optimization algorithm can be factored into a centralized optimization method and a second-order consensus estimator, effectively separating the "optimization" and "consensus" tasks. We illustrate this fact by providing the decomposition for many recently proposed distributed optimization algorithms. Conversely, we prove that any optimization method that converges in the centralized setting can be combined with any second-order consensus estimator to form a distributed optimization algorithm that converges in the multi-agent setting. Finally, we describe how our decomposition may lead to a more systematic algorithm design methodology. - The analysis of optimization algorithms: A dissipativity approach (doi, arXiv)
L. Lessard. IEEE Control Systems Magazine, 42(3), 58–72, Jun 2022.abstract
Optimization problems in engineering and applied mathematics are typically solved in an iterative fashion, by systematically adjusting the variables of interest until an adequate solution is found. The iterative algorithms that govern these systematic adjustments can be viewed as a control system. In control systems, the output in measured and the input is adjusted using feedback to drive the error to zero. Similarly, in iterative algorithms, the optimization objective is evaluated and the candidate solution is adjusted to drive it toward the optimal point. Choosing an algorithm that works well for a variety of optimization problems is akin to robust controller design. Just as dissipativity theory can be used to analyze the stability properties of control systems, it can also be used to analyze the convergence properties of iterative algorithms. By defining an appropriate notion of "energy" that dissipates with every iteration of the algorithm, the convergence properties of the algorithm can be characterized. This article formalizes the connection between iterative algorithms and control systems and shows through examples how dissipativity theory can be used to analyze the performance of many classes of optimization algorithms. This control-theoretic viewpoint enables the selection and tuning of optimization algorithms to be performed in an automated and systematic way. - Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization (url, arXiv)
G. Zhang, Y. Wang, L. Lessard, and R. Grosse. International Conference on Artificial Intelligence and Statistics (AISTATS), 7659–7679, Mar 2022. (Valencia, Spain, virtual)abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result of any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates speed up GAN training significantly and the use of optimism only helps for simultaneous algorithms. - Context matters: A theory of semantic discriminability for perceptual encoding systems (doi, arXiv)
K. Mukherjee, B. Yin, B.E. Sherman, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 28(1), 697–706, Jan 2022.abstract
People's associations between colors and concepts influence their ability to interpret the meanings of colors in information visualizations. Previous work has suggested such effects are limited to concepts that have strong, specific associations with colors. However, although a concept may not be strongly associated with any colors, its mapping can be disambiguated in the context of other concepts in an encoding system. We articulate this view in semantic discriminability theory, a general framework for understanding conditions determining when people can infer meaning from perceptual features. Semantic discriminability is the degree to which observers can infer a unique mapping between visual features and concepts. Semantic discriminability theory posits that the capacity for semantic discriminability for a set of concepts is constrained by the difference between the feature-concept association distributions across the concepts in the set. We define formal properties of this theory and test its implications in two experiments. The results show that the capacity to produce semantically discriminable colors for sets of concepts was indeed constrained by the statistical distance between color-concept association distributions (Experiment 1). Moreover, people could interpret meanings of colors in bar graphs insofar as the colors were semantically discriminable, even for concepts previously considered "non-colorable" (Experiment 2). The results suggest that colors are more robust for visual communication than previously thought.
2021
- A unified analysis of first-order methods for smooth games via integral quadratic constraints (url, arXiv)
G. Zhang, X. Bao, L. Lessard, and R. Grosse. Journal of Machine Learning Research, 22(103), 1–39, Jun 2021.abstract
The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to get tight upper bounds of convergence rates. Using this framework, we recover the existing bound for the gradient method (GD), derive sharper bounds for the proximal point method (PPM) and optimistic gradient method (OG), and provide for the first time a global convergence rate for the negative momentum method (NM) with an iteration complexity O(kappa^1.5), which matches its known lower bound. In addition, for time-varying systems, we prove that the gradient method with optimal step size achieves the fastest provable worst-case convergence rate with quadratic Lyapunov functions. Finally, we further extend our analysis to stochastic games and study the impact of multiplicative noise on different algorithms. We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch (in contrast with the stochastic strongly-convex optimization setting, where such acceleration has been demonstrated). However, we exhibit an algorithm which achieves acceleration with two gradient queries per batch. - Toward a scalable upper bound for a CVaR-LQ problem (doi, arXiv)
M.P. Chapman and L. Lessard. IEEE Control Systems Letters, 6, 920–925, Jun 2021.abstract
We consider a linear-quadratic optimal control problem in discrete time with distributional ambiguity, where the random cumulative quadratic cost is assessed via the Conditional Value-at-Risk (CVaR) functional. We take steps toward deriving a scalable value iteration algorithm to upper-bound the solution to this problem. A remarkable finding is that the value functions depend on positive definite matrices, which are computed by a Riccati-like recursion. - An automatic system to detect equivalence between algorithms (arXiv)
S. Zhao, L. Lessard, and M. Udell.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize some algorithm transformations including permutations of the update equations, repetition of the iteration, and conjugation of some of the function oracles in the algorithm. To support the paper, we have developed a software package named Linnaeus that implements the framework to identify other iterative algorithms that are equivalent to an input algorithm. More broadly, this framework and software advances the goal of making mathematics searchable. - Analysis of biased stochastic gradient descent using sequential semidefinite programs (doi, arXiv)
B. Hu, P. Seiler, and L. Lessard. Mathematical Programming, 187(1), 384–408, May 2021.abstract
We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions. - Semantic discriminability for visual communication (doi, arXiv)
K.B. Schloss, Z. Leggon, and L. Lessard. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1022–1031, Feb 2021.abstract
To interpret information visualizations, observers must determine how visual features map onto concepts. First and foremost, this ability depends on perceptual discriminability; observers must be able to see the difference between different colors for those colors to communicate different meanings. However, the ability to interpret visualizations also depends on semantic discriminability, the degree to which observers can infer a unique mapping between visual features and concepts, based on the visual features and concepts alone (i.e., without help from verbal cues such as legends or labels). Previous evidence suggested that observers were better at interpreting encoding systems that maximized semantic discriminability (maximizing association strength between assigned colors and concepts while minimizing association strength between unassigned colors and concepts), compared to a system that only maximized color-concept association strength. However, increasing semantic discriminability also resulted in increased perceptual distance, so it is unclear which factor was responsible for improved performance. In the present study, we conducted two experiments that tested for independent effects of semantic distance and perceptual distance on semantic discriminability of bar graph data visualizations. Perceptual distance was large enough to ensure colors were more than just noticeably different. We found that increasing semantic distance improved performance, independent of variation in perceptual distance, and when these two factors were uncorrelated, responses were dominated by semantic distance. These results have implications for navigating trade-offs in color palette design optimization for visual communication.
2020
- Systematic analysis of distributed optimization algorithms over jointly-connected networks (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3096–3101, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the distributed optimization problem, where a group of agents work together to optimize a common objective by communicating with neighboring agents and performing local computations. Using tools from robust control, we develop a systematic analysis of a large class of distributed algorithms for solving this problem without using restrictive assumptions on the communication network. In particular, we assume only that the network is jointly connected over a finite time horizon (commonly referred to as B-connectivity), which does not require connectivity at each time instant. When applied to the distributed algorithm DIGing, our bounds are orders of magnitude tighter than those available in the literature. - Agent-level optimal LQG control of dynamically decoupled systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 5980–5985, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the problem of controlling a set of dynamically decoupled plants where the plants' subcontrollers communicate with each other according to a fixed and known network topology. We assume the communication to be instantaneous but there is a fixed processing delay associated with incoming transmissions. We provide explicit closed-form expressions for the optimal decentralized controller under these communication constraints and using standard LQG assumptions for the plants and cost function. Although this problem is convex, it is challenging due to the irrationality of continuous-time delays and the decentralized information-sharing pattern. We show that the optimal subcontrollers each have an observer-regulator architecture containing LTI and FIR blocks and we characterize the signals that subcontrollers should transmit to each other across the network. - Analysis and design of first-order distributed optimization algorithms over time-varying graphs (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. IEEE Transactions on Control of Network Systems, 7(4), 1597–1608, Dec 2020.abstract
This work concerns the analysis and design of distributed first-order optimization algorithms over time-varying graphs. The goal of such algorithms is to optimize a global function that is the average of local functions using only local computations and communications. Several different algorithms have been proposed that achieve linear convergence to the global optimum when the local functions are strongly convex. We provide a unified analysis that yields a worst-case linear convergence rate as a function of the condition number of the local functions, the spectral gap of the graph, and the parameters of the algorithm. The framework requires solving a small semidefinite program whose size is fixed; it does not depend on the number of local functions or the dimension of the domain. The result is a computationally efficient method for distributed algorithm analysis that enables the rapid comparison, selection, and tuning of algorithms. Finally, we propose a new algorithm, which we call SVL, that is easily implementable and achieves the fastest possible worst-case convergence rate among all algorithms in the family we considered. We support our theoretical analysis with numerical experiments that generate worst-case examples demonstrating the effectiveness of SVL. - The relation between color and spatial structure for interpreting colormap data visualizations (doi)
S.C. Sibrel, R. Rathore, L. Lessard, and K.B. Schloss. Journal of Vision, 20(12), 7, Nov 2020.abstract
Interpreting colormap visualizations requires determining how dimensions of color in visualizations map onto quantities in data. People have color-based biases that influence their interpretations of colormaps, such as a dark-is-more bias---darker colors map to larger quantities. Previous studies of color-based biases focused on colormaps with weak data spatial structure, but color-based biases may not generalize to colormaps with strong data spatial structure, like "hotspots" typically found in weather maps and neuroimaging brain maps. There may be a hotspot-is-more bias to infer that colors within hotspots represent larger quantities, which may override the dark-is-more bias. We tested this possibility in four experiments. Participants saw colormaps with hotspots and a legend that specified the color-quantity mapping. Their task was to indicate which side of the colormap depicted larger quantities (left/right). We varied whether the legend specified dark-more mapping or light-more mapping across trials and operationalized a dark-is-more bias as faster response time (RT) when the legend specified dark-more mapping. Experiment 1 demonstrated robust evidence for the dark-is-more bias, without evidence for a hotspot-is-more bias. Experiments 2 to 4 suggest that a hotspot-is-more bias becomes relevant when hotspots are a statistically reliable cue to "more" (i.e., the locus of larger quantities) and when hotspots are more perceptually pronounced. Yet, comparing conditions in which the hotspots were "more," RTs were always faster for dark hotspots than light hotspots. Thus, in the presence of strong spatial cues to the locus of larger quantities, color-based biases still influenced interpretations of colormap data visualizations. - Direct synthesis of iterative algorithms with bounds on achievable worst-case convergence rate (doi, arXiv)
L. Lessard and P. Seiler. American Control Conference (ACC), 119–125, Jul 2020. (Denver, CO, virtual)abstract
Iterative first-order methods such as gradient descent and its variants are widely used for solving optimization and machine learning problems. There has been recent interest in analytic or numerically efficient methods for computing worst-case performance bounds for such algorithms, for example over the class of strongly convex loss functions. A popular approach is to assume the algorithm has a fixed size (fixed dimension, or memory) and that its structure is parameterized by one or two hyperparameters, for example a learning rate and a momentum parameter. Then, a Lyapunov function is sought to certify robust stability and subsequent optimization can be performed to find optimal hyperparameter tunings. In the present work, we instead fix the constraints that characterize the loss function and apply techniques from robust control synthesis to directly search over algorithms. This approach yields stronger results than those previously available, since the bounds produced hold over algorithms with an arbitrary, but finite, amount of memory rather than just holding for algorithms with a prescribed structure. - Online data poisoning attacks (url, arXiv)
X. Zhang, X. Zhu, and L. Lessard. Conference on Learning for Dynamics and Control (L4DC), 201–210, Jun 2020. (Berkeley, CA, virtual)abstract
We study data poisoning attacks in the online setting where training items arrive sequentially, and the attacker may perturb the current item to manipulate online learning. Importantly, the attacker has no knowledge of future training items nor the data generating distribution. We formulate online data poisoning attack as a stochastic optimal control problem, and solve it with model predictive control and deep reinforcement learning. We also upper bound the suboptimality suffered by the attacker for not knowing the data generating distribution. Experiments validate our control approach in generating near-optimal attacks on both supervised and unsupervised learning tasks. - Estimating color-concept associations from image statistics (doi, arXiv)
R. Rathore, Z. Leggon, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 26(1), 1226–1235, Jan 2020.abstract
To interpret the meanings of colors in visualizations of categorical information, people must determine how distinct colors correspond to different concepts. This process is easier when assignments between colors and concepts in visualizations match people’s expectations, making color palettes semantically interpretable. Efforts have been underway to optimize color palette design for semantic interpretablity, but this requires having good estimates of human color-concept associations. Obtaining these data from humans is costly, which motivates the need for automated methods. We developed and evaluated a new method for automatically estimating color-concept associations in a way that strongly correlates with human ratings. Building on prior studies using Google Images, our approach operates directly on Google Image search results without the need for humans in the loop. Specifically, we evaluated several methods for extracting raw pixel content of the images in order to best estimate color-concept associations obtained from human ratings. The most effective method extracted colors using a combination of cylindrical sectors and color categories in color space. We demonstrate that our approach can accurately estimate average human color-concept associations for different fruits using only a small set of images. The approach also generalizes moderately well to more complicated recycling-related concepts of objects that can appear in any color.
2019
- Unified necessary and sufficient conditions for the robust stability of interconnected sector-bounded systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Conference on Decision and Control (CDC), 7690–7695, Dec 2019. (Nice, France)abstract
Classical conditions for ensuring the robust stability of a linear system in feedback with a sector-bounded nonlinearity include the circle, small gain, passivity, and conicity theorems. In this work, we present a similar stability condition, but expressed in terms of operators on a semi-inner product space. This increased generality leads to a clean result that can be specialized in a variety of ways. First, we show how to recover both sufficient and necessary-and-sufficient versions of the aforementioned classical results. Second, we show that suitably choosing the semi-inner product space leads to a new necessary and sufficient condition for exponential stability with a given convergence rate. Finally, in the spirit of classical robust stability analysis, we provide linear matrix inequalities that allow for the efficient verification of the conditions of our theorem. - Integral quadratic constraints: Exact convergence rates and worst-case trajectories (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 7677–7682, Dec 2019. (Nice, France)abstract
We consider a linear time-invariant system in discrete time where the state and input signals satisfy a set of integral quadratic constraints (IQCs). Analogous to the autonomous linear systems case, we define a new notion of spectral radius that exactly characterizes stability of this system. In particular, (i) when the spectral radius is less than one, we show that the system is asymptotically stable for all trajectories that satisfy the IQCs, and (ii) when the spectral radius is equal to one, we construct an unstable trajectory that satisfies the IQCs. Furthermore, we connect our new definition of the spectral radius to the existing literature on IQCs. - Explicit agent-level optimal cooperative controllers for dynamically decoupled systems with output feedback (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 8254–8259, Dec 2019. (Nice, France)abstract
We consider a dynamically decoupled network of agents each with a local output-feedback controller. We assume each agent is a node in a directed acyclic graph and the controllers share information along the edges in order to cooperatively optimize a global objective. We develop explicit state-space formulations for the jointly optimal networked controllers that highlight the role of the graph structure. Specifically, we provide generically minimal agent-level implementations of the local controllers along with intuitive interpretations of their states and the information that should be transmitted between them. - A distributed optimization algorithm over time-varying graphs with efficient gradient evaluations (doi, arXiv)
B. Van Scoy and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 357–362, Sep 2019. (Chicago, IL)abstract
We propose an algorithm for distributed optimization over time-varying communication networks. Our algorithm uses an optimized ratio between the number of rounds of communication and gradient evaluations to achieve fast convergence. The iterates converge to the global optimizer at the same rate as centralized gradient descent when measured in terms of the number of gradient evaluations while using the minimum number of communications to do so. Furthermore, the iterates converge at a near-optimal rate when measured in terms of the number of communication rounds. We compare our algorithm with several other known algorithms on a distributed target localization problem. - A canonical form for first-order distributed optimization algorithms (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 4075–4080, Jul 2019. (Philadelphia, PA)abstract
We consider the distributed optimization problem in which a network of agents aims to minimize the average of local functions. To solve this problem, several algorithms have recently been proposed wherein agents perform various combinations of communication with neighbors, local gradient computations, and updates to local state variables. In this paper, we present a canonical form that characterizes any first-order distributed algorithm that can be implemented using a single round of communication and gradient computation per iteration, and where each agent stores up to two state variables. The canonical form features a minimal set of parameters that are both unique and expressive enough to capture any distributed algorithm in this class. The generic nature of our canonical form enables the systematic analysis and design of distributed optimization algorithms. - An optimal control approach to sequential machine teaching (url, arXiv)
L. Lessard, X. Zhang, and X. Zhu. International Conference on Artificial Intelligence and Statistics (AISTATS), 2495–2503, Apr 2019. (Naha, Japan)abstract
Given a sequential learning algorithm and a target model, sequential machine teaching aims to find the shortest training sequence to drive the learning algorithm to the target model. We present the first principled way to find such shortest training sequences. Our key insight is to formulate sequential machine teaching as a time-optimal control problem. This allows us to solve sequential teaching by leveraging key theoretical and computational tools developed over the past 60 years in the optimal control community. Specifically, we study the Pontryagin Maximum Principle, which yields a necessary condition for optimality of a training sequence. We present analytic, structural, and numerical implications of this approach on a case study with a least-squares loss function and gradient descent learner. We compute optimal training sequences for this problem, and although the sequences seem circuitous, we find that they can vastly outperform the best available heuristics for generating training sequences.
2018
- Design considerations for the enhancement of human color vision by breaking binocular redundancy (doi, arXiv)
B. Gundlach, M. Frising, A. Shahsafi, G. Vershbow, C. Wan, J. Salman, B. Rokers, L. Lessard, and M. Kats. Scientific Reports, 8, 11971, Aug 2018.abstract
To see color, the human visual system combines the response of three types of cone cells in the retina—a compressive process that discards a significant amount of spectral information. Here, we present designs based on thin-film optical filters with the goal of enhancing human color vision by breaking its inherent binocular redundancy, providing different spectral content to each eye. We fabricated a set of optical filters that “splits” the response of the short-wavelength cone between the two eyes in individuals with typical trichromatic vision, simulating the presence of approximately four distinct cone types. Such an increase in the number of effective cone types can reduce the prevalence of metamers—pairs of distinct spectra that resolve to the same tristimulus values. This technique may result in an enhancement of spectral perception, with applications ranging from camouflage detection and anti-counterfeiting to new types of artwork and data visualization. - Dissipativity theory for accelerating stochastic variance reduction: A unified analysis of SVRG and Katyusha using semidefinite programs (url, arXiv)
B. Hu, S. Wright, and L. Lessard. International Conference on Machine Learning (ICML), 2038–2047, Jul 2018. (Stockholm, Sweden)abstract
Techniques for reducing the variance of gradient estimates used in stochastic programming algorithms for convex finite-sum problems have received a great deal of attention in recent years. By leveraging dissipativity theory from control, we provide a new perspective on two important variance-reduction algorithms: SVRG and its direct accelerated variant Katyusha. Our perspective provides a physically intuitive understanding of the behavior of SVRG-like methods via a principle of energy conservation. The tools discussed here allow us to automate the convergence analysis of SVRG-like methods by capturing their essential properties in small semidefinite programs amenable to standard analysis and computational techniques. Our approach recovers existing convergence results for SVRG and Katyusha and generalizes the theory to alternative parameter choices. We also discuss how our approach complements the linear coupling technique. Our combination of perspectives leads to a better understanding of accelerated variance-reduced stochastic methods for finite-sum problems. - Lyapunov functions for first-order methods: Tight automated convergence guarantees (url, arXiv)
A. Taylor, B. Van Scoy, and L. Lessard. International Conference on Machine Learning (ICML), 4897–4906, Jul 2018. (Stockholm, Sweden)abstract
We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori & Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)). - A robust accelerated optimization algorithm for strongly convex functions (doi, arXiv)
S. Cyrus, B. Hu, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 1376–1381, Jun 2018. (Miwaukee, WI)abstract
This work proposes an accelerated first-order algorithm we call the Robust Momentum Method for optimizing smooth strongly convex functions. The algorithm has a single scalar parameter that can be tuned to trade off robustness to gradient noise versus worst-case convergence rate. At one extreme, the algorithm is faster than Nesterov's Fast Gradient Method by a constant factor but more fragile to noise. At the other extreme, the algorithm reduces to the Gradient Method and is very robust to noise. The algorithm design technique is inspired by methods from classical control theory and the resulting algorithm has a simple analytical form. Algorithm performance is verified on a series of numerical simulations in both noise-free and relative gradient noise cases. - Color inference in visual communication: The meaning of colors in recycling (doi)
K.B. Schloss, L. Lessard, C.S. Walmsley, and K. Foley. Cognitive Research: Principles and Implications, 3(5), 1–17, Feb 2018.abstract
People interpret abstract meanings from colors, which makes color a useful perceptual feature for visual communication. This process is complicated, however, because there is seldom a one-to-one correspondence between colors and meanings. One color can be associated with many different concepts (one-to-many mapping) and many colors can be associated with the same concept (many-to-one mapping). We propose that to interpret color-coding systems, people perform assignment inference to determine how colors map onto concepts. We studied assignment inference in the domain of recycling. Participants saw images of colored but unlabeled bins and were asked to indicate which bins they would use to discard different kinds of recyclables and trash. In Experiment 1, we tested two hypotheses for how people perform assignment inference. The local assignment hypothesis predicts that people simply match objects with their most strongly associated color. The global assignment hypothesis predicts that people also account for the association strengths between all other objects and colors within the scope of the color-coding system. Participants discarded objects in bins that optimized the color-object associations of the entire set, which is consistent with the global assignment hypothesis. This sometimes resulted in discarding objects in bins whose colors were weakly associated with the object, even when there was a stronger associated option available. In Experiment 2, we tested different methods for encoding color-coding systems and found that people were better at assignment inference when color sets simultaneously maximized the association strength between assigned color-object parings while minimizing associations between unassigned pairings. Our study provides an approach for designing intuitive color-coding systems that facilitate communication through visual media such as graphs, maps, signs, and artifacts.
2017
- Robust convergence analysis of distributed optimization algorithms (doi)
A. Sundararajan, B. Hu, and L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 1206–1212, Oct 2017. (Monticello IL)abstract
We present a unified framework for analyzing the convergence of distributed optimization algorithms by formulating a semidefinite program (SDP) which can be efficiently solved to bound the linear rate of convergence. Two different SDP formulations are considered. First, we formulate an SDP that depends explicitly on the gossip matrix of the network graph. This result provides bounds that depend explicitly on the graph topology, but the SDP dimension scales with the size of the graph. Second, we formulate an SDP that depends implicitly on the gossip matrix via its spectral gap. This result provides coarser bounds, but yields a small SDP that is independent of graph size. Our approach improves upon existing bounds for the algorithms we analyzed, and numerical simulations reveal that our bounds are likely tight. The efficient and automated nature of our analysis makes it a powerful tool for algorithm selection and tuning, and for the discovery of new algorithms as well. - Dissipativity theory for Nesterov’s accelerated method (url, arXiv)
B. Hu and L. Lessard. International Conference on Machine Learning (ICML), 1549–1557, Aug 2017. (Sydney, Australia)abstract
In this paper, we adapt the control theoretic concept of dissipativity theory to provide a natural understanding of Nesterov’s accelerated method. Our theory ties rigorous convergence rate analysis to the physically intuitive notion of energy dissipation. Moreover, dissipativity allows one to efficiently construct Lyapunov functions (either numerically or analytically) by solving a small semidefinite program. Using novel supply rate functions, we show how to recover known rate bounds for Nesterov’s method and we generalize the approach to certify both linear and sublinear rates in a variety of settings. Finally, we link the continuous-time version of dissipativity to recent works on algorithm analysis that use discretizations of ordinary differential equations. - Modeling color preference using color space metrics (doi)
K.B. Schloss, L. Lessard, C. Racey, and A.C. Hurlbert. Vision Research, 151, 99–116, Jul 2017.abstract
Studying color preferences provides a means to discover how perceptual experiences map onto cognitive and affective judgments. A challenge is finding a parsimonious way to describe and predict patterns of color preferences, which are complex with rich individual differences. One approach has been to model color preferences using factors from metric color spaces to establish direct correspondences between dimensions of color and preference. Prior work established that substantial, but not all, variance in color preferences could be captured by weights on color space dimensions using multiple linear regression. The question we address here is whether model fits may be improved by using different color metric specifications. We therefore conducted a large-scale analysis of color space models, and focused in-depth analysis on models that differed in color space (cone-contrast vs. CIELAB), coordinate system within the color space (Cartesian vs. cylindrical), and factor degrees (1st degree only, or 1st and 2nd degree). We used k-fold cross validation to avoid over-fitting the data and to ensure fair comparisons across models. The best model was the 2nd-harmonic Lch model ("LabC Cyl2"). Specified in CIELAB space, it included 1st and 2nd harmonics of hue (capturing opponency in hue preferences and simultaneous liking/disliking of both hues on an opponent axis, respectively), lightness, and chroma. These modeling approaches can be used to characterize and compare patterns for group averages and individuals in future datasets on color preference, or other measures in which correspondences between color appearance and cognitive or affective judgments may exist. - Exponential stability analysis via integral quadratic constraints (arXiv)
R. Boczar, L. Lessard, A. Packard, and B. Recht.abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. This work presents a generalization of the classical IQC results of Megretski and Rantzer [19] that leads to a tractable computational procedure for finding exponential rate certificates that are far less conservative than ones computed from L2 gain bounds alone. An expanded library of IQCs for certifying exponential stability is also provided and the effectiveness of the technique is demonstrated via numerical examples. - Control interpretations for first-order optimization methods (doi, arXiv)
B. Hu and L. Lessard. American Control Conference (ACC), 3114–3119, May 2017. (Seattle, WA)abstract
First-order iterative optimization methods play a fundamental role in large scale optimization and machine learning. This paper presents control interpretations for such optimization methods. First, we give loop-shaping interpretations for several existing optimization methods and show that they are composed of basic control elements such as PID and lag compensators. Next, we apply the small gain theorem to draw a connection between the convergence rate analysis of optimization methods and the input-output gain computations of certain complementary sensitivity functions. These connections suggest that standard classical control synthesis tools may be brought to bear on the design of optimization algorithms.
2016
- Convexity of decentralized controller synthesis (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 61(10), 3122–3127, Oct 2016.abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the set of Youla parameters is convex. In this paper, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. We further show that under additional assumptions, quadratic invariance is necessary and sufficient for the set of achievable closed-loop maps to be convex. We give two versions of our results. The first applies to bounded linear operators on a Banach space and the second applies to (possibly unstable) causal LTI systems in discrete or continuous time. - Analysis and design of optimization algorithms via integral quadratic constraints (doi, arXiv)
L. Lessard, B. Recht, and A. Packard. SIAM Journal on Optimization, 26(1), 57–95, Jan 2016.abstract
This paper develops a new framework to analyze and design iterative optimization algorithms built on the notion of integral quadratic constraints (IQCs) from robust control theory. IQCs provide sufficient conditions for the stability of complicated interconnected systems, and these conditions can be checked by semidefinite programming. We discuss how to adapt IQC theory to study optimization algorithms, proving new inequalities about convex functions and providing a version of IQC theory adapted for use by optimization researchers. Using these inequalities, we derive numerical upper bounds on convergence rates for the Gradient method, the Heavy-ball method, Nesterov's accelerated method, and related variants by solving small, simple semidefinite programming problems. We also briefly show how these techniques can be used to search for optimization algorithms with desired performance characteristics, establishing a new methodology for algorithm design.
2015
- Exponential convergence bounds using integral quadratic constraints (doi, arXiv)
R. Boczar, L. Lessard, and B. Recht. IEEE Conference on Decision and Control (CDC), 7516–7521, Dec 2015. (Osaka, Japan)abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. In this work, we present a modification of the classical IQC results of Megretski and Rantzer that leads to a tractable computational procedure for finding exponential rate certificates. We demonstrate the effectiveness of our method via a numerical example. - Compositional performance certification of interconnected systems using ADMM (doi, arXiv)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. Automatica, 61, 55–63, Nov 2015.abstract
A compositional performance certification method is presented for interconnected systems using subsystem dissipativity properties and the interconnection structure. A large-scale optimization problem is formulated to search for the most relevant dissipativity properties. The alternating direction method of multipliers (ADMM) is employed to decompose and solve this problem, and is demonstrated on several examples. - Optimal control for LQG systems on graphs—Part I: structural results (arXiv)
A. Nayyar and L. Lessard.abstract
In this two-part paper, we identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple intuitive estimation structure and can be computed efficiently. Roughly, we consider the class of systems for which the coupling of dynamics among subsystems and the inter-controller communication is characterized by the same directed graph. Furthermore, this graph is assumed to be a multitree, that is, its transitive reduction can have at most one directed path connecting each pair of nodes. In this first part, we derive sufficient statistics that may be used to aggregate each controller's growing available information. Each controller must estimate the states of the subsystems that it affects (its descendants) as well as the subsystems that it observes (its ancestors). The optimal control action for a controller is a linear function of the estimate it computes as well as the estimates computed by all of its ancestors. Moreover, these state estimates may be updated recursively, much like a Kalman filter. - Optimal decentralized state-feedback control with sparsity and delays (doi, arXiv)
A. Lamperski and L. Lessard. Automatica, 58, 143–151, Aug 2015.abstract
This work presents the solution to a class of decentralized linear quadratic state-feedback control problems, in which the plant and controller must satisfy the same combination of delay and sparsity constraints. Using a novel decomposition of the noise history, the control problem is split into independent subproblems that are solved using dynamic programming. The approach presented herein both unifies and generalizes many existing results. - Optimal control of two-player systems with output feedback (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 60(8), 2129–2144, Aug 2015.abstract
In this article, we consider a fundamental decentralized optimal control problem, which we call the two-player problem. Two subsystems are interconnected in a nested information pattern, and output feedback controllers must be designed for each subsystem. Several special cases of this architecture have previously been solved, such as the state-feedback case or the case where the dynamics of both systems are decoupled. In this paper, we present a detailed solution to the general case. The structure of the optimal decentralized controller is reminiscent of that of the optimal centralized controller; each player must estimate the state of the system given their available information and apply static control policies to these estimates to compute the optimal controller. The previously solved cases benefit from a separation between estimation and control which allows one to compute the control and estimation gains separately. This feature is not present in general, and some of the gains must be solved for simultaneously. We show that computing the required coupled estimation and control gains amounts to solving a small system of linear equations. - A general analysis of the convergence of ADMM (url, arXiv)
R. Nishihara, L. Lessard, B. Recht, A. Packard, and M.I. Jordan. International Conference on Machine Learning (ICML), 343–352, Jul 2015. (Lille, France)abstract
We provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex. Our proof is based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014), reducing algorithm convergence to verifying the stability of a dynamical system. This approach generalizes a number of existing results and obviates any assumptions about specific choices of algorithm parameters. On a numerical example, we demonstrate that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances. We complement our upper bound by constructing a nearly-matching lower bound on the worst-case rate of convergence. - Structural results for partially nested LQG systems over graphs (doi)
A. Nayyar and L. Lessard. American Control Conference (ACC), 5457–5464, Jul 2015. (Chicago, IL)abstract
We identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple and intuitive estimation structure. We consider cases for which the coupling of dynamics among subsystems and the inter-controller communication are characterized by the same directed graph. For the class of graphs known as multitrees, we show that each controller need only estimate the states of the subsystems it affects (its descendants) as well as the subsystems it observes (its ancestors). The optimal control action for each controller is a linear function of the estimate it computes and the estimates computed by its ancestors. Moreover, all state estimates may be updated recursively, much like a Kalman filter.
2014
- Performance certification of interconnected nonlinear systems using ADMM (doi)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. IEEE Conference on Decision and Control (CDC), 5131–5136, Dec 2014. (Los Angeles, CA)abstract
We present a compositional performance certification method for interconnected systems, using dissipativity properties of the subsystems along with the interconnection structure. To select the most relevant dissipativity properties, we formulate a large-scale optimization problem, and employ the alternating direction method of multipliers (ADMM) for its solution. The dissipativity properties are allowed to depend on an unknown equilibrium, enabling us to certify performance without explicit knowledge of the equilibrium for the interconnection. The effectiveness of the algorithm is demonstrated on two examples, including a model of vehicle platoons. - State-space solution to a minimum-entropy H-infinity optimal control problem with a nested information constraint (doi, arXiv)
L. Lessard. IEEE Conference on Decision and Control (CDC), 4026–4031, Dec 2014. (Los Angeles, CA)abstract
State-space formulas are derived for the minimum-entropy H-infinity controller when the plant and controller are constrained to be block-lower-triangular. Such a controller exists if and only if: the corresponding unstructured problem has a solution, a certain pair of coupled algebraic Riccati equations admits a mutually stabilizing fixed point, and a pair of spectral radius conditions is met. The controller's observer-based structure is also discussed, and a simple numerical approach for solving the coupled Riccati equations is presented. - An algebraic approach to the control of decentralized systems (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Control of Network Systems, 1(4), 308–317, Dec 2014.abstract
Optimal decentralized controller design is notoriously difficult, but recent research has identified large subclasses of such problems that may be convexified and thus are amenable to solution via efficient numerical methods. One recently discovered sufficient condition for convexity is \emph{quadratic invariance} (QI). Despite the simple algebraic characterization of QI, which relates the plant and controller maps, proving convexity of the set of achievable closed-loop maps requires tools from functional analysis. In this work, we present a new formulation of quadratic invariance that is purely algebraic. While our results are similar in flavor to those from traditional QI theory, they do not follow from that body of work. Furthermore, they are applicable to new types of systems that are difficult to treat using functional analysis. Examples discussed include rational transfer matrices, systems with delays, and multidimensional systems. - Performance certification of interconnected systems using decomposition techniques (doi)
C. Meissen, L. Lessard, and A. Packard. American Control Conference (ACC), 5030–5036, Jun 2014. (Portland, OR)abstract
We propose a unified framework for the certification of stability and input-output performance of interconnected dynamical systems. In our approach, we seek local dissipativity certificates for each subsystem such that when they are combined, the performance of the entire interconnected system is certified. We also demonstrate, by the use of numerical simulations, that the Alternating Direction Method of Multipliers (ADMM) is a promising computational approach for solving such problems.
2013
- Structural results and explicit solution for two-player LQG systems on a finite time horizon (doi, arXiv)
L. Lessard and A. Nayyar. IEEE Conference on Decision and Control (CDC), 6542–6549, Dec 2013. (Florence, Italy)abstract
It is well-known that linear dynamical systems with Gaussian noise and quadratic cost (LQG) satisfy a separation principle. Finding the optimal controller amounts to solving separate dual problems; one for control and one for estimation. For the discrete-time finite-horizon case, each problem is a simple forward or backward recursion. In this paper, we consider a generalization of the LQG problem in which there are two controllers. Each controller is responsible for one of two system inputs, but has access to different subsets of the available measurements. Our paper has three main contributions. First, we prove a fundamental structural result: sufficient statistics for the controllers can be expressed as conditional means of the global state. Second, we give explicit state-space formulae for the optimal controller. These formulae are reminiscent of the classical LQG solution with dual forward and backward recursions, but with the important difference that they are intricately coupled. Lastly, we show how these recursions can be solved efficiently, with computational complexity comparable to that of the centralized problem. - A separation principle for decentralized state-feedback optimal control (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 528–534, Oct 2013. (Monticello, IL)abstract
A cooperative control problem is considered in which dynamically decoupled subsystems must control their own states through state feedback in order to optimize a global quadratic cost. The states of the subsystems are coupled only through the cost function and correlated external disturbances. The architecture is truly decentralized; no communication between subsystems or their controllers is permitted. The main result of this paper is that the optimal decentralized controller satisfies a new separation principle that is strikingly similar to the celebrated result from centralized optimal control theory, but does not appear to follow from it. Roughly speaking, the optimal decentralized control strategy for each subsystem is the product of a static control gain and a global state estimate, and each can be separately computed. - On structured realizability and stabilizability of linear systems (doi, arXiv)
L. Lessard, M. Kristalny, and A. Rantzer. American Control Conference (ACC), 5694–5700, Jun 2013. (Washington, D.C.)abstract
We study the notion of structured realizability for linear systems dened over graphs. A stabilizable and detectable realization is structured if the state-space matrices inherit the sparsity pattern of the adjacency matrix of the associated graph. In this paper, we demonstrate that not every structured transfer matrix has a structured realization and we reveal the practical meaning of this fact. We also uncover a close connection between the structured realizability of a plant and whether the plant can be stabilized by a structured controller. In particular, we show that a structured stabilizing controller can only exist when the plant admits a structured realization. Finally, we give a parameterization of all structured stabilizing controllers and show that they always have structured realizations.
2012
- Decentralized LQG control of systems with a broadcast architecture (doi)
L. Lessard. IEEE Conference on Decision and Control (CDC), 6241–6246, Dec 2012. (Maui, HI)abstract
In this paper, we consider dynamical subsystems interconnected in a broadcast architecture. In the broadcast-out case, the root node can affect several leaf nodes, but the leaf nodes do not affect any other nodes. Each subsystem is locally controlled via output feedback, and the controllers can communicate according to a structure that parallels the dynamic coupling between subsystems. Explicit state-space realizations for the optimal controllers are derived using a spectral factorization approach. An interpretation of the controller states is also provided in terms of optimal state estimators. We also address the dual broadcast-in case, where there is a single leaf node affected by multiple root nodes. - Optimal control of a fully decentralized quadratic regulator (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 48–54, Oct 2012. (Monticello, IL)abstract
In this paper, we consider a fully decentralized control problem with two dynamically decoupled agents. The objective is to design a state-feedback controller for each agent such that a global quadratic cost is minimized. No communication, explicit or implicit, is permitted between the agents. However, the performance of the agents is coupled via the cost function as well as the process noise. We provide an explicit state-space construction of the optimal controllers, showing that the optimal controllers are dynamic, where the number of states depends on the joint covariance matrix of the process noise. The key step is a novel decomposition of the noise covariance matrix, which allows the convex program associated with the controller synthesis to be split into simpler problems and thereby solved. - Optimal state-feedback control under sparsity and delay constraints (doi)
A. Lamperski and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 204–209, Sep 2012. (Santa Barbara, CA)abstract
This paper presents the solution to a general decentralized state-feedback problem, in which the plant and controller must satisfy the same combination of delay constraints and sparsity constraints. The control problem is decomposed into independent subproblems, which are solved by dynamic programming. In special cases with only sparsity or only delay constraints, the controller reduces to existing solutions. - Optimal controller synthesis for the decentralized two-player problem with output feedback (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 6314–6321, Jun 2012. (Montréal, Canada)abstract
In this paper, we present a controller synthesis algorithm for a decentralized control problem. We consider an architecture in which there are two interconnected linear subsystems. Both controllers seek to optimize a global quadratic cost, despite having access to different subsets of the available measurements. Many special cases of this problem have previously been solved, most notably the state-feedback case. The generalization to output-feedback is nontrivial, as the classical separation principle does not hold. Herein, we present the first explicit state-space realization for an optimal controller for the general two-player problem.
2011 and earlier
- A state-space solution to the two-player decentralized optimal control problem (doi)
L. Lessard and S. Lall. Allerton Conference on Communication, Control, and Computing (Allerton), 1559–1564, Sep 2011. (Monticello, IL)abstract
In this paper, we present an explicit state-space solution to the two-player decentralized optimal control problem. In this problem, there are two interconnected linear systems that seek to optimize a global quadratic cost. Both controllers perform output feedback, but they have access to different subsets of the available measurements. The optimal controller, which was not previously known, has a state dimension equal to twice the state dimension of the original system. - Tractability of complex control systems (url)
L. Lessard. Ph.D. Thesis, Stanford University, Aug 2011.abstract
This thesis is divided into two main parts. In the first part, we consider the problem of efficiently computing wavefront estimates for use in adaptive optics hardware on ground-based telescopes. Our contribution is a warm-started single-iteration multigrid algorithm that performs as well as conventional vector-matrix-multiplication methods, but at a fraction of the computational cost. We used numerical simulations to compare our algorithm to a variety of other published methods, and validated our findings at the Palomar Observatory. In the second part, we consider feedback control subject to an information constraint. Using a novel algebraic framework, we are able to prove many structural results, including a new convexity result, in a natural and purely algebraic way. We also develop a new condition we call internal quadratic invariance, under which the controller synthesis can be cast as a convex optimization problem. This describes the most general class of tractable decentralized control problems known to date. Both parts of the thesis fit into the broader question of tractability of complex systems. In the first part we look at a practical example which is difficult because of the large number of sensors and actuators. In the second part, we look at decentralized control, which is difficult because of the non-classical information constraint. - Quadratic invariance is necessary and sufficient for convexity (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5360–5362, Jun 2011. (San Francisco, CA)abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the the set of Youla parameters is convex. In this short note, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. - An algebraic framework for quadratic invariance (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 2698–2703, Dec 2010. (Atlanta, GA)abstract
In this paper, we present a general algebraic framework for analysing decentralized control systems. We consider systems defined by linear fractional functions over a commutative ring. This provides a general algebraic formulation and proof of the main results of quadratic invariance, as well as naturally covering rational multivariable systems, systems with delays, and multidimensional systems. The approach extends to the extended class of internally quadratically invariant systems. - Internal quadratic invariance and decentralized control (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5596–5601, Jun 2010. (Baltimore, MD)abstract
For decentralized control systems with quadratically invariant information constraints, the set of achievable closed-loop maps is affine, and thus the associated minimum-norm controller synthesis problem is amenable to a convex programming approach. In this paper, we show that a strictly broader class of problems we call internally quadratically invariant, also yields an affine set of achievable closed-loop maps. We treat systems represented by rational as well as proper rational transfer functions and present an illustrative example. - Reduction of decentralized control problems to tractable representations (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 1621–1626, Dec 2009. (Shanghai, China)abstract
For decentralized control problems with quadratically invariant information constraints, the optimal controller may be found efficiently. In this paper, we show that there are systems which are not quadratically invariant but reduce to systems that are. We call the requisite property internal quadratic invariance. We present an associated reduction procedure, and illustrate our method with examples. - Experimental validation of single-iteration multigrid wavefront reconstruction at the Palomar Observatory (doi)
L. Lessard, D. MacMynowski, M. West, A. Bouchez, and S. Lall. Optics Letters, 33(18), 2047–2049, Sep 2008.abstract
Single-iteration multigrid (SIMG) wavefront reconstruction schemes were implemented and validated on the adaptive optics system at the Hale 5.1m telescope at the Palomar Observatory. Results indicate that even the simplest such method produces a performance indistinguishable from that of the standard least-squares reconstructor for both bright and dim guide stars. SIMG provides a dramatic reduction in computational cost when compared to vector–matrix multiplication and can be implemented in parallel, making it the obvious choice for reconstruction in future large-scale adaptive optics systems. - Warm-started wavefront reconstruction for adaptive optics (doi)
L. Lessard, M. West, D. MacMynowski, and S. Lall. Journal of the Optical Society of America A, 25(5), 1147–1155, May 2008.abstract
Future extreme adaptive optics (ExAO) systems have been suggested with up to 105 sensors and actuators. We analyze the computational speed of iterative reconstruction algorithms for such large systems. We compare a total of 15 different scalable methods, including multigrid, preconditioned conjugate-gradient, and several new variants of these. Simulations on a 128×128 square sensor/actuator geometry using Taylor frozen-flow dynamics are carried out using both open-loop and closed-loop measurements, and algorithms are compared on a basis of the mean squared error and floating-point multiplications required. We also investigate the use of warm starting, where the most recent estimate is used to initialize the iterative scheme. In open-loop estimation or pseudo-open-loop control, warm starting provides a significant computational speedup; almost every algorithm tested converges in one iteration. In a standard closed-loop implementation, using a single iteration per time step, most algorithms give the minimum error even in cold start, and every algorithm gives the minimum error if warm started. The best algorithm is therefore the one with the smallest computational cost per iteration, not necessarily the one with the best quasi-static performance.
Robust Control and Optimization
- The fastest known first-order method for minimizing twice continuously differentiable smooth strongly convex functions (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 9, 655–660, Jun 2025.abstract
We consider iterative gradient-based optimization algorithms applied to functions that are smooth and strongly convex. The fastest globally convergent algorithm for this class of functions is the Triple Momentum (TM) method. We show that if the objective function is also twice continuously differentiable, a new, faster algorithm emerges, which we call C²-Momentum (C2M). We prove that C2M is globally convergent and that its worst-case convergence rate is strictly faster than that of TM, with no additional computational cost. We validate our theoretical findings with numerical examples, demonstrating that C2M outperforms TM when the objective function is twice continuously differentiable. - Algebraic characterization of equivalence between optimization algorithms (arXiv)
L. Lessard and M. Udell. Preprint.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize equivalence between algorithms that use different oracles that are related via a linear fractional transformation. Prominent examples include linear transformations and function conjugation. - A tutorial on the structure of distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3009–3014, Dec 2023. (Singapore)abstract
We consider the distributed optimization problem for a multi-agent system. Here, multiple agents cooperatively optimize an objective by sharing information through a communication network and performing computations. In this tutorial, we provide an overview of the problem, describe the structure of its algorithms, and use simulations to illustrate some algorithmic properties based on this structure. - Automated Lyapunov analysis of primal-dual optimization algorithms: An interpolation approach (doi, arXiv)
B. Van Scoy, J.W. Simpson-Porco, and L. Lessard. IEEE Conference on Decision and Control (CDC), 1306–1311, Dec 2023. (Singapore)abstract
Primal-dual algorithms are frequently used for iteratively solving large-scale convex optimization problems. The analysis of such algorithms is usually done on a case-by-case basis, and the resulting guaranteed rates of convergence can be conservative. Here we consider a class of first-order algorithms for linearly constrained convex optimization problems, and provide a linear matrix inequality (LMI) analysis framework for certifying worst-case exponential convergence rates. Our approach builds on recent results for interpolation of convex functions and linear operators, and our LMI directly constructs a Lyapunov function certifying the guaranteed convergence rate. By comparing to rates established in the literature, we show that our approach can certify significantly faster convergence for this family of algorithms. - A tutorial on a Lyapunov-based approach to the analysis of iterative optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3003–3008, Dec 2023. (Singapore)abstract
Iterative gradient-based optimization algorithms are widely used to solve difficult or large-scale optimization problems. There are many algorithms to choose from, such as gradient descent and its accelerated variants such as Polyak's Heavy Ball method or Nesterov's Fast Gradient method. It has long been observed that iterative algorithms can be viewed as dynamical systems, and more recently, as robust controllers. Here, the "uncertainty" in the dynamics is the gradient of the function being optimized. Therefore, worst-case or average-case performance can be analyzed using tools from robust control theory, such as integral quadratic constraints (IQCs). In this tutorial paper, we show how such an analysis can be carried out using an alternative Lyapunov-based approach. This approach recovers the same performance bounds as with IQCs, but with the added benefit of constructing a Lyapunov function. - Absolute stability via lifting and interpolation (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 6217–6223, Dec 2022. (Cancún, Mexico)abstract
We revisit the classical problem of absolute stability; assessing the robust stability of a given linear time-invariant (LTI) plant in feedback with a nonlinearity belonging to some given function class. Standard results typically take the form of sufficient conditions on the LTI plant, the least conservative of which are based on O'Shea--Zames--Falb multipliers. We present an alternative analysis based on lifting and interpolation that directly constructs a Lyapunov function that certifies absolute stability without resorting to frequency-domain inequalities or integral quadratic constraints. In particular, we use linear matrix inequalities to search over Lyapunov functions that are quadratic in the iterates and linear in the corresponding function values of the system in a lifted space. We show that our approach recovers state-of-the-art results on a set of benchmark problems. - A universal decomposition for distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 6, 3044–3049, Jun 2022.abstract
In the distributed optimization problem for a multi-agent system, each agent knows a local function and must find a minimizer of the sum of all agents' local functions by performing a combination of local gradient evaluations and communicating information with neighboring agents. We prove that every distributed optimization algorithm can be factored into a centralized optimization method and a second-order consensus estimator, effectively separating the "optimization" and "consensus" tasks. We illustrate this fact by providing the decomposition for many recently proposed distributed optimization algorithms. Conversely, we prove that any optimization method that converges in the centralized setting can be combined with any second-order consensus estimator to form a distributed optimization algorithm that converges in the multi-agent setting. Finally, we describe how our decomposition may lead to a more systematic algorithm design methodology. - The analysis of optimization algorithms: A dissipativity approach (doi, arXiv)
L. Lessard. IEEE Control Systems Magazine, 42(3), 58–72, Jun 2022.abstract
Optimization problems in engineering and applied mathematics are typically solved in an iterative fashion, by systematically adjusting the variables of interest until an adequate solution is found. The iterative algorithms that govern these systematic adjustments can be viewed as a control system. In control systems, the output in measured and the input is adjusted using feedback to drive the error to zero. Similarly, in iterative algorithms, the optimization objective is evaluated and the candidate solution is adjusted to drive it toward the optimal point. Choosing an algorithm that works well for a variety of optimization problems is akin to robust controller design. Just as dissipativity theory can be used to analyze the stability properties of control systems, it can also be used to analyze the convergence properties of iterative algorithms. By defining an appropriate notion of "energy" that dissipates with every iteration of the algorithm, the convergence properties of the algorithm can be characterized. This article formalizes the connection between iterative algorithms and control systems and shows through examples how dissipativity theory can be used to analyze the performance of many classes of optimization algorithms. This control-theoretic viewpoint enables the selection and tuning of optimization algorithms to be performed in an automated and systematic way. - Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization (url, arXiv)
G. Zhang, Y. Wang, L. Lessard, and R. Grosse. International Conference on Artificial Intelligence and Statistics (AISTATS), 7659–7679, Mar 2022. (Valencia, Spain, virtual)abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result of any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates speed up GAN training significantly and the use of optimism only helps for simultaneous algorithms. - The speed-robustness trade-off for first-order methods with additive gradient noise (arXiv)
B. Van Scoy and L. Lessard. Preprint.abstract
We study the trade-off between convergence rate and sensitivity to stochastic additive gradient noise for first-order optimization methods. Ordinary Gradient Descent (GD) can be made fast-and-sensitive or slow-and-robust by increasing or decreasing the stepsize, respectively. However, it is not clear how such a trade-off can be navigated when working with accelerated methods such as Polyak's Heavy Ball (HB) or Nesterov's Fast Gradient (FG) methods, or whether any of these methods can achieve an optimal trade-off. We consider three classes of functions: (1) strongly convex quadratics, (2) smooth strongly convex functions, and (3) nonconvex functions that satisfy a weak notion of strong convexity. For each function class, we present a tractable way to compute convergence rate and sensitivity to additive gradient noise for a broad family of first-order methods, and we present near-Pareto-optimal algorithm designs. Each design consists of a simple analytic update rule with two states of memory, similar to HB and FG. Moreover, each design has a scalar tuning parameter that explicitly trades off convergence rate and sensitivity to additive gradient noise. When tuned as aggressively as possible, our proposed algorithms recover the algorithms with fastest-known convergence rates for each function class. When tuned to be more robust, our algorithms are novel and provide a practical way to control noise sensitivity while maintaining the fastest possible convergence rate. We validate the performance and near-optimality of our designs through numerous numerical simulations. - A unified analysis of first-order methods for smooth games via integral quadratic constraints (url, arXiv)
G. Zhang, X. Bao, L. Lessard, and R. Grosse. Journal of Machine Learning Research, 22(103), 1–39, Jun 2021.abstract
The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to get tight upper bounds of convergence rates. Using this framework, we recover the existing bound for the gradient method (GD), derive sharper bounds for the proximal point method (PPM) and optimistic gradient method (OG), and provide for the first time a global convergence rate for the negative momentum method (NM) with an iteration complexity O(kappa^1.5), which matches its known lower bound. In addition, for time-varying systems, we prove that the gradient method with optimal step size achieves the fastest provable worst-case convergence rate with quadratic Lyapunov functions. Finally, we further extend our analysis to stochastic games and study the impact of multiplicative noise on different algorithms. We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch (in contrast with the stochastic strongly-convex optimization setting, where such acceleration has been demonstrated). However, we exhibit an algorithm which achieves acceleration with two gradient queries per batch. - Toward a scalable upper bound for a CVaR-LQ problem (doi, arXiv)
M.P. Chapman and L. Lessard. IEEE Control Systems Letters, 6, 920–925, Jun 2021.abstract
We consider a linear-quadratic optimal control problem in discrete time with distributional ambiguity, where the random cumulative quadratic cost is assessed via the Conditional Value-at-Risk (CVaR) functional. We take steps toward deriving a scalable value iteration algorithm to upper-bound the solution to this problem. A remarkable finding is that the value functions depend on positive definite matrices, which are computed by a Riccati-like recursion. - An automatic system to detect equivalence between algorithms (arXiv)
S. Zhao, L. Lessard, and M. Udell.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize some algorithm transformations including permutations of the update equations, repetition of the iteration, and conjugation of some of the function oracles in the algorithm. To support the paper, we have developed a software package named Linnaeus that implements the framework to identify other iterative algorithms that are equivalent to an input algorithm. More broadly, this framework and software advances the goal of making mathematics searchable. - Analysis of biased stochastic gradient descent using sequential semidefinite programs (doi, arXiv)
B. Hu, P. Seiler, and L. Lessard. Mathematical Programming, 187(1), 384–408, May 2021.abstract
We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions. - Systematic analysis of distributed optimization algorithms over jointly-connected networks (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3096–3101, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the distributed optimization problem, where a group of agents work together to optimize a common objective by communicating with neighboring agents and performing local computations. Using tools from robust control, we develop a systematic analysis of a large class of distributed algorithms for solving this problem without using restrictive assumptions on the communication network. In particular, we assume only that the network is jointly connected over a finite time horizon (commonly referred to as B-connectivity), which does not require connectivity at each time instant. When applied to the distributed algorithm DIGing, our bounds are orders of magnitude tighter than those available in the literature. - Analysis and design of first-order distributed optimization algorithms over time-varying graphs (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. IEEE Transactions on Control of Network Systems, 7(4), 1597–1608, Dec 2020.abstract
This work concerns the analysis and design of distributed first-order optimization algorithms over time-varying graphs. The goal of such algorithms is to optimize a global function that is the average of local functions using only local computations and communications. Several different algorithms have been proposed that achieve linear convergence to the global optimum when the local functions are strongly convex. We provide a unified analysis that yields a worst-case linear convergence rate as a function of the condition number of the local functions, the spectral gap of the graph, and the parameters of the algorithm. The framework requires solving a small semidefinite program whose size is fixed; it does not depend on the number of local functions or the dimension of the domain. The result is a computationally efficient method for distributed algorithm analysis that enables the rapid comparison, selection, and tuning of algorithms. Finally, we propose a new algorithm, which we call SVL, that is easily implementable and achieves the fastest possible worst-case convergence rate among all algorithms in the family we considered. We support our theoretical analysis with numerical experiments that generate worst-case examples demonstrating the effectiveness of SVL. - Direct synthesis of iterative algorithms with bounds on achievable worst-case convergence rate (doi, arXiv)
L. Lessard and P. Seiler. American Control Conference (ACC), 119–125, Jul 2020. (Denver, CO, virtual)abstract
Iterative first-order methods such as gradient descent and its variants are widely used for solving optimization and machine learning problems. There has been recent interest in analytic or numerically efficient methods for computing worst-case performance bounds for such algorithms, for example over the class of strongly convex loss functions. A popular approach is to assume the algorithm has a fixed size (fixed dimension, or memory) and that its structure is parameterized by one or two hyperparameters, for example a learning rate and a momentum parameter. Then, a Lyapunov function is sought to certify robust stability and subsequent optimization can be performed to find optimal hyperparameter tunings. In the present work, we instead fix the constraints that characterize the loss function and apply techniques from robust control synthesis to directly search over algorithms. This approach yields stronger results than those previously available, since the bounds produced hold over algorithms with an arbitrary, but finite, amount of memory rather than just holding for algorithms with a prescribed structure. - Unified necessary and sufficient conditions for the robust stability of interconnected sector-bounded systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Conference on Decision and Control (CDC), 7690–7695, Dec 2019. (Nice, France)abstract
Classical conditions for ensuring the robust stability of a linear system in feedback with a sector-bounded nonlinearity include the circle, small gain, passivity, and conicity theorems. In this work, we present a similar stability condition, but expressed in terms of operators on a semi-inner product space. This increased generality leads to a clean result that can be specialized in a variety of ways. First, we show how to recover both sufficient and necessary-and-sufficient versions of the aforementioned classical results. Second, we show that suitably choosing the semi-inner product space leads to a new necessary and sufficient condition for exponential stability with a given convergence rate. Finally, in the spirit of classical robust stability analysis, we provide linear matrix inequalities that allow for the efficient verification of the conditions of our theorem. - Integral quadratic constraints: Exact convergence rates and worst-case trajectories (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 7677–7682, Dec 2019. (Nice, France)abstract
We consider a linear time-invariant system in discrete time where the state and input signals satisfy a set of integral quadratic constraints (IQCs). Analogous to the autonomous linear systems case, we define a new notion of spectral radius that exactly characterizes stability of this system. In particular, (i) when the spectral radius is less than one, we show that the system is asymptotically stable for all trajectories that satisfy the IQCs, and (ii) when the spectral radius is equal to one, we construct an unstable trajectory that satisfies the IQCs. Furthermore, we connect our new definition of the spectral radius to the existing literature on IQCs. - A distributed optimization algorithm over time-varying graphs with efficient gradient evaluations (doi, arXiv)
B. Van Scoy and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 357–362, Sep 2019. (Chicago, IL)abstract
We propose an algorithm for distributed optimization over time-varying communication networks. Our algorithm uses an optimized ratio between the number of rounds of communication and gradient evaluations to achieve fast convergence. The iterates converge to the global optimizer at the same rate as centralized gradient descent when measured in terms of the number of gradient evaluations while using the minimum number of communications to do so. Furthermore, the iterates converge at a near-optimal rate when measured in terms of the number of communication rounds. We compare our algorithm with several other known algorithms on a distributed target localization problem. - A canonical form for first-order distributed optimization algorithms (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 4075–4080, Jul 2019. (Philadelphia, PA)abstract
We consider the distributed optimization problem in which a network of agents aims to minimize the average of local functions. To solve this problem, several algorithms have recently been proposed wherein agents perform various combinations of communication with neighbors, local gradient computations, and updates to local state variables. In this paper, we present a canonical form that characterizes any first-order distributed algorithm that can be implemented using a single round of communication and gradient computation per iteration, and where each agent stores up to two state variables. The canonical form features a minimal set of parameters that are both unique and expressive enough to capture any distributed algorithm in this class. The generic nature of our canonical form enables the systematic analysis and design of distributed optimization algorithms. - Lyapunov functions for first-order methods: Tight automated convergence guarantees (url, arXiv)
A. Taylor, B. Van Scoy, and L. Lessard. International Conference on Machine Learning (ICML), 4897–4906, Jul 2018. (Stockholm, Sweden)abstract
We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori & Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)). - Dissipativity theory for accelerating stochastic variance reduction: A unified analysis of SVRG and Katyusha using semidefinite programs (url, arXiv)
B. Hu, S. Wright, and L. Lessard. International Conference on Machine Learning (ICML), 2038–2047, Jul 2018. (Stockholm, Sweden)abstract
Techniques for reducing the variance of gradient estimates used in stochastic programming algorithms for convex finite-sum problems have received a great deal of attention in recent years. By leveraging dissipativity theory from control, we provide a new perspective on two important variance-reduction algorithms: SVRG and its direct accelerated variant Katyusha. Our perspective provides a physically intuitive understanding of the behavior of SVRG-like methods via a principle of energy conservation. The tools discussed here allow us to automate the convergence analysis of SVRG-like methods by capturing their essential properties in small semidefinite programs amenable to standard analysis and computational techniques. Our approach recovers existing convergence results for SVRG and Katyusha and generalizes the theory to alternative parameter choices. We also discuss how our approach complements the linear coupling technique. Our combination of perspectives leads to a better understanding of accelerated variance-reduced stochastic methods for finite-sum problems. - A robust accelerated optimization algorithm for strongly convex functions (doi, arXiv)
S. Cyrus, B. Hu, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 1376–1381, Jun 2018. (Miwaukee, WI)abstract
This work proposes an accelerated first-order algorithm we call the Robust Momentum Method for optimizing smooth strongly convex functions. The algorithm has a single scalar parameter that can be tuned to trade off robustness to gradient noise versus worst-case convergence rate. At one extreme, the algorithm is faster than Nesterov's Fast Gradient Method by a constant factor but more fragile to noise. At the other extreme, the algorithm reduces to the Gradient Method and is very robust to noise. The algorithm design technique is inspired by methods from classical control theory and the resulting algorithm has a simple analytical form. Algorithm performance is verified on a series of numerical simulations in both noise-free and relative gradient noise cases. - Robust convergence analysis of distributed optimization algorithms (doi)
A. Sundararajan, B. Hu, and L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 1206–1212, Oct 2017. (Monticello IL)abstract
We present a unified framework for analyzing the convergence of distributed optimization algorithms by formulating a semidefinite program (SDP) which can be efficiently solved to bound the linear rate of convergence. Two different SDP formulations are considered. First, we formulate an SDP that depends explicitly on the gossip matrix of the network graph. This result provides bounds that depend explicitly on the graph topology, but the SDP dimension scales with the size of the graph. Second, we formulate an SDP that depends implicitly on the gossip matrix via its spectral gap. This result provides coarser bounds, but yields a small SDP that is independent of graph size. Our approach improves upon existing bounds for the algorithms we analyzed, and numerical simulations reveal that our bounds are likely tight. The efficient and automated nature of our analysis makes it a powerful tool for algorithm selection and tuning, and for the discovery of new algorithms as well. - Dissipativity theory for Nesterov’s accelerated method (url, arXiv)
B. Hu and L. Lessard. International Conference on Machine Learning (ICML), 1549–1557, Aug 2017. (Sydney, Australia)abstract
In this paper, we adapt the control theoretic concept of dissipativity theory to provide a natural understanding of Nesterov’s accelerated method. Our theory ties rigorous convergence rate analysis to the physically intuitive notion of energy dissipation. Moreover, dissipativity allows one to efficiently construct Lyapunov functions (either numerically or analytically) by solving a small semidefinite program. Using novel supply rate functions, we show how to recover known rate bounds for Nesterov’s method and we generalize the approach to certify both linear and sublinear rates in a variety of settings. Finally, we link the continuous-time version of dissipativity to recent works on algorithm analysis that use discretizations of ordinary differential equations. - Exponential stability analysis via integral quadratic constraints (arXiv)
R. Boczar, L. Lessard, A. Packard, and B. Recht.abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. This work presents a generalization of the classical IQC results of Megretski and Rantzer [19] that leads to a tractable computational procedure for finding exponential rate certificates that are far less conservative than ones computed from L2 gain bounds alone. An expanded library of IQCs for certifying exponential stability is also provided and the effectiveness of the technique is demonstrated via numerical examples. - Control interpretations for first-order optimization methods (doi, arXiv)
B. Hu and L. Lessard. American Control Conference (ACC), 3114–3119, May 2017. (Seattle, WA)abstract
First-order iterative optimization methods play a fundamental role in large scale optimization and machine learning. This paper presents control interpretations for such optimization methods. First, we give loop-shaping interpretations for several existing optimization methods and show that they are composed of basic control elements such as PID and lag compensators. Next, we apply the small gain theorem to draw a connection between the convergence rate analysis of optimization methods and the input-output gain computations of certain complementary sensitivity functions. These connections suggest that standard classical control synthesis tools may be brought to bear on the design of optimization algorithms. - Analysis and design of optimization algorithms via integral quadratic constraints (doi, arXiv)
L. Lessard, B. Recht, and A. Packard. SIAM Journal on Optimization, 26(1), 57–95, Jan 2016.abstract
This paper develops a new framework to analyze and design iterative optimization algorithms built on the notion of integral quadratic constraints (IQCs) from robust control theory. IQCs provide sufficient conditions for the stability of complicated interconnected systems, and these conditions can be checked by semidefinite programming. We discuss how to adapt IQC theory to study optimization algorithms, proving new inequalities about convex functions and providing a version of IQC theory adapted for use by optimization researchers. Using these inequalities, we derive numerical upper bounds on convergence rates for the Gradient method, the Heavy-ball method, Nesterov's accelerated method, and related variants by solving small, simple semidefinite programming problems. We also briefly show how these techniques can be used to search for optimization algorithms with desired performance characteristics, establishing a new methodology for algorithm design. - Exponential convergence bounds using integral quadratic constraints (doi, arXiv)
R. Boczar, L. Lessard, and B. Recht. IEEE Conference on Decision and Control (CDC), 7516–7521, Dec 2015. (Osaka, Japan)abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. In this work, we present a modification of the classical IQC results of Megretski and Rantzer that leads to a tractable computational procedure for finding exponential rate certificates. We demonstrate the effectiveness of our method via a numerical example. - A general analysis of the convergence of ADMM (url, arXiv)
R. Nishihara, L. Lessard, B. Recht, A. Packard, and M.I. Jordan. International Conference on Machine Learning (ICML), 343–352, Jul 2015. (Lille, France)abstract
We provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex. Our proof is based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014), reducing algorithm convergence to verifying the stability of a dynamical system. This approach generalizes a number of existing results and obviates any assumptions about specific choices of algorithm parameters. On a numerical example, we demonstrate that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances. We complement our upper bound by constructing a nearly-matching lower bound on the worst-case rate of convergence.
Interconnected Systems
- Generalized necessary and sufficient robust boundedness results for feedback systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Transactions on Automatic Control, 68(9), 5693–5698, Sep 2023.abstract
Classical conditions for ensuring the robust stability of a system in feedback with a nonlinearity include passivity, small gain, circle, and conicity theorems. We present a generalized and unified version of these results in an arbitrary semi-inner product space, which avoids many of the technicalities that arise when working in traditional extended spaces. Our general formulation clarifies when the sufficient conditions for robust stability are also necessary, and we show how to construct worst-case scenarios when the sufficient conditions fail to hold. Finally, we show how our general result can be specialized to recover a wide variety of existing results, and explain how properties such as boundedness, causality, linearity, and time-invariance emerge as a natural consequence. - Unified necessary and sufficient conditions for the robust stability of interconnected sector-bounded systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Conference on Decision and Control (CDC), 7690–7695, Dec 2019. (Nice, France)abstract
Classical conditions for ensuring the robust stability of a linear system in feedback with a sector-bounded nonlinearity include the circle, small gain, passivity, and conicity theorems. In this work, we present a similar stability condition, but expressed in terms of operators on a semi-inner product space. This increased generality leads to a clean result that can be specialized in a variety of ways. First, we show how to recover both sufficient and necessary-and-sufficient versions of the aforementioned classical results. Second, we show that suitably choosing the semi-inner product space leads to a new necessary and sufficient condition for exponential stability with a given convergence rate. Finally, in the spirit of classical robust stability analysis, we provide linear matrix inequalities that allow for the efficient verification of the conditions of our theorem. - Compositional performance certification of interconnected systems using ADMM (doi, arXiv)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. Automatica, 61, 55–63, Nov 2015.abstract
A compositional performance certification method is presented for interconnected systems using subsystem dissipativity properties and the interconnection structure. A large-scale optimization problem is formulated to search for the most relevant dissipativity properties. The alternating direction method of multipliers (ADMM) is employed to decompose and solve this problem, and is demonstrated on several examples. - Performance certification of interconnected nonlinear systems using ADMM (doi)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. IEEE Conference on Decision and Control (CDC), 5131–5136, Dec 2014. (Los Angeles, CA)abstract
We present a compositional performance certification method for interconnected systems, using dissipativity properties of the subsystems along with the interconnection structure. To select the most relevant dissipativity properties, we formulate a large-scale optimization problem, and employ the alternating direction method of multipliers (ADMM) for its solution. The dissipativity properties are allowed to depend on an unknown equilibrium, enabling us to certify performance without explicit knowledge of the equilibrium for the interconnection. The effectiveness of the algorithm is demonstrated on two examples, including a model of vehicle platoons. - Performance certification of interconnected systems using decomposition techniques (doi)
C. Meissen, L. Lessard, and A. Packard. American Control Conference (ACC), 5030–5036, Jun 2014. (Portland, OR)abstract
We propose a unified framework for the certification of stability and input-output performance of interconnected dynamical systems. In our approach, we seek local dissipativity certificates for each subsystem such that when they are combined, the performance of the entire interconnected system is certified. We also demonstrate, by the use of numerical simulations, that the Alternating Direction Method of Multipliers (ADMM) is a promising computational approach for solving such problems.
Optimal Decentralized Control
- Optimal control of multi-agent systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Transactions on Automatic Control, 69(8), 5141–5153, Aug 2024.abstract
In this article, we consider a cooperative control problem involving dynamically decoupled linear plants. The (output-feedback) controllers for each plant communicate with each other according to a fixed and known network topology, and each transmission incurs a fixed continuous-time processing delay. We provide an explicit closed-form expression for the optimal decentralized controller and its associated cost under these communication constraints and standard linear quadratic Gaussian (LQG) assumptions for the plants and cost function. We find the exact solution without discretizing or otherwise approximating the delays. We also present an implementation of each sub-controller that is efficiently computable, and is composed of standard finite-dimensional linear time-invariant (LTI) and finite impulse response (FIR) components, and has an intuitive observer-regulator architecture reminiscent of the classical separation principle. - Guaranteed stability margins for decentralized linear quadratic regulators (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Control Systems Letters, 7, 1778–1782, May 2023.abstract
It is well-known that linear quadratic regulators (LQR) enjoy guaranteed stability margins, whereas linear quadratic Gaussian regulators (LQG) do not. In this letter, we consider systems and compensators defined over directed acyclic graphs. In particular, there are multiple decision-makers, each with access to a different part of the global state. In this setting, the optimal LQR compensator is dynamic and is similar to a classical LQG compensator; its state can be interpreted as a filter that predicts unobserved parts of the plant state. We show that when sub-controller input costs are decoupled (but there is possible coupling between sub-controller state costs), the decentralized LQR compensator enjoys similar guaranteed stability margins to classical LQR. However, these guarantees disappear when cost coupling is introduced. - Agent-level optimal LQG control of dynamically decoupled systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 5980–5985, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the problem of controlling a set of dynamically decoupled plants where the plants' subcontrollers communicate with each other according to a fixed and known network topology. We assume the communication to be instantaneous but there is a fixed processing delay associated with incoming transmissions. We provide explicit closed-form expressions for the optimal decentralized controller under these communication constraints and using standard LQG assumptions for the plants and cost function. Although this problem is convex, it is challenging due to the irrationality of continuous-time delays and the decentralized information-sharing pattern. We show that the optimal subcontrollers each have an observer-regulator architecture containing LTI and FIR blocks and we characterize the signals that subcontrollers should transmit to each other across the network. - Explicit agent-level optimal cooperative controllers for dynamically decoupled systems with output feedback (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 8254–8259, Dec 2019. (Nice, France)abstract
We consider a dynamically decoupled network of agents each with a local output-feedback controller. We assume each agent is a node in a directed acyclic graph and the controllers share information along the edges in order to cooperatively optimize a global objective. We develop explicit state-space formulations for the jointly optimal networked controllers that highlight the role of the graph structure. Specifically, we provide generically minimal agent-level implementations of the local controllers along with intuitive interpretations of their states and the information that should be transmitted between them. - Optimal control for LQG systems on graphs—Part I: structural results (arXiv)
A. Nayyar and L. Lessard.abstract
In this two-part paper, we identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple intuitive estimation structure and can be computed efficiently. Roughly, we consider the class of systems for which the coupling of dynamics among subsystems and the inter-controller communication is characterized by the same directed graph. Furthermore, this graph is assumed to be a multitree, that is, its transitive reduction can have at most one directed path connecting each pair of nodes. In this first part, we derive sufficient statistics that may be used to aggregate each controller's growing available information. Each controller must estimate the states of the subsystems that it affects (its descendants) as well as the subsystems that it observes (its ancestors). The optimal control action for a controller is a linear function of the estimate it computes as well as the estimates computed by all of its ancestors. Moreover, these state estimates may be updated recursively, much like a Kalman filter. - Optimal decentralized state-feedback control with sparsity and delays (doi, arXiv)
A. Lamperski and L. Lessard. Automatica, 58, 143–151, Aug 2015.abstract
This work presents the solution to a class of decentralized linear quadratic state-feedback control problems, in which the plant and controller must satisfy the same combination of delay and sparsity constraints. Using a novel decomposition of the noise history, the control problem is split into independent subproblems that are solved using dynamic programming. The approach presented herein both unifies and generalizes many existing results. - Optimal control of two-player systems with output feedback (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 60(8), 2129–2144, Aug 2015.abstract
In this article, we consider a fundamental decentralized optimal control problem, which we call the two-player problem. Two subsystems are interconnected in a nested information pattern, and output feedback controllers must be designed for each subsystem. Several special cases of this architecture have previously been solved, such as the state-feedback case or the case where the dynamics of both systems are decoupled. In this paper, we present a detailed solution to the general case. The structure of the optimal decentralized controller is reminiscent of that of the optimal centralized controller; each player must estimate the state of the system given their available information and apply static control policies to these estimates to compute the optimal controller. The previously solved cases benefit from a separation between estimation and control which allows one to compute the control and estimation gains separately. This feature is not present in general, and some of the gains must be solved for simultaneously. We show that computing the required coupled estimation and control gains amounts to solving a small system of linear equations. - Structural results for partially nested LQG systems over graphs (doi)
A. Nayyar and L. Lessard. American Control Conference (ACC), 5457–5464, Jul 2015. (Chicago, IL)abstract
We identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple and intuitive estimation structure. We consider cases for which the coupling of dynamics among subsystems and the inter-controller communication are characterized by the same directed graph. For the class of graphs known as multitrees, we show that each controller need only estimate the states of the subsystems it affects (its descendants) as well as the subsystems it observes (its ancestors). The optimal control action for each controller is a linear function of the estimate it computes and the estimates computed by its ancestors. Moreover, all state estimates may be updated recursively, much like a Kalman filter. - State-space solution to a minimum-entropy H-infinity optimal control problem with a nested information constraint (doi, arXiv)
L. Lessard. IEEE Conference on Decision and Control (CDC), 4026–4031, Dec 2014. (Los Angeles, CA)abstract
State-space formulas are derived for the minimum-entropy H-infinity controller when the plant and controller are constrained to be block-lower-triangular. Such a controller exists if and only if: the corresponding unstructured problem has a solution, a certain pair of coupled algebraic Riccati equations admits a mutually stabilizing fixed point, and a pair of spectral radius conditions is met. The controller's observer-based structure is also discussed, and a simple numerical approach for solving the coupled Riccati equations is presented. - Structural results and explicit solution for two-player LQG systems on a finite time horizon (doi, arXiv)
L. Lessard and A. Nayyar. IEEE Conference on Decision and Control (CDC), 6542–6549, Dec 2013. (Florence, Italy)abstract
It is well-known that linear dynamical systems with Gaussian noise and quadratic cost (LQG) satisfy a separation principle. Finding the optimal controller amounts to solving separate dual problems; one for control and one for estimation. For the discrete-time finite-horizon case, each problem is a simple forward or backward recursion. In this paper, we consider a generalization of the LQG problem in which there are two controllers. Each controller is responsible for one of two system inputs, but has access to different subsets of the available measurements. Our paper has three main contributions. First, we prove a fundamental structural result: sufficient statistics for the controllers can be expressed as conditional means of the global state. Second, we give explicit state-space formulae for the optimal controller. These formulae are reminiscent of the classical LQG solution with dual forward and backward recursions, but with the important difference that they are intricately coupled. Lastly, we show how these recursions can be solved efficiently, with computational complexity comparable to that of the centralized problem. - A separation principle for decentralized state-feedback optimal control (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 528–534, Oct 2013. (Monticello, IL)abstract
A cooperative control problem is considered in which dynamically decoupled subsystems must control their own states through state feedback in order to optimize a global quadratic cost. The states of the subsystems are coupled only through the cost function and correlated external disturbances. The architecture is truly decentralized; no communication between subsystems or their controllers is permitted. The main result of this paper is that the optimal decentralized controller satisfies a new separation principle that is strikingly similar to the celebrated result from centralized optimal control theory, but does not appear to follow from it. Roughly speaking, the optimal decentralized control strategy for each subsystem is the product of a static control gain and a global state estimate, and each can be separately computed. - On structured realizability and stabilizability of linear systems (doi, arXiv)
L. Lessard, M. Kristalny, and A. Rantzer. American Control Conference (ACC), 5694–5700, Jun 2013. (Washington, D.C.)abstract
We study the notion of structured realizability for linear systems dened over graphs. A stabilizable and detectable realization is structured if the state-space matrices inherit the sparsity pattern of the adjacency matrix of the associated graph. In this paper, we demonstrate that not every structured transfer matrix has a structured realization and we reveal the practical meaning of this fact. We also uncover a close connection between the structured realizability of a plant and whether the plant can be stabilized by a structured controller. In particular, we show that a structured stabilizing controller can only exist when the plant admits a structured realization. Finally, we give a parameterization of all structured stabilizing controllers and show that they always have structured realizations. - Decentralized LQG control of systems with a broadcast architecture (doi)
L. Lessard. IEEE Conference on Decision and Control (CDC), 6241–6246, Dec 2012. (Maui, HI)abstract
In this paper, we consider dynamical subsystems interconnected in a broadcast architecture. In the broadcast-out case, the root node can affect several leaf nodes, but the leaf nodes do not affect any other nodes. Each subsystem is locally controlled via output feedback, and the controllers can communicate according to a structure that parallels the dynamic coupling between subsystems. Explicit state-space realizations for the optimal controllers are derived using a spectral factorization approach. An interpretation of the controller states is also provided in terms of optimal state estimators. We also address the dual broadcast-in case, where there is a single leaf node affected by multiple root nodes. - Optimal control of a fully decentralized quadratic regulator (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 48–54, Oct 2012. (Monticello, IL)abstract
In this paper, we consider a fully decentralized control problem with two dynamically decoupled agents. The objective is to design a state-feedback controller for each agent such that a global quadratic cost is minimized. No communication, explicit or implicit, is permitted between the agents. However, the performance of the agents is coupled via the cost function as well as the process noise. We provide an explicit state-space construction of the optimal controllers, showing that the optimal controllers are dynamic, where the number of states depends on the joint covariance matrix of the process noise. The key step is a novel decomposition of the noise covariance matrix, which allows the convex program associated with the controller synthesis to be split into simpler problems and thereby solved. - Optimal state-feedback control under sparsity and delay constraints (doi)
A. Lamperski and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 204–209, Sep 2012. (Santa Barbara, CA)abstract
This paper presents the solution to a general decentralized state-feedback problem, in which the plant and controller must satisfy the same combination of delay constraints and sparsity constraints. The control problem is decomposed into independent subproblems, which are solved by dynamic programming. In special cases with only sparsity or only delay constraints, the controller reduces to existing solutions. - Optimal controller synthesis for the decentralized two-player problem with output feedback (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 6314–6321, Jun 2012. (Montréal, Canada)abstract
In this paper, we present a controller synthesis algorithm for a decentralized control problem. We consider an architecture in which there are two interconnected linear subsystems. Both controllers seek to optimize a global quadratic cost, despite having access to different subsets of the available measurements. Many special cases of this problem have previously been solved, most notably the state-feedback case. The generalization to output-feedback is nontrivial, as the classical separation principle does not hold. Herein, we present the first explicit state-space realization for an optimal controller for the general two-player problem. - A state-space solution to the two-player decentralized optimal control problem (doi)
L. Lessard and S. Lall. Allerton Conference on Communication, Control, and Computing (Allerton), 1559–1564, Sep 2011. (Monticello, IL)abstract
In this paper, we present an explicit state-space solution to the two-player decentralized optimal control problem. In this problem, there are two interconnected linear systems that seek to optimize a global quadratic cost. Both controllers perform output feedback, but they have access to different subsets of the available measurements. The optimal controller, which was not previously known, has a state dimension equal to twice the state dimension of the original system.
Tractability in the Presence of Information Constraints
- Convexity of decentralized controller synthesis (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 61(10), 3122–3127, Oct 2016.abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the set of Youla parameters is convex. In this paper, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. We further show that under additional assumptions, quadratic invariance is necessary and sufficient for the set of achievable closed-loop maps to be convex. We give two versions of our results. The first applies to bounded linear operators on a Banach space and the second applies to (possibly unstable) causal LTI systems in discrete or continuous time. - An algebraic approach to the control of decentralized systems (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Control of Network Systems, 1(4), 308–317, Dec 2014.abstract
Optimal decentralized controller design is notoriously difficult, but recent research has identified large subclasses of such problems that may be convexified and thus are amenable to solution via efficient numerical methods. One recently discovered sufficient condition for convexity is \emph{quadratic invariance} (QI). Despite the simple algebraic characterization of QI, which relates the plant and controller maps, proving convexity of the set of achievable closed-loop maps requires tools from functional analysis. In this work, we present a new formulation of quadratic invariance that is purely algebraic. While our results are similar in flavor to those from traditional QI theory, they do not follow from that body of work. Furthermore, they are applicable to new types of systems that are difficult to treat using functional analysis. Examples discussed include rational transfer matrices, systems with delays, and multidimensional systems. - On structured realizability and stabilizability of linear systems (doi, arXiv)
L. Lessard, M. Kristalny, and A. Rantzer. American Control Conference (ACC), 5694–5700, Jun 2013. (Washington, D.C.)abstract
We study the notion of structured realizability for linear systems dened over graphs. A stabilizable and detectable realization is structured if the state-space matrices inherit the sparsity pattern of the adjacency matrix of the associated graph. In this paper, we demonstrate that not every structured transfer matrix has a structured realization and we reveal the practical meaning of this fact. We also uncover a close connection between the structured realizability of a plant and whether the plant can be stabilized by a structured controller. In particular, we show that a structured stabilizing controller can only exist when the plant admits a structured realization. Finally, we give a parameterization of all structured stabilizing controllers and show that they always have structured realizations. - Tractability of complex control systems (url)
L. Lessard. Ph.D. Thesis, Stanford University, Aug 2011.abstract
This thesis is divided into two main parts. In the first part, we consider the problem of efficiently computing wavefront estimates for use in adaptive optics hardware on ground-based telescopes. Our contribution is a warm-started single-iteration multigrid algorithm that performs as well as conventional vector-matrix-multiplication methods, but at a fraction of the computational cost. We used numerical simulations to compare our algorithm to a variety of other published methods, and validated our findings at the Palomar Observatory. In the second part, we consider feedback control subject to an information constraint. Using a novel algebraic framework, we are able to prove many structural results, including a new convexity result, in a natural and purely algebraic way. We also develop a new condition we call internal quadratic invariance, under which the controller synthesis can be cast as a convex optimization problem. This describes the most general class of tractable decentralized control problems known to date. Both parts of the thesis fit into the broader question of tractability of complex systems. In the first part we look at a practical example which is difficult because of the large number of sensors and actuators. In the second part, we look at decentralized control, which is difficult because of the non-classical information constraint. - Quadratic invariance is necessary and sufficient for convexity (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5360–5362, Jun 2011. (San Francisco, CA)abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the the set of Youla parameters is convex. In this short note, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. - An algebraic framework for quadratic invariance (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 2698–2703, Dec 2010. (Atlanta, GA)abstract
In this paper, we present a general algebraic framework for analysing decentralized control systems. We consider systems defined by linear fractional functions over a commutative ring. This provides a general algebraic formulation and proof of the main results of quadratic invariance, as well as naturally covering rational multivariable systems, systems with delays, and multidimensional systems. The approach extends to the extended class of internally quadratically invariant systems. - Internal quadratic invariance and decentralized control (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5596–5601, Jun 2010. (Baltimore, MD)abstract
For decentralized control systems with quadratically invariant information constraints, the set of achievable closed-loop maps is affine, and thus the associated minimum-norm controller synthesis problem is amenable to a convex programming approach. In this paper, we show that a strictly broader class of problems we call internally quadratically invariant, also yields an affine set of achievable closed-loop maps. We treat systems represented by rational as well as proper rational transfer functions and present an illustrative example. - Reduction of decentralized control problems to tractable representations (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 1621–1626, Dec 2009. (Shanghai, China)abstract
For decentralized control problems with quadratically invariant information constraints, the optimal controller may be found efficiently. In this paper, we show that there are systems which are not quadratically invariant but reduce to systems that are. We call the requisite property internal quadratic invariance. We present an associated reduction procedure, and illustrate our method with examples.
Adaptive Optics
- Tractability of complex control systems (url)
L. Lessard. Ph.D. Thesis, Stanford University, Aug 2011.abstract
This thesis is divided into two main parts. In the first part, we consider the problem of efficiently computing wavefront estimates for use in adaptive optics hardware on ground-based telescopes. Our contribution is a warm-started single-iteration multigrid algorithm that performs as well as conventional vector-matrix-multiplication methods, but at a fraction of the computational cost. We used numerical simulations to compare our algorithm to a variety of other published methods, and validated our findings at the Palomar Observatory. In the second part, we consider feedback control subject to an information constraint. Using a novel algebraic framework, we are able to prove many structural results, including a new convexity result, in a natural and purely algebraic way. We also develop a new condition we call internal quadratic invariance, under which the controller synthesis can be cast as a convex optimization problem. This describes the most general class of tractable decentralized control problems known to date. Both parts of the thesis fit into the broader question of tractability of complex systems. In the first part we look at a practical example which is difficult because of the large number of sensors and actuators. In the second part, we look at decentralized control, which is difficult because of the non-classical information constraint. - Experimental validation of single-iteration multigrid wavefront reconstruction at the Palomar Observatory (doi)
L. Lessard, D. MacMynowski, M. West, A. Bouchez, and S. Lall. Optics Letters, 33(18), 2047–2049, Sep 2008.abstract
Single-iteration multigrid (SIMG) wavefront reconstruction schemes were implemented and validated on the adaptive optics system at the Hale 5.1m telescope at the Palomar Observatory. Results indicate that even the simplest such method produces a performance indistinguishable from that of the standard least-squares reconstructor for both bright and dim guide stars. SIMG provides a dramatic reduction in computational cost when compared to vector–matrix multiplication and can be implemented in parallel, making it the obvious choice for reconstruction in future large-scale adaptive optics systems. - Warm-started wavefront reconstruction for adaptive optics (doi)
L. Lessard, M. West, D. MacMynowski, and S. Lall. Journal of the Optical Society of America A, 25(5), 1147–1155, May 2008.abstract
Future extreme adaptive optics (ExAO) systems have been suggested with up to 105 sensors and actuators. We analyze the computational speed of iterative reconstruction algorithms for such large systems. We compare a total of 15 different scalable methods, including multigrid, preconditioned conjugate-gradient, and several new variants of these. Simulations on a 128×128 square sensor/actuator geometry using Taylor frozen-flow dynamics are carried out using both open-loop and closed-loop measurements, and algorithms are compared on a basis of the mean squared error and floating-point multiplications required. We also investigate the use of warm starting, where the most recent estimate is used to initialize the iterative scheme. In open-loop estimation or pseudo-open-loop control, warm starting provides a significant computational speedup; almost every algorithm tested converges in one iteration. In a standard closed-loop implementation, using a single iteration per time step, most algorithms give the minimum error even in cold start, and every algorithm gives the minimum error if warm started. The best algorithm is therefore the one with the smallest computational cost per iteration, not necessarily the one with the best quasi-static performance.
Thin Film Optics
- Design considerations for the enhancement of human color vision by breaking binocular redundancy (doi, arXiv)
B. Gundlach, M. Frising, A. Shahsafi, G. Vershbow, C. Wan, J. Salman, B. Rokers, L. Lessard, and M. Kats. Scientific Reports, 8, 11971, Aug 2018.abstract
To see color, the human visual system combines the response of three types of cone cells in the retina—a compressive process that discards a significant amount of spectral information. Here, we present designs based on thin-film optical filters with the goal of enhancing human color vision by breaking its inherent binocular redundancy, providing different spectral content to each eye. We fabricated a set of optical filters that “splits” the response of the short-wavelength cone between the two eyes in individuals with typical trichromatic vision, simulating the presence of approximately four distinct cone types. Such an increase in the number of effective cone types can reduce the prevalence of metamers—pairs of distinct spectra that resolve to the same tristimulus values. This technique may result in an enhancement of spectral perception, with applications ranging from camouflage detection and anti-counterfeiting to new types of artwork and data visualization.
Color Modeling
- Unifying effects of direct and relational associations for visual communication (doi, arXiv)
M.A. Schoenlein, J. Campos, K.J. Lande, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 29(1), 385–395, Jan 2023.abstract
People have expectations about how colors map to concepts in visualizations, and they are better at interpreting visualizations that match their expectations. Traditionally, studies on these expectations (inferred mappings) distinguished distinct factors relevant for visualizations of categorical vs. continuous information. Studies on categorical information focused on direct associations (e.g., mangos are associated with yellows) whereas studies on continuous information focused on relational associations (e.g., darker colors map to larger quantities; dark-is-more bias). We unite these two areas within a single framework of assignment inference. Assignment inference is the process by which people infer mappings between perceptual features and concepts represented in encoding systems. Observers infer globally optimal assignments by maximizing the "merit," or "goodness," of each possible assignment. Previous work on assignment inference focused on visualizations of categorical information. We extend this approach to visualizations of continuous data by (a) broadening the notion of merit to include relational associations and (b) developing a method for combining multiple (sometimes conflicting) sources of merit to predict people's inferred mappings. We developed and tested our model on data from experiments in which participants interpreted colormap data visualizations, representing fictitious data about environmental concepts (sunshine, shade, wild fire, ocean water, glacial ice). We found both direct and relational associations contribute independently to inferred mappings. These results can be used to optimize visualization design to facilitate visual communication. - Context matters: A theory of semantic discriminability for perceptual encoding systems (doi, arXiv)
K. Mukherjee, B. Yin, B.E. Sherman, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 28(1), 697–706, Jan 2022.abstract
People's associations between colors and concepts influence their ability to interpret the meanings of colors in information visualizations. Previous work has suggested such effects are limited to concepts that have strong, specific associations with colors. However, although a concept may not be strongly associated with any colors, its mapping can be disambiguated in the context of other concepts in an encoding system. We articulate this view in semantic discriminability theory, a general framework for understanding conditions determining when people can infer meaning from perceptual features. Semantic discriminability is the degree to which observers can infer a unique mapping between visual features and concepts. Semantic discriminability theory posits that the capacity for semantic discriminability for a set of concepts is constrained by the difference between the feature-concept association distributions across the concepts in the set. We define formal properties of this theory and test its implications in two experiments. The results show that the capacity to produce semantically discriminable colors for sets of concepts was indeed constrained by the statistical distance between color-concept association distributions (Experiment 1). Moreover, people could interpret meanings of colors in bar graphs insofar as the colors were semantically discriminable, even for concepts previously considered "non-colorable" (Experiment 2). The results suggest that colors are more robust for visual communication than previously thought. - Semantic discriminability for visual communication (doi, arXiv)
K.B. Schloss, Z. Leggon, and L. Lessard. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1022–1031, Feb 2021.abstract
To interpret information visualizations, observers must determine how visual features map onto concepts. First and foremost, this ability depends on perceptual discriminability; observers must be able to see the difference between different colors for those colors to communicate different meanings. However, the ability to interpret visualizations also depends on semantic discriminability, the degree to which observers can infer a unique mapping between visual features and concepts, based on the visual features and concepts alone (i.e., without help from verbal cues such as legends or labels). Previous evidence suggested that observers were better at interpreting encoding systems that maximized semantic discriminability (maximizing association strength between assigned colors and concepts while minimizing association strength between unassigned colors and concepts), compared to a system that only maximized color-concept association strength. However, increasing semantic discriminability also resulted in increased perceptual distance, so it is unclear which factor was responsible for improved performance. In the present study, we conducted two experiments that tested for independent effects of semantic distance and perceptual distance on semantic discriminability of bar graph data visualizations. Perceptual distance was large enough to ensure colors were more than just noticeably different. We found that increasing semantic distance improved performance, independent of variation in perceptual distance, and when these two factors were uncorrelated, responses were dominated by semantic distance. These results have implications for navigating trade-offs in color palette design optimization for visual communication. - The relation between color and spatial structure for interpreting colormap data visualizations (doi)
S.C. Sibrel, R. Rathore, L. Lessard, and K.B. Schloss. Journal of Vision, 20(12), 7, Nov 2020.abstract
Interpreting colormap visualizations requires determining how dimensions of color in visualizations map onto quantities in data. People have color-based biases that influence their interpretations of colormaps, such as a dark-is-more bias---darker colors map to larger quantities. Previous studies of color-based biases focused on colormaps with weak data spatial structure, but color-based biases may not generalize to colormaps with strong data spatial structure, like "hotspots" typically found in weather maps and neuroimaging brain maps. There may be a hotspot-is-more bias to infer that colors within hotspots represent larger quantities, which may override the dark-is-more bias. We tested this possibility in four experiments. Participants saw colormaps with hotspots and a legend that specified the color-quantity mapping. Their task was to indicate which side of the colormap depicted larger quantities (left/right). We varied whether the legend specified dark-more mapping or light-more mapping across trials and operationalized a dark-is-more bias as faster response time (RT) when the legend specified dark-more mapping. Experiment 1 demonstrated robust evidence for the dark-is-more bias, without evidence for a hotspot-is-more bias. Experiments 2 to 4 suggest that a hotspot-is-more bias becomes relevant when hotspots are a statistically reliable cue to "more" (i.e., the locus of larger quantities) and when hotspots are more perceptually pronounced. Yet, comparing conditions in which the hotspots were "more," RTs were always faster for dark hotspots than light hotspots. Thus, in the presence of strong spatial cues to the locus of larger quantities, color-based biases still influenced interpretations of colormap data visualizations. - Estimating color-concept associations from image statistics (doi, arXiv)
R. Rathore, Z. Leggon, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 26(1), 1226–1235, Jan 2020.abstract
To interpret the meanings of colors in visualizations of categorical information, people must determine how distinct colors correspond to different concepts. This process is easier when assignments between colors and concepts in visualizations match people’s expectations, making color palettes semantically interpretable. Efforts have been underway to optimize color palette design for semantic interpretablity, but this requires having good estimates of human color-concept associations. Obtaining these data from humans is costly, which motivates the need for automated methods. We developed and evaluated a new method for automatically estimating color-concept associations in a way that strongly correlates with human ratings. Building on prior studies using Google Images, our approach operates directly on Google Image search results without the need for humans in the loop. Specifically, we evaluated several methods for extracting raw pixel content of the images in order to best estimate color-concept associations obtained from human ratings. The most effective method extracted colors using a combination of cylindrical sectors and color categories in color space. We demonstrate that our approach can accurately estimate average human color-concept associations for different fruits using only a small set of images. The approach also generalizes moderately well to more complicated recycling-related concepts of objects that can appear in any color. - Color inference in visual communication: The meaning of colors in recycling (doi)
K.B. Schloss, L. Lessard, C.S. Walmsley, and K. Foley. Cognitive Research: Principles and Implications, 3(5), 1–17, Feb 2018.abstract
People interpret abstract meanings from colors, which makes color a useful perceptual feature for visual communication. This process is complicated, however, because there is seldom a one-to-one correspondence between colors and meanings. One color can be associated with many different concepts (one-to-many mapping) and many colors can be associated with the same concept (many-to-one mapping). We propose that to interpret color-coding systems, people perform assignment inference to determine how colors map onto concepts. We studied assignment inference in the domain of recycling. Participants saw images of colored but unlabeled bins and were asked to indicate which bins they would use to discard different kinds of recyclables and trash. In Experiment 1, we tested two hypotheses for how people perform assignment inference. The local assignment hypothesis predicts that people simply match objects with their most strongly associated color. The global assignment hypothesis predicts that people also account for the association strengths between all other objects and colors within the scope of the color-coding system. Participants discarded objects in bins that optimized the color-object associations of the entire set, which is consistent with the global assignment hypothesis. This sometimes resulted in discarding objects in bins whose colors were weakly associated with the object, even when there was a stronger associated option available. In Experiment 2, we tested different methods for encoding color-coding systems and found that people were better at assignment inference when color sets simultaneously maximized the association strength between assigned color-object parings while minimizing associations between unassigned pairings. Our study provides an approach for designing intuitive color-coding systems that facilitate communication through visual media such as graphs, maps, signs, and artifacts. - Modeling color preference using color space metrics (doi)
K.B. Schloss, L. Lessard, C. Racey, and A.C. Hurlbert. Vision Research, 151, 99–116, Jul 2017.abstract
Studying color preferences provides a means to discover how perceptual experiences map onto cognitive and affective judgments. A challenge is finding a parsimonious way to describe and predict patterns of color preferences, which are complex with rich individual differences. One approach has been to model color preferences using factors from metric color spaces to establish direct correspondences between dimensions of color and preference. Prior work established that substantial, but not all, variance in color preferences could be captured by weights on color space dimensions using multiple linear regression. The question we address here is whether model fits may be improved by using different color metric specifications. We therefore conducted a large-scale analysis of color space models, and focused in-depth analysis on models that differed in color space (cone-contrast vs. CIELAB), coordinate system within the color space (Cartesian vs. cylindrical), and factor degrees (1st degree only, or 1st and 2nd degree). We used k-fold cross validation to avoid over-fitting the data and to ensure fair comparisons across models. The best model was the 2nd-harmonic Lch model ("LabC Cyl2"). Specified in CIELAB space, it included 1st and 2nd harmonics of hue (capturing opponency in hue preferences and simultaneous liking/disliking of both hues on an opponent axis, respectively), lightness, and chroma. These modeling approaches can be used to characterize and compare patterns for group averages and individuals in future datasets on color preference, or other measures in which correspondences between color appearance and cognitive or affective judgments may exist.
Machine Teaching and Data Poisoning
- Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning (url, arXiv)
A. Bose, L. Lessard, M. Fazel, and K. Dvijotham. International Conference on Artificial Intelligence and Statistics (AISTATS), May 2025. (Mai Khao, Thailand)abstract
The rise of foundation models fine-tuned on human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. Existing research on provable certified robustness against data poisoning attacks primarily focuses on certifying robustness for static adversaries who modify a fraction of the dataset used to train the model before the training algorithm is applied. In practice, particularly when learning from human feedback in an online sense, adversaries can observe and react to the learning process and inject poisoned samples that optimize adversarial objectives better than when they are restricted to poisoning a static dataset once, before the learning algorithm is applied. Indeed, it has been shown in prior work that online dynamic adversaries can be significantly more powerful than static ones. We present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean estimation problem and binary classification problems and outline directions for extending this in further work. The code to implement our certificates and replicate our results is available at https://github.com/Avinandan22/Certified-Robustness - Certifying robustness to adaptive data poisoning (url)
A. Bose, M. Udell, L. Lessard, M. Fazel, and K. Dvijotham. FoRLaC workshop at the International Conference on Machine Learning (ICML), Jul 2024. (Vienna, Austria)abstract
The rise of foundational models fine-tuned with human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. While existing research focuses on certifying robustness for static adversaries acting on offline datasets, dynamic attack algorithms have shown to be more effective. Relevant for models with periodic updates where an adversary can adapt based on the algorithm's behavior, such as those in RLHF, we present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean-estimation problem. - Online data poisoning attacks (url, arXiv)
X. Zhang, X. Zhu, and L. Lessard. Conference on Learning for Dynamics and Control (L4DC), 201–210, Jun 2020. (Berkeley, CA, virtual)abstract
We study data poisoning attacks in the online setting where training items arrive sequentially, and the attacker may perturb the current item to manipulate online learning. Importantly, the attacker has no knowledge of future training items nor the data generating distribution. We formulate online data poisoning attack as a stochastic optimal control problem, and solve it with model predictive control and deep reinforcement learning. We also upper bound the suboptimality suffered by the attacker for not knowing the data generating distribution. Experiments validate our control approach in generating near-optimal attacks on both supervised and unsupervised learning tasks. - An optimal control approach to sequential machine teaching (url, arXiv)
L. Lessard, X. Zhang, and X. Zhu. International Conference on Artificial Intelligence and Statistics (AISTATS), 2495–2503, Apr 2019. (Naha, Japan)abstract
Given a sequential learning algorithm and a target model, sequential machine teaching aims to find the shortest training sequence to drive the learning algorithm to the target model. We present the first principled way to find such shortest training sequences. Our key insight is to formulate sequential machine teaching as a time-optimal control problem. This allows us to solve sequential teaching by leveraging key theoretical and computational tools developed over the past 60 years in the optimal control community. Specifically, we study the Pontryagin Maximum Principle, which yields a necessary condition for optimality of a training sequence. We present analytic, structural, and numerical implications of this approach on a case study with a least-squares loss function and gradient descent learner. We compute optimal training sequences for this problem, and although the sequences seem circuitous, we find that they can vastly outperform the best available heuristics for generating training sequences.
Aerospace Applications
- Near-optimal constrained feedback control of nonlinear systems via approximate HJB and control barrier functions (arXiv)
M.A. Shahraki and L. Lessard. Preprint.abstract
This paper presents a two-stage framework for constrained near-optimal feedback control of input-affine nonlinear systems. An approximate value function for the unconstrained control problem is computed offline by solving the Hamilton–Jacobi–Bellman equation. Online, a quadratic program is solved that minimizes the associated approximate Hamiltonian subject to safety constraints imposed via control barrier functions. Our proposed architecture decouples performance from constraint enforcement, allowing constraints to be modified online without recomputing the value function. Validation on a linear 2-state 1D hovercraft and a nonlinear 9-state spacecraft attitude control problem demonstrates near-optimal performance relative to open-loop optimal control benchmarks and superior performance compared to control Lyapunov function-based controllers. - Spacecraft attitude control under reaction wheel constraints using control Lyapunov and control barrier functions
(doi, arXiv)
M.A. Shahraki and L. Lessard. American Control Conference (ACC), 940–945, Jul 2025. (Denver, CO)abstract
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control. - Orthonormal polynomial bases for airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–9, Mar 2024. (Big Sky, MT)abstract
Shape parameterization plays an important role in the process of Aerodynamic Shape Optimization (ASO). A good parameterization allows the optimizer to easily explore the design space at multiple scales, while not adversely affecting the convergence or accuracy of the solution. Over the last several decades, a wide variety of methods have been proposed and characterized for representing aerodynamic shapes, including various polynomial representations, classes of splines, and radial basis functions. Our research into the application of convex optimization techniques to aerospace design has motivated us to revisit and expand these techniques, particularly for those with convex representations. For such parameterizations, when coupled with convex objective functions and constraints, convergence is guaranteed to globally optimal solutions in polynomial time. This guarantee allows for comparison of the results obtained with different shape representations without potentially adverse complications related to selection of initial conditions, local optima, or noisy gradients. In the course of this work we make several contributions. First, we provide orthonormal extensions to several common polynomial bases. Second, we demonstrate efficient objective function representation for supersonic drag minimization problems when employing an integral form of such orthonormal bases. Finally, we develop an analytic solution to a sample benchmark problem and compare the solutions offered by a finite linear combination of design variables with basis functions, to the underlying continuous result. - Shape optimization of hypersonic lifting vehicles via convex relaxation (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–12, Jun 2023. (San Diego, CA)abstract
In this paper, we extend our previous approach to the conceptual design of hypersonic aerospace vehicles, using convex optimization, to include lift as a performance indicator and constraint. Our approach employs the transformation of the lift coefficient formula to a concave function, by change of variables. This results in design problems featuring a convex objection function with a single nonconvex constraint, which we address using convex relaxation. Doing so allows us to approach a richer space of design problems which now include minimum lift constraints and treatment of lift to drag ratio within the objective function. We also apply numerical integration directly to the continuous differentials of both lift and drag, versus the panelized approach of our previous work. We demonstrate this approach on problems incorporating lift and drag, and also demonstrate a bilevel optimization problem to obtain a globally optimal maximum lift to drag coefficient vehicle. - Convex optimization of low observability hypersonic vehicles (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–11, Mar 2023. (Big Sky, MT)abstract
In this paper, we propose an approach to the conceptual design of high speed aerospace vehicles that addresses the coupled behavior of hypersonic aerodynamics and radar cross section. Our approach employs convex optimization, a branch of optimization theory that guarantees global optima for problems expressed with convex objective functions and constraints, combined with cubic splines as cross sectional representations. We demonstrate the process of creating convex surrogates using piecewise linear functions and apply these objective functions to useful test cases, employing a mixture of convex constraints on geometry. We also provide comparisons on the ability to converge to global optima between this type of convex optimization problem to a nonconvex, sequential quadratic programming solver. - Convex optimization of thin airfoils using cubic splines (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA SciTech Forum (SCITECH), 1–15, Jan 2023. (National Harbor, MD)abstract
Convex optimization offers an attractive approach to the physical design of aerospace vehicles due to its ability to quickly identify global optima to relatively complex problems. Building on previous work, we show a significant improvement in optimal supersonic drag coefficient for airfoils parameterized by cubic splines, versus cubic polynomials. In particular we derive convex expressions for the supersonic lift and drag coefficients of thin airfoils expressed as cubic splines, as well as subsonic lift and moment coefficients for the same. We compare the results of globally optimal designs parameterized by cubic polynomials and cubic splines, showing improvements in drag performance of approximately 20-30%. These designs are computed typically in single digit seconds using commodity hardware and open source software tools. - A convex optimization approach to thin airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–15, Jun 2022. (Chicago, IL)abstract
Convex optimization techniques can find global optima in polynomial time for problems with convex objective functions and constraints. In this paper, we demonstrate that objective functions involving lift, drag, and moment coefficients may be represented as convex functions when modeled using thin-airfoil theory. Additionally, we describe a rich set of constraints that may be formulated using convex representations. Finally, we apply these methods to the conceptual design of airfoils defined by cubic polynomials. In doing so, we demonstrate the ability to rapidly obtain globally-optimal, up to the limits of parameterization and physical model fidelity, airfoil shapes that maximize supersonic lift-to-drag ratio, given constraints on subsonic lift and moment coefficients, as well as various geometric constraints. Furthermore, we show the ability to expand this method to a bi-level optimization problem, with a single nonconvex variable, identifying a global solution for problems involving the placement of an internal payload. These problems, with up to one thousand constraints, typically run in single-digit seconds on commodity hardware.
PDE Systems
- Information structures of the Kalman filter for the elastic wave equation (doi)
J. Arbelaiz, E. Jensen, B. Bamieh, A.E. Hosoi, A. Jadbabaie, and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 1–6, Jul 2022. (Zürich, Switzerland)abstract
We study the Kalman Filter for the linear elastic wave equation over the real line with spatially distributed partial state measurements. The dynamics of the filter are described by a spatial convolution operator with asymptotic exponential spatial decay rate. This decay rate dictates how measurements from different spatial locations must be exchanged to implement the filter: faster spatial decay implies local measurements are more relevant and the filter is more "decentralized"; slower decay implies farther measurements also become relevant and the filter is more "centralized". Using dimensional analysis, we demonstrate that this decay rate is a function of one dimensionless group defined from system parameters, such as wave speed and noise variances. We find a critical value of such dimensionless group for which the Kalman Filter is completely decentralized.
Motion Planning
- Model predictive planning: Trajectory planning in obstruction-dense environments for low-agility aircraft (doi, arXiv)
M.T. Wallace, B. Streetman, and L. Lessard. IEEE Conference on Decision and Control (CDC), 8288–8293, Dec 2024. (Milan, Italy)abstract
We present Model Predictive Planning (MPP), a trajectory planner for low-agility vehicles such as a fixed-wing aircraft to navigate obstacle-laden environments. MPP consists of (1) a multi-path planning procedure that identifies candidate paths, (2) a raytracing procedure that generates linear constraints around these paths to enforce obstacle avoidance, and (3) a convex quadratic program that finds a feasible trajectory within these constraints if one exists. Low-agility aircraft cannot track arbitrary paths, so refining a given path into a trajectory that respects the vehicle's limited maneuverability and avoids obstacles often leads to an infeasible optimization problem. The critical feature of MPP is that it efficiently considers multiple candidate paths during the refinement process, thereby greatly increasing the chance of finding a feasible and trackable trajectory. We demonstrate the effectiveness of MPP on a longitudinal aircraft model.
Multi-Agent Systems
- Stealthy optimal range-sensor placement for target localization (doi, arXiv)
M.H. Yoosefian Nooshabadi, R. Sipahi, and L. Lessard. IEEE Control Systems Letters, 8, 2763–2768, Dec 2024.abstract
We study a stealthy range-sensor placement problem where a set of range sensors are to be placed with respect to targets to effectively localize them while maintaining a degree of stealthiness from the targets. This is an open and challenging problem since two competing objectives must be balanced: (a) optimally placing the sensors to maximize their ability to localize the targets and (b) minimizing the information the targets gather regarding the sensors. We provide analytical solutions in 2D for the case of any number of sensors that localize two targets.
Adaptive Algorithms
- Adaptive acceleration without strong convexity priors or restarts (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. OPT2025 workshop at the Conference on Neural Information Processing Systems (NeurIPS), Dec 2025. (San Diego, CA)abstract
In this paper, we propose a parameter-free algorithm for smooth and strongly convex objective problems called NAG-free. To our knowledge, NAG-free is the first adaptive algorithm capable of directly estimating the strong convexity parameter without priors or resorting to restart schemes. We prove that NAG-free converges globally at least as fast gradient descent, and achieves accelerated convergence locally around the minimum if the Hessian is locally smooth at the minimum and other mild additional assumptions hold. We present real-world experiments in which NAG-free is competitive with restart schemes and adapts to better local curvature conditions. - Adaptive backtracking line search (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. International Conference on Learning Representations (ICLR), Apr 2025. (Singapore)abstract
Backtracking line search is foundational in numerical optimization. The basic idea is to adjust the step size of an algorithm by a constant factor until some chosen criterion (e.g. Armijo, Goldstein, Descent Lemma) is satisfied. We propose a new way for adjusting step sizes, replacing the constant factor used in regular backtracking with one that takes into account the degree to which the chosen criterion is violated, without additional computational burden. For convex problems, we prove adaptive backtracking requires fewer adjustments to produce a feasible step size than regular backtracking does for two popular line search criteria: the Armijo condition and the descent lemma. For nonconvex smooth problems, we additionally prove adaptive backtracking enjoys the same guarantees of regular backtracking. Finally, we perform a variety of experiments on over fifteen real world datasets, all of which confirm that adaptive backtracking often leads to significantly faster optimization.
Estimation
- State estimation for linear systems with non-Gaussian measurement noise via dynamic programming (doi, arXiv)
M.H. Yoosefian Nooshabadi and L. Lessard. IEEE Conference on Decision and Control (CDC), 5612–5617, Dec 2025. (Rio de Janeiro, Brazil)abstract
We propose a new recursive estimator for linear dynamical systems under Gaussian process noise and non-Gaussian measurement noise. Specifically, we develop an approximate maximum a posteriori (MAP) estimator using dynamic programming and tools from convex analysis. Our approach does not rely on restrictive noise assumptions and employs a Bellman-like update instead of a Bayesian update. Our proposed estimator is computationally efficient, with only modest overhead compared to a standard Kalman filter. Simulations demonstrate that our estimator achieves lower root mean squared error (RMSE) than the Kalman filter and has comparable performance to state-of-the-art estimators, while requiring significantly less computational power.
Preprints
- Near-optimal constrained feedback control of nonlinear systems via approximate HJB and control barrier functions (arXiv)
M.A. Shahraki and L. Lessard. Preprint.abstract
This paper presents a two-stage framework for constrained near-optimal feedback control of input-affine nonlinear systems. An approximate value function for the unconstrained control problem is computed offline by solving the Hamilton–Jacobi–Bellman equation. Online, a quadratic program is solved that minimizes the associated approximate Hamiltonian subject to safety constraints imposed via control barrier functions. Our proposed architecture decouples performance from constraint enforcement, allowing constraints to be modified online without recomputing the value function. Validation on a linear 2-state 1D hovercraft and a nonlinear 9-state spacecraft attitude control problem demonstrates near-optimal performance relative to open-loop optimal control benchmarks and superior performance compared to control Lyapunov function-based controllers. - Algebraic characterization of equivalence between optimization algorithms (arXiv)
L. Lessard and M. Udell. Preprint.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize equivalence between algorithms that use different oracles that are related via a linear fractional transformation. Prominent examples include linear transformations and function conjugation. - Stochastic LQR design with disturbance preview (arXiv)
J. Liu, L. Lessard, and P. Seiler. Preprint.abstract
This paper considers the discrete-time, stochastic LQR problem with p steps of disturbance preview information where p is finite. We first derive the solution for this problem on a finite horizon with linear, time-varying dynamics and time-varying costs. Next, we derive the solution on the infinite horizon with linear, time-invariant dynamics and time-invariant costs. Our proofs rely on the well-known principle of optimality. We provide an independent proof for the principle of optimality that relies only on nested information structure. Finally, we show that the finite preview controller converges to the optimal noncausal controller as the preview horizon p tends to infinity. We also provide a simple example to illustrate both the finite and infinite horizon results. - The speed-robustness trade-off for first-order methods with additive gradient noise (arXiv)
B. Van Scoy and L. Lessard. Preprint.abstract
We study the trade-off between convergence rate and sensitivity to stochastic additive gradient noise for first-order optimization methods. Ordinary Gradient Descent (GD) can be made fast-and-sensitive or slow-and-robust by increasing or decreasing the stepsize, respectively. However, it is not clear how such a trade-off can be navigated when working with accelerated methods such as Polyak's Heavy Ball (HB) or Nesterov's Fast Gradient (FG) methods, or whether any of these methods can achieve an optimal trade-off. We consider three classes of functions: (1) strongly convex quadratics, (2) smooth strongly convex functions, and (3) nonconvex functions that satisfy a weak notion of strong convexity. For each function class, we present a tractable way to compute convergence rate and sensitivity to additive gradient noise for a broad family of first-order methods, and we present near-Pareto-optimal algorithm designs. Each design consists of a simple analytic update rule with two states of memory, similar to HB and FG. Moreover, each design has a scalar tuning parameter that explicitly trades off convergence rate and sensitivity to additive gradient noise. When tuned as aggressively as possible, our proposed algorithms recover the algorithms with fastest-known convergence rates for each function class. When tuned to be more robust, our algorithms are novel and provide a practical way to control noise sensitivity while maintaining the fastest possible convergence rate. We validate the performance and near-optimality of our designs through numerous numerical simulations. - An automatic system to detect equivalence between algorithms (arXiv)
S. Zhao, L. Lessard, and M. Udell.abstract
When are two algorithms the same? How can we be sure a recently proposed algorithm is novel, and not a minor twist on an existing method? In this paper, we present a framework for reasoning about equivalence between a broad class of iterative algorithms, with a focus on algorithms designed for convex optimization. We propose several notions of what it means for two algorithms to be equivalent, and provide computationally tractable means to detect equivalence. Our main definition, oracle equivalence, states that two algorithms are equivalent if they result in the same sequence of calls to the function oracles (for suitable initialization). Borrowing from control theory, we use state-space realizations to represent algorithms and characterize algorithm equivalence via transfer functions. Our framework can also identify and characterize some algorithm transformations including permutations of the update equations, repetition of the iteration, and conjugation of some of the function oracles in the algorithm. To support the paper, we have developed a software package named Linnaeus that implements the framework to identify other iterative algorithms that are equivalent to an input algorithm. More broadly, this framework and software advances the goal of making mathematics searchable. - Exponential stability analysis via integral quadratic constraints (arXiv)
R. Boczar, L. Lessard, A. Packard, and B. Recht.abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. This work presents a generalization of the classical IQC results of Megretski and Rantzer [19] that leads to a tractable computational procedure for finding exponential rate certificates that are far less conservative than ones computed from L2 gain bounds alone. An expanded library of IQCs for certifying exponential stability is also provided and the effectiveness of the technique is demonstrated via numerical examples. - Optimal control for LQG systems on graphs—Part I: structural results (arXiv)
A. Nayyar and L. Lessard.abstract
In this two-part paper, we identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple intuitive estimation structure and can be computed efficiently. Roughly, we consider the class of systems for which the coupling of dynamics among subsystems and the inter-controller communication is characterized by the same directed graph. Furthermore, this graph is assumed to be a multitree, that is, its transitive reduction can have at most one directed path connecting each pair of nodes. In this first part, we derive sufficient statistics that may be used to aggregate each controller's growing available information. Each controller must estimate the states of the subsystems that it affects (its descendants) as well as the subsystems that it observes (its ancestors). The optimal control action for a controller is a linear function of the estimate it computes as well as the estimates computed by all of its ancestors. Moreover, these state estimates may be updated recursively, much like a Kalman filter.
Journal Papers
- Inclusion of concentrated solar thermal power in Northeastern University's mechanical engineering curriculum (doi)
B. Lynch, U. Coskun, G. Kowalski, L. Lessard, Y. Levendis, B. Maheswaran, H. Metghalchi, H. Noorian, R. Sipahi, Y. Tjiptowidjojo, and Y. Yazicioglu. ASME Open Journal of Engineering, 4, Sep 2025.abstract
Global energy consumption continues to surge, demanding a transition from fossil fuels to cleaner and more sustainable alternatives. A variety of renewable energy sources—solar, wind, hydro, and geothermal—are critical to this transformation, with each offering diverse and regionally adaptive solutions. Among these sources, solar energy has become a dominant force through both photovoltaic (PV) and solar thermal technologies. While PV systems remain the leading force in regard to rapid deployment and decentralized applications, concentrated solar thermal power (CSTP) systems offer a unique advantage of thermal energy storage. Thermal energy storage offers an affordable and efficient form of dispatchable electricity generation and industrial process heat. Despite its benefits, CSTP remains a niche and is vastly underrepresented in engineering curricula across the United States. This article presents a comprehensive initiative at Northeastern University to address this educational gap by systematically institutionalizing CSTP content across nine mechanical engineering courses from the first year through the graduate level. Through hands-on projects, advanced simulations, and heliostat-focused design challenges, engineering students gain practical and theoretical exposure to CSTP technologies. By aligning curriculum development with the goals of the Department of Energy (DOE) and Heliostat Consortium (HelioCon), Northeastern University establishes a replicable model for integrating CSTP education and preparing a new generation of engineers to meet the growing demands of the clean energy transition. - The fastest known first-order method for minimizing twice continuously differentiable smooth strongly convex functions (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 9, 655–660, Jun 2025.abstract
We consider iterative gradient-based optimization algorithms applied to functions that are smooth and strongly convex. The fastest globally convergent algorithm for this class of functions is the Triple Momentum (TM) method. We show that if the objective function is also twice continuously differentiable, a new, faster algorithm emerges, which we call C²-Momentum (C2M). We prove that C2M is globally convergent and that its worst-case convergence rate is strictly faster than that of TM, with no additional computational cost. We validate our theoretical findings with numerical examples, demonstrating that C2M outperforms TM when the objective function is twice continuously differentiable. - Stealthy optimal range-sensor placement for target localization (doi, arXiv)
M.H. Yoosefian Nooshabadi, R. Sipahi, and L. Lessard. IEEE Control Systems Letters, 8, 2763–2768, Dec 2024.abstract
We study a stealthy range-sensor placement problem where a set of range sensors are to be placed with respect to targets to effectively localize them while maintaining a degree of stealthiness from the targets. This is an open and challenging problem since two competing objectives must be balanced: (a) optimally placing the sensors to maximize their ability to localize the targets and (b) minimizing the information the targets gather regarding the sensors. We provide analytical solutions in 2D for the case of any number of sensors that localize two targets. - Optimal control of multi-agent systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Transactions on Automatic Control, 69(8), 5141–5153, Aug 2024.abstract
In this article, we consider a cooperative control problem involving dynamically decoupled linear plants. The (output-feedback) controllers for each plant communicate with each other according to a fixed and known network topology, and each transmission incurs a fixed continuous-time processing delay. We provide an explicit closed-form expression for the optimal decentralized controller and its associated cost under these communication constraints and standard linear quadratic Gaussian (LQG) assumptions for the plants and cost function. We find the exact solution without discretizing or otherwise approximating the delays. We also present an implementation of each sub-controller that is efficiently computable, and is composed of standard finite-dimensional linear time-invariant (LTI) and finite impulse response (FIR) components, and has an intuitive observer-regulator architecture reminiscent of the classical separation principle. - Information-theoretic multi-time-scale partially observable systems with inspiration from leukemia treatment (doi, arXiv)
M.P. Chapman, E. Jensen, S.M. Chan, and L. Lessard. Automatica, 163, 111546, May 2024.abstract
We study a partially observable nonlinear stochastic system with unknown parameters, where the given time scales of the states and measurements may be distinct. The proposed setting is inspired by disease management, particularly leukemia. - Generalized necessary and sufficient robust boundedness results for feedback systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Transactions on Automatic Control, 68(9), 5693–5698, Sep 2023.abstract
Classical conditions for ensuring the robust stability of a system in feedback with a nonlinearity include passivity, small gain, circle, and conicity theorems. We present a generalized and unified version of these results in an arbitrary semi-inner product space, which avoids many of the technicalities that arise when working in traditional extended spaces. Our general formulation clarifies when the sufficient conditions for robust stability are also necessary, and we show how to construct worst-case scenarios when the sufficient conditions fail to hold. Finally, we show how our general result can be specialized to recover a wide variety of existing results, and explain how properties such as boundedness, causality, linearity, and time-invariance emerge as a natural consequence. - Guaranteed stability margins for decentralized linear quadratic regulators (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Control Systems Letters, 7, 1778–1782, May 2023.abstract
It is well-known that linear quadratic regulators (LQR) enjoy guaranteed stability margins, whereas linear quadratic Gaussian regulators (LQG) do not. In this letter, we consider systems and compensators defined over directed acyclic graphs. In particular, there are multiple decision-makers, each with access to a different part of the global state. In this setting, the optimal LQR compensator is dynamic and is similar to a classical LQG compensator; its state can be interpreted as a filter that predicts unobserved parts of the plant state. We show that when sub-controller input costs are decoupled (but there is possible coupling between sub-controller state costs), the decentralized LQR compensator enjoys similar guaranteed stability margins to classical LQR. However, these guarantees disappear when cost coupling is introduced. - Unifying effects of direct and relational associations for visual communication (doi, arXiv)
M.A. Schoenlein, J. Campos, K.J. Lande, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 29(1), 385–395, Jan 2023.abstract
People have expectations about how colors map to concepts in visualizations, and they are better at interpreting visualizations that match their expectations. Traditionally, studies on these expectations (inferred mappings) distinguished distinct factors relevant for visualizations of categorical vs. continuous information. Studies on categorical information focused on direct associations (e.g., mangos are associated with yellows) whereas studies on continuous information focused on relational associations (e.g., darker colors map to larger quantities; dark-is-more bias). We unite these two areas within a single framework of assignment inference. Assignment inference is the process by which people infer mappings between perceptual features and concepts represented in encoding systems. Observers infer globally optimal assignments by maximizing the "merit," or "goodness," of each possible assignment. Previous work on assignment inference focused on visualizations of categorical information. We extend this approach to visualizations of continuous data by (a) broadening the notion of merit to include relational associations and (b) developing a method for combining multiple (sometimes conflicting) sources of merit to predict people's inferred mappings. We developed and tested our model on data from experiments in which participants interpreted colormap data visualizations, representing fictitious data about environmental concepts (sunshine, shade, wild fire, ocean water, glacial ice). We found both direct and relational associations contribute independently to inferred mappings. These results can be used to optimize visualization design to facilitate visual communication. - A universal decomposition for distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Control Systems Letters, 6, 3044–3049, Jun 2022.abstract
In the distributed optimization problem for a multi-agent system, each agent knows a local function and must find a minimizer of the sum of all agents' local functions by performing a combination of local gradient evaluations and communicating information with neighboring agents. We prove that every distributed optimization algorithm can be factored into a centralized optimization method and a second-order consensus estimator, effectively separating the "optimization" and "consensus" tasks. We illustrate this fact by providing the decomposition for many recently proposed distributed optimization algorithms. Conversely, we prove that any optimization method that converges in the centralized setting can be combined with any second-order consensus estimator to form a distributed optimization algorithm that converges in the multi-agent setting. Finally, we describe how our decomposition may lead to a more systematic algorithm design methodology. - The analysis of optimization algorithms: A dissipativity approach (doi, arXiv)
L. Lessard. IEEE Control Systems Magazine, 42(3), 58–72, Jun 2022.abstract
Optimization problems in engineering and applied mathematics are typically solved in an iterative fashion, by systematically adjusting the variables of interest until an adequate solution is found. The iterative algorithms that govern these systematic adjustments can be viewed as a control system. In control systems, the output in measured and the input is adjusted using feedback to drive the error to zero. Similarly, in iterative algorithms, the optimization objective is evaluated and the candidate solution is adjusted to drive it toward the optimal point. Choosing an algorithm that works well for a variety of optimization problems is akin to robust controller design. Just as dissipativity theory can be used to analyze the stability properties of control systems, it can also be used to analyze the convergence properties of iterative algorithms. By defining an appropriate notion of "energy" that dissipates with every iteration of the algorithm, the convergence properties of the algorithm can be characterized. This article formalizes the connection between iterative algorithms and control systems and shows through examples how dissipativity theory can be used to analyze the performance of many classes of optimization algorithms. This control-theoretic viewpoint enables the selection and tuning of optimization algorithms to be performed in an automated and systematic way. - Context matters: A theory of semantic discriminability for perceptual encoding systems (doi, arXiv)
K. Mukherjee, B. Yin, B.E. Sherman, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 28(1), 697–706, Jan 2022.abstract
People's associations between colors and concepts influence their ability to interpret the meanings of colors in information visualizations. Previous work has suggested such effects are limited to concepts that have strong, specific associations with colors. However, although a concept may not be strongly associated with any colors, its mapping can be disambiguated in the context of other concepts in an encoding system. We articulate this view in semantic discriminability theory, a general framework for understanding conditions determining when people can infer meaning from perceptual features. Semantic discriminability is the degree to which observers can infer a unique mapping between visual features and concepts. Semantic discriminability theory posits that the capacity for semantic discriminability for a set of concepts is constrained by the difference between the feature-concept association distributions across the concepts in the set. We define formal properties of this theory and test its implications in two experiments. The results show that the capacity to produce semantically discriminable colors for sets of concepts was indeed constrained by the statistical distance between color-concept association distributions (Experiment 1). Moreover, people could interpret meanings of colors in bar graphs insofar as the colors were semantically discriminable, even for concepts previously considered "non-colorable" (Experiment 2). The results suggest that colors are more robust for visual communication than previously thought. - A unified analysis of first-order methods for smooth games via integral quadratic constraints (url, arXiv)
G. Zhang, X. Bao, L. Lessard, and R. Grosse. Journal of Machine Learning Research, 22(103), 1–39, Jun 2021.abstract
The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to get tight upper bounds of convergence rates. Using this framework, we recover the existing bound for the gradient method (GD), derive sharper bounds for the proximal point method (PPM) and optimistic gradient method (OG), and provide for the first time a global convergence rate for the negative momentum method (NM) with an iteration complexity O(kappa^1.5), which matches its known lower bound. In addition, for time-varying systems, we prove that the gradient method with optimal step size achieves the fastest provable worst-case convergence rate with quadratic Lyapunov functions. Finally, we further extend our analysis to stochastic games and study the impact of multiplicative noise on different algorithms. We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch (in contrast with the stochastic strongly-convex optimization setting, where such acceleration has been demonstrated). However, we exhibit an algorithm which achieves acceleration with two gradient queries per batch. - Toward a scalable upper bound for a CVaR-LQ problem (doi, arXiv)
M.P. Chapman and L. Lessard. IEEE Control Systems Letters, 6, 920–925, Jun 2021.abstract
We consider a linear-quadratic optimal control problem in discrete time with distributional ambiguity, where the random cumulative quadratic cost is assessed via the Conditional Value-at-Risk (CVaR) functional. We take steps toward deriving a scalable value iteration algorithm to upper-bound the solution to this problem. A remarkable finding is that the value functions depend on positive definite matrices, which are computed by a Riccati-like recursion. - Analysis of biased stochastic gradient descent using sequential semidefinite programs (doi, arXiv)
B. Hu, P. Seiler, and L. Lessard. Mathematical Programming, 187(1), 384–408, May 2021.abstract
We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions. - Semantic discriminability for visual communication (doi, arXiv)
K.B. Schloss, Z. Leggon, and L. Lessard. IEEE Transactions on Visualization and Computer Graphics, 27(2), 1022–1031, Feb 2021.abstract
To interpret information visualizations, observers must determine how visual features map onto concepts. First and foremost, this ability depends on perceptual discriminability; observers must be able to see the difference between different colors for those colors to communicate different meanings. However, the ability to interpret visualizations also depends on semantic discriminability, the degree to which observers can infer a unique mapping between visual features and concepts, based on the visual features and concepts alone (i.e., without help from verbal cues such as legends or labels). Previous evidence suggested that observers were better at interpreting encoding systems that maximized semantic discriminability (maximizing association strength between assigned colors and concepts while minimizing association strength between unassigned colors and concepts), compared to a system that only maximized color-concept association strength. However, increasing semantic discriminability also resulted in increased perceptual distance, so it is unclear which factor was responsible for improved performance. In the present study, we conducted two experiments that tested for independent effects of semantic distance and perceptual distance on semantic discriminability of bar graph data visualizations. Perceptual distance was large enough to ensure colors were more than just noticeably different. We found that increasing semantic distance improved performance, independent of variation in perceptual distance, and when these two factors were uncorrelated, responses were dominated by semantic distance. These results have implications for navigating trade-offs in color palette design optimization for visual communication. - Analysis and design of first-order distributed optimization algorithms over time-varying graphs (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. IEEE Transactions on Control of Network Systems, 7(4), 1597–1608, Dec 2020.abstract
This work concerns the analysis and design of distributed first-order optimization algorithms over time-varying graphs. The goal of such algorithms is to optimize a global function that is the average of local functions using only local computations and communications. Several different algorithms have been proposed that achieve linear convergence to the global optimum when the local functions are strongly convex. We provide a unified analysis that yields a worst-case linear convergence rate as a function of the condition number of the local functions, the spectral gap of the graph, and the parameters of the algorithm. The framework requires solving a small semidefinite program whose size is fixed; it does not depend on the number of local functions or the dimension of the domain. The result is a computationally efficient method for distributed algorithm analysis that enables the rapid comparison, selection, and tuning of algorithms. Finally, we propose a new algorithm, which we call SVL, that is easily implementable and achieves the fastest possible worst-case convergence rate among all algorithms in the family we considered. We support our theoretical analysis with numerical experiments that generate worst-case examples demonstrating the effectiveness of SVL. - The relation between color and spatial structure for interpreting colormap data visualizations (doi)
S.C. Sibrel, R. Rathore, L. Lessard, and K.B. Schloss. Journal of Vision, 20(12), 7, Nov 2020.abstract
Interpreting colormap visualizations requires determining how dimensions of color in visualizations map onto quantities in data. People have color-based biases that influence their interpretations of colormaps, such as a dark-is-more bias---darker colors map to larger quantities. Previous studies of color-based biases focused on colormaps with weak data spatial structure, but color-based biases may not generalize to colormaps with strong data spatial structure, like "hotspots" typically found in weather maps and neuroimaging brain maps. There may be a hotspot-is-more bias to infer that colors within hotspots represent larger quantities, which may override the dark-is-more bias. We tested this possibility in four experiments. Participants saw colormaps with hotspots and a legend that specified the color-quantity mapping. Their task was to indicate which side of the colormap depicted larger quantities (left/right). We varied whether the legend specified dark-more mapping or light-more mapping across trials and operationalized a dark-is-more bias as faster response time (RT) when the legend specified dark-more mapping. Experiment 1 demonstrated robust evidence for the dark-is-more bias, without evidence for a hotspot-is-more bias. Experiments 2 to 4 suggest that a hotspot-is-more bias becomes relevant when hotspots are a statistically reliable cue to "more" (i.e., the locus of larger quantities) and when hotspots are more perceptually pronounced. Yet, comparing conditions in which the hotspots were "more," RTs were always faster for dark hotspots than light hotspots. Thus, in the presence of strong spatial cues to the locus of larger quantities, color-based biases still influenced interpretations of colormap data visualizations. - Estimating color-concept associations from image statistics (doi, arXiv)
R. Rathore, Z. Leggon, L. Lessard, and K.B. Schloss. IEEE Transactions on Visualization and Computer Graphics, 26(1), 1226–1235, Jan 2020.abstract
To interpret the meanings of colors in visualizations of categorical information, people must determine how distinct colors correspond to different concepts. This process is easier when assignments between colors and concepts in visualizations match people’s expectations, making color palettes semantically interpretable. Efforts have been underway to optimize color palette design for semantic interpretablity, but this requires having good estimates of human color-concept associations. Obtaining these data from humans is costly, which motivates the need for automated methods. We developed and evaluated a new method for automatically estimating color-concept associations in a way that strongly correlates with human ratings. Building on prior studies using Google Images, our approach operates directly on Google Image search results without the need for humans in the loop. Specifically, we evaluated several methods for extracting raw pixel content of the images in order to best estimate color-concept associations obtained from human ratings. The most effective method extracted colors using a combination of cylindrical sectors and color categories in color space. We demonstrate that our approach can accurately estimate average human color-concept associations for different fruits using only a small set of images. The approach also generalizes moderately well to more complicated recycling-related concepts of objects that can appear in any color. - Design considerations for the enhancement of human color vision by breaking binocular redundancy (doi, arXiv)
B. Gundlach, M. Frising, A. Shahsafi, G. Vershbow, C. Wan, J. Salman, B. Rokers, L. Lessard, and M. Kats. Scientific Reports, 8, 11971, Aug 2018.abstract
To see color, the human visual system combines the response of three types of cone cells in the retina—a compressive process that discards a significant amount of spectral information. Here, we present designs based on thin-film optical filters with the goal of enhancing human color vision by breaking its inherent binocular redundancy, providing different spectral content to each eye. We fabricated a set of optical filters that “splits” the response of the short-wavelength cone between the two eyes in individuals with typical trichromatic vision, simulating the presence of approximately four distinct cone types. Such an increase in the number of effective cone types can reduce the prevalence of metamers—pairs of distinct spectra that resolve to the same tristimulus values. This technique may result in an enhancement of spectral perception, with applications ranging from camouflage detection and anti-counterfeiting to new types of artwork and data visualization. - Color inference in visual communication: The meaning of colors in recycling (doi)
K.B. Schloss, L. Lessard, C.S. Walmsley, and K. Foley. Cognitive Research: Principles and Implications, 3(5), 1–17, Feb 2018.abstract
People interpret abstract meanings from colors, which makes color a useful perceptual feature for visual communication. This process is complicated, however, because there is seldom a one-to-one correspondence between colors and meanings. One color can be associated with many different concepts (one-to-many mapping) and many colors can be associated with the same concept (many-to-one mapping). We propose that to interpret color-coding systems, people perform assignment inference to determine how colors map onto concepts. We studied assignment inference in the domain of recycling. Participants saw images of colored but unlabeled bins and were asked to indicate which bins they would use to discard different kinds of recyclables and trash. In Experiment 1, we tested two hypotheses for how people perform assignment inference. The local assignment hypothesis predicts that people simply match objects with their most strongly associated color. The global assignment hypothesis predicts that people also account for the association strengths between all other objects and colors within the scope of the color-coding system. Participants discarded objects in bins that optimized the color-object associations of the entire set, which is consistent with the global assignment hypothesis. This sometimes resulted in discarding objects in bins whose colors were weakly associated with the object, even when there was a stronger associated option available. In Experiment 2, we tested different methods for encoding color-coding systems and found that people were better at assignment inference when color sets simultaneously maximized the association strength between assigned color-object parings while minimizing associations between unassigned pairings. Our study provides an approach for designing intuitive color-coding systems that facilitate communication through visual media such as graphs, maps, signs, and artifacts. - Modeling color preference using color space metrics (doi)
K.B. Schloss, L. Lessard, C. Racey, and A.C. Hurlbert. Vision Research, 151, 99–116, Jul 2017.abstract
Studying color preferences provides a means to discover how perceptual experiences map onto cognitive and affective judgments. A challenge is finding a parsimonious way to describe and predict patterns of color preferences, which are complex with rich individual differences. One approach has been to model color preferences using factors from metric color spaces to establish direct correspondences between dimensions of color and preference. Prior work established that substantial, but not all, variance in color preferences could be captured by weights on color space dimensions using multiple linear regression. The question we address here is whether model fits may be improved by using different color metric specifications. We therefore conducted a large-scale analysis of color space models, and focused in-depth analysis on models that differed in color space (cone-contrast vs. CIELAB), coordinate system within the color space (Cartesian vs. cylindrical), and factor degrees (1st degree only, or 1st and 2nd degree). We used k-fold cross validation to avoid over-fitting the data and to ensure fair comparisons across models. The best model was the 2nd-harmonic Lch model ("LabC Cyl2"). Specified in CIELAB space, it included 1st and 2nd harmonics of hue (capturing opponency in hue preferences and simultaneous liking/disliking of both hues on an opponent axis, respectively), lightness, and chroma. These modeling approaches can be used to characterize and compare patterns for group averages and individuals in future datasets on color preference, or other measures in which correspondences between color appearance and cognitive or affective judgments may exist. - Convexity of decentralized controller synthesis (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 61(10), 3122–3127, Oct 2016.abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the set of Youla parameters is convex. In this paper, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. We further show that under additional assumptions, quadratic invariance is necessary and sufficient for the set of achievable closed-loop maps to be convex. We give two versions of our results. The first applies to bounded linear operators on a Banach space and the second applies to (possibly unstable) causal LTI systems in discrete or continuous time. - Analysis and design of optimization algorithms via integral quadratic constraints (doi, arXiv)
L. Lessard, B. Recht, and A. Packard. SIAM Journal on Optimization, 26(1), 57–95, Jan 2016.abstract
This paper develops a new framework to analyze and design iterative optimization algorithms built on the notion of integral quadratic constraints (IQCs) from robust control theory. IQCs provide sufficient conditions for the stability of complicated interconnected systems, and these conditions can be checked by semidefinite programming. We discuss how to adapt IQC theory to study optimization algorithms, proving new inequalities about convex functions and providing a version of IQC theory adapted for use by optimization researchers. Using these inequalities, we derive numerical upper bounds on convergence rates for the Gradient method, the Heavy-ball method, Nesterov's accelerated method, and related variants by solving small, simple semidefinite programming problems. We also briefly show how these techniques can be used to search for optimization algorithms with desired performance characteristics, establishing a new methodology for algorithm design. - Compositional performance certification of interconnected systems using ADMM (doi, arXiv)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. Automatica, 61, 55–63, Nov 2015.abstract
A compositional performance certification method is presented for interconnected systems using subsystem dissipativity properties and the interconnection structure. A large-scale optimization problem is formulated to search for the most relevant dissipativity properties. The alternating direction method of multipliers (ADMM) is employed to decompose and solve this problem, and is demonstrated on several examples. - Optimal decentralized state-feedback control with sparsity and delays (doi, arXiv)
A. Lamperski and L. Lessard. Automatica, 58, 143–151, Aug 2015.abstract
This work presents the solution to a class of decentralized linear quadratic state-feedback control problems, in which the plant and controller must satisfy the same combination of delay and sparsity constraints. Using a novel decomposition of the noise history, the control problem is split into independent subproblems that are solved using dynamic programming. The approach presented herein both unifies and generalizes many existing results. - Optimal control of two-player systems with output feedback (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Automatic Control, 60(8), 2129–2144, Aug 2015.abstract
In this article, we consider a fundamental decentralized optimal control problem, which we call the two-player problem. Two subsystems are interconnected in a nested information pattern, and output feedback controllers must be designed for each subsystem. Several special cases of this architecture have previously been solved, such as the state-feedback case or the case where the dynamics of both systems are decoupled. In this paper, we present a detailed solution to the general case. The structure of the optimal decentralized controller is reminiscent of that of the optimal centralized controller; each player must estimate the state of the system given their available information and apply static control policies to these estimates to compute the optimal controller. The previously solved cases benefit from a separation between estimation and control which allows one to compute the control and estimation gains separately. This feature is not present in general, and some of the gains must be solved for simultaneously. We show that computing the required coupled estimation and control gains amounts to solving a small system of linear equations. - An algebraic approach to the control of decentralized systems (doi, arXiv)
L. Lessard and S. Lall. IEEE Transactions on Control of Network Systems, 1(4), 308–317, Dec 2014.abstract
Optimal decentralized controller design is notoriously difficult, but recent research has identified large subclasses of such problems that may be convexified and thus are amenable to solution via efficient numerical methods. One recently discovered sufficient condition for convexity is \emph{quadratic invariance} (QI). Despite the simple algebraic characterization of QI, which relates the plant and controller maps, proving convexity of the set of achievable closed-loop maps requires tools from functional analysis. In this work, we present a new formulation of quadratic invariance that is purely algebraic. While our results are similar in flavor to those from traditional QI theory, they do not follow from that body of work. Furthermore, they are applicable to new types of systems that are difficult to treat using functional analysis. Examples discussed include rational transfer matrices, systems with delays, and multidimensional systems. - Experimental validation of single-iteration multigrid wavefront reconstruction at the Palomar Observatory (doi)
L. Lessard, D. MacMynowski, M. West, A. Bouchez, and S. Lall. Optics Letters, 33(18), 2047–2049, Sep 2008.abstract
Single-iteration multigrid (SIMG) wavefront reconstruction schemes were implemented and validated on the adaptive optics system at the Hale 5.1m telescope at the Palomar Observatory. Results indicate that even the simplest such method produces a performance indistinguishable from that of the standard least-squares reconstructor for both bright and dim guide stars. SIMG provides a dramatic reduction in computational cost when compared to vector–matrix multiplication and can be implemented in parallel, making it the obvious choice for reconstruction in future large-scale adaptive optics systems. - Warm-started wavefront reconstruction for adaptive optics (doi)
L. Lessard, M. West, D. MacMynowski, and S. Lall. Journal of the Optical Society of America A, 25(5), 1147–1155, May 2008.abstract
Future extreme adaptive optics (ExAO) systems have been suggested with up to 105 sensors and actuators. We analyze the computational speed of iterative reconstruction algorithms for such large systems. We compare a total of 15 different scalable methods, including multigrid, preconditioned conjugate-gradient, and several new variants of these. Simulations on a 128×128 square sensor/actuator geometry using Taylor frozen-flow dynamics are carried out using both open-loop and closed-loop measurements, and algorithms are compared on a basis of the mean squared error and floating-point multiplications required. We also investigate the use of warm starting, where the most recent estimate is used to initialize the iterative scheme. In open-loop estimation or pseudo-open-loop control, warm starting provides a significant computational speedup; almost every algorithm tested converges in one iteration. In a standard closed-loop implementation, using a single iteration per time step, most algorithms give the minimum error even in cold start, and every algorithm gives the minimum error if warm started. The best algorithm is therefore the one with the smallest computational cost per iteration, not necessarily the one with the best quasi-static performance.
Peer-Reviewed Conference Proceedings
- State estimation for linear systems with non-Gaussian measurement noise via dynamic programming (doi, arXiv)
M.H. Yoosefian Nooshabadi and L. Lessard. IEEE Conference on Decision and Control (CDC), 5612–5617, Dec 2025. (Rio de Janeiro, Brazil)abstract
We propose a new recursive estimator for linear dynamical systems under Gaussian process noise and non-Gaussian measurement noise. Specifically, we develop an approximate maximum a posteriori (MAP) estimator using dynamic programming and tools from convex analysis. Our approach does not rely on restrictive noise assumptions and employs a Bellman-like update instead of a Bayesian update. Our proposed estimator is computationally efficient, with only modest overhead compared to a standard Kalman filter. Simulations demonstrate that our estimator achieves lower root mean squared error (RMSE) than the Kalman filter and has comparable performance to state-of-the-art estimators, while requiring significantly less computational power. - Adaptive acceleration without strong convexity priors or restarts (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. OPT2025 workshop at the Conference on Neural Information Processing Systems (NeurIPS), Dec 2025. (San Diego, CA)abstract
In this paper, we propose a parameter-free algorithm for smooth and strongly convex objective problems called NAG-free. To our knowledge, NAG-free is the first adaptive algorithm capable of directly estimating the strong convexity parameter without priors or resorting to restart schemes. We prove that NAG-free converges globally at least as fast gradient descent, and achieves accelerated convergence locally around the minimum if the Hessian is locally smooth at the minimum and other mild additional assumptions hold. We present real-world experiments in which NAG-free is competitive with restart schemes and adapts to better local curvature conditions. - Spacecraft attitude control under reaction wheel constraints using control Lyapunov and control barrier functions
(doi, arXiv)
M.A. Shahraki and L. Lessard. American Control Conference (ACC), 940–945, Jul 2025. (Denver, CO)abstract
This paper introduces a novel control strategy for agile spacecraft attitude control, addressing reaction wheel-related input and state constraints. An optimal-decay control Lyapunov function quadratic program stabilizes the system and mitigates chattering at low sampling frequencies, while control barrier functions enforce hard state constraints. Numerical simulations validate the method's practicality and efficiency for real-time agile spacecraft attitude control. - Keeping up with dynamic attackers: Certifying robustness to adaptive online data poisoning (url, arXiv)
A. Bose, L. Lessard, M. Fazel, and K. Dvijotham. International Conference on Artificial Intelligence and Statistics (AISTATS), May 2025. (Mai Khao, Thailand)abstract
The rise of foundation models fine-tuned on human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. Existing research on provable certified robustness against data poisoning attacks primarily focuses on certifying robustness for static adversaries who modify a fraction of the dataset used to train the model before the training algorithm is applied. In practice, particularly when learning from human feedback in an online sense, adversaries can observe and react to the learning process and inject poisoned samples that optimize adversarial objectives better than when they are restricted to poisoning a static dataset once, before the learning algorithm is applied. Indeed, it has been shown in prior work that online dynamic adversaries can be significantly more powerful than static ones. We present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean estimation problem and binary classification problems and outline directions for extending this in further work. The code to implement our certificates and replicate our results is available at https://github.com/Avinandan22/Certified-Robustness - Adaptive backtracking line search (url, arXiv)
J.V. Cavalcanti, L. Lessard, and A.C. Wilson. International Conference on Learning Representations (ICLR), Apr 2025. (Singapore)abstract
Backtracking line search is foundational in numerical optimization. The basic idea is to adjust the step size of an algorithm by a constant factor until some chosen criterion (e.g. Armijo, Goldstein, Descent Lemma) is satisfied. We propose a new way for adjusting step sizes, replacing the constant factor used in regular backtracking with one that takes into account the degree to which the chosen criterion is violated, without additional computational burden. For convex problems, we prove adaptive backtracking requires fewer adjustments to produce a feasible step size than regular backtracking does for two popular line search criteria: the Armijo condition and the descent lemma. For nonconvex smooth problems, we additionally prove adaptive backtracking enjoys the same guarantees of regular backtracking. Finally, we perform a variety of experiments on over fifteen real world datasets, all of which confirm that adaptive backtracking often leads to significantly faster optimization. - Model predictive planning: Trajectory planning in obstruction-dense environments for low-agility aircraft (doi, arXiv)
M.T. Wallace, B. Streetman, and L. Lessard. IEEE Conference on Decision and Control (CDC), 8288–8293, Dec 2024. (Milan, Italy)abstract
We present Model Predictive Planning (MPP), a trajectory planner for low-agility vehicles such as a fixed-wing aircraft to navigate obstacle-laden environments. MPP consists of (1) a multi-path planning procedure that identifies candidate paths, (2) a raytracing procedure that generates linear constraints around these paths to enforce obstacle avoidance, and (3) a convex quadratic program that finds a feasible trajectory within these constraints if one exists. Low-agility aircraft cannot track arbitrary paths, so refining a given path into a trajectory that respects the vehicle's limited maneuverability and avoids obstacles often leads to an infeasible optimization problem. The critical feature of MPP is that it efficiently considers multiple candidate paths during the refinement process, thereby greatly increasing the chance of finding a feasible and trackable trajectory. We demonstrate the effectiveness of MPP on a longitudinal aircraft model. - Certifying robustness to adaptive data poisoning (url)
A. Bose, M. Udell, L. Lessard, M. Fazel, and K. Dvijotham. FoRLaC workshop at the International Conference on Machine Learning (ICML), Jul 2024. (Vienna, Austria)abstract
The rise of foundational models fine-tuned with human feedback from potentially untrusted users has increased the risk of adversarial data poisoning, necessitating the study of robustness of learning algorithms against such attacks. While existing research focuses on certifying robustness for static adversaries acting on offline datasets, dynamic attack algorithms have shown to be more effective. Relevant for models with periodic updates where an adversary can adapt based on the algorithm's behavior, such as those in RLHF, we present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. We give an illustration of the framework for the mean-estimation problem. - Orthonormal polynomial bases for airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–9, Mar 2024. (Big Sky, MT)abstract
Shape parameterization plays an important role in the process of Aerodynamic Shape Optimization (ASO). A good parameterization allows the optimizer to easily explore the design space at multiple scales, while not adversely affecting the convergence or accuracy of the solution. Over the last several decades, a wide variety of methods have been proposed and characterized for representing aerodynamic shapes, including various polynomial representations, classes of splines, and radial basis functions. Our research into the application of convex optimization techniques to aerospace design has motivated us to revisit and expand these techniques, particularly for those with convex representations. For such parameterizations, when coupled with convex objective functions and constraints, convergence is guaranteed to globally optimal solutions in polynomial time. This guarantee allows for comparison of the results obtained with different shape representations without potentially adverse complications related to selection of initial conditions, local optima, or noisy gradients. In the course of this work we make several contributions. First, we provide orthonormal extensions to several common polynomial bases. Second, we demonstrate efficient objective function representation for supersonic drag minimization problems when employing an integral form of such orthonormal bases. Finally, we develop an analytic solution to a sample benchmark problem and compare the solutions offered by a finite linear combination of design variables with basis functions, to the underlying continuous result. - Automated Lyapunov analysis of primal-dual optimization algorithms: An interpolation approach (doi, arXiv)
B. Van Scoy, J.W. Simpson-Porco, and L. Lessard. IEEE Conference on Decision and Control (CDC), 1306–1311, Dec 2023. (Singapore)abstract
Primal-dual algorithms are frequently used for iteratively solving large-scale convex optimization problems. The analysis of such algorithms is usually done on a case-by-case basis, and the resulting guaranteed rates of convergence can be conservative. Here we consider a class of first-order algorithms for linearly constrained convex optimization problems, and provide a linear matrix inequality (LMI) analysis framework for certifying worst-case exponential convergence rates. Our approach builds on recent results for interpolation of convex functions and linear operators, and our LMI directly constructs a Lyapunov function certifying the guaranteed convergence rate. By comparing to rates established in the literature, we show that our approach can certify significantly faster convergence for this family of algorithms. - A tutorial on a Lyapunov-based approach to the analysis of iterative optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3003–3008, Dec 2023. (Singapore)abstract
Iterative gradient-based optimization algorithms are widely used to solve difficult or large-scale optimization problems. There are many algorithms to choose from, such as gradient descent and its accelerated variants such as Polyak's Heavy Ball method or Nesterov's Fast Gradient method. It has long been observed that iterative algorithms can be viewed as dynamical systems, and more recently, as robust controllers. Here, the "uncertainty" in the dynamics is the gradient of the function being optimized. Therefore, worst-case or average-case performance can be analyzed using tools from robust control theory, such as integral quadratic constraints (IQCs). In this tutorial paper, we show how such an analysis can be carried out using an alternative Lyapunov-based approach. This approach recovers the same performance bounds as with IQCs, but with the added benefit of constructing a Lyapunov function. - A tutorial on the structure of distributed optimization algorithms (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3009–3014, Dec 2023. (Singapore)abstract
We consider the distributed optimization problem for a multi-agent system. Here, multiple agents cooperatively optimize an objective by sharing information through a communication network and performing computations. In this tutorial, we provide an overview of the problem, describe the structure of its algorithms, and use simulations to illustrate some algorithmic properties based on this structure. - Shape optimization of hypersonic lifting vehicles via convex relaxation (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–12, Jun 2023. (San Diego, CA)abstract
In this paper, we extend our previous approach to the conceptual design of hypersonic aerospace vehicles, using convex optimization, to include lift as a performance indicator and constraint. Our approach employs the transformation of the lift coefficient formula to a concave function, by change of variables. This results in design problems featuring a convex objection function with a single nonconvex constraint, which we address using convex relaxation. Doing so allows us to approach a richer space of design problems which now include minimum lift constraints and treatment of lift to drag ratio within the objective function. We also apply numerical integration directly to the continuous differentials of both lift and drag, versus the panelized approach of our previous work. We demonstrate this approach on problems incorporating lift and drag, and also demonstrate a bilevel optimization problem to obtain a globally optimal maximum lift to drag coefficient vehicle. - Convex optimization of low observability hypersonic vehicles (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. IEEE Aerospace Conference (AeroConf), 1–11, Mar 2023. (Big Sky, MT)abstract
In this paper, we propose an approach to the conceptual design of high speed aerospace vehicles that addresses the coupled behavior of hypersonic aerodynamics and radar cross section. Our approach employs convex optimization, a branch of optimization theory that guarantees global optima for problems expressed with convex objective functions and constraints, combined with cubic splines as cross sectional representations. We demonstrate the process of creating convex surrogates using piecewise linear functions and apply these objective functions to useful test cases, employing a mixture of convex constraints on geometry. We also provide comparisons on the ability to converge to global optima between this type of convex optimization problem to a nonconvex, sequential quadratic programming solver. - Convex optimization of thin airfoils using cubic splines (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA SciTech Forum (SCITECH), 1–15, Jan 2023. (National Harbor, MD)abstract
Convex optimization offers an attractive approach to the physical design of aerospace vehicles due to its ability to quickly identify global optima to relatively complex problems. Building on previous work, we show a significant improvement in optimal supersonic drag coefficient for airfoils parameterized by cubic splines, versus cubic polynomials. In particular we derive convex expressions for the supersonic lift and drag coefficients of thin airfoils expressed as cubic splines, as well as subsonic lift and moment coefficients for the same. We compare the results of globally optimal designs parameterized by cubic polynomials and cubic splines, showing improvements in drag performance of approximately 20-30%. These designs are computed typically in single digit seconds using commodity hardware and open source software tools. - Absolute stability via lifting and interpolation (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 6217–6223, Dec 2022. (Cancún, Mexico)abstract
We revisit the classical problem of absolute stability; assessing the robust stability of a given linear time-invariant (LTI) plant in feedback with a nonlinearity belonging to some given function class. Standard results typically take the form of sufficient conditions on the LTI plant, the least conservative of which are based on O'Shea--Zames--Falb multipliers. We present an alternative analysis based on lifting and interpolation that directly constructs a Lyapunov function that certifies absolute stability without resorting to frequency-domain inequalities or integral quadratic constraints. In particular, we use linear matrix inequalities to search over Lyapunov functions that are quadratic in the iterates and linear in the corresponding function values of the system in a lifted space. We show that our approach recovers state-of-the-art results on a set of benchmark problems. - Information structures of the Kalman filter for the elastic wave equation (doi)
J. Arbelaiz, E. Jensen, B. Bamieh, A.E. Hosoi, A. Jadbabaie, and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 1–6, Jul 2022. (Zürich, Switzerland)abstract
We study the Kalman Filter for the linear elastic wave equation over the real line with spatially distributed partial state measurements. The dynamics of the filter are described by a spatial convolution operator with asymptotic exponential spatial decay rate. This decay rate dictates how measurements from different spatial locations must be exchanged to implement the filter: faster spatial decay implies local measurements are more relevant and the filter is more "decentralized"; slower decay implies farther measurements also become relevant and the filter is more "centralized". Using dimensional analysis, we demonstrate that this decay rate is a function of one dimensionless group defined from system parameters, such as wave speed and noise variances. We find a critical value of such dimensionless group for which the Kalman Filter is completely decentralized. - A convex optimization approach to thin airfoil design (doi)
D.C. Berkenstock, J.J. Alonso, and L. Lessard. AIAA Aviation Forum (AVIATION), 1–15, Jun 2022. (Chicago, IL)abstract
Convex optimization techniques can find global optima in polynomial time for problems with convex objective functions and constraints. In this paper, we demonstrate that objective functions involving lift, drag, and moment coefficients may be represented as convex functions when modeled using thin-airfoil theory. Additionally, we describe a rich set of constraints that may be formulated using convex representations. Finally, we apply these methods to the conceptual design of airfoils defined by cubic polynomials. In doing so, we demonstrate the ability to rapidly obtain globally-optimal, up to the limits of parameterization and physical model fidelity, airfoil shapes that maximize supersonic lift-to-drag ratio, given constraints on subsonic lift and moment coefficients, as well as various geometric constraints. Furthermore, we show the ability to expand this method to a bi-level optimization problem, with a single nonconvex variable, identifying a global solution for problems involving the placement of an internal payload. These problems, with up to one thousand constraints, typically run in single-digit seconds on commodity hardware. - Near-optimal local convergence of alternating gradient descent-ascent for minimax optimization (url, arXiv)
G. Zhang, Y. Wang, L. Lessard, and R. Grosse. International Conference on Artificial Intelligence and Statistics (AISTATS), 7659–7679, Mar 2022. (Valencia, Spain, virtual)abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result of any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates speed up GAN training significantly and the use of optimism only helps for simultaneous algorithms. - Systematic analysis of distributed optimization algorithms over jointly-connected networks (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 3096–3101, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the distributed optimization problem, where a group of agents work together to optimize a common objective by communicating with neighboring agents and performing local computations. Using tools from robust control, we develop a systematic analysis of a large class of distributed algorithms for solving this problem without using restrictive assumptions on the communication network. In particular, we assume only that the network is jointly connected over a finite time horizon (commonly referred to as B-connectivity), which does not require connectivity at each time instant. When applied to the distributed algorithm DIGing, our bounds are orders of magnitude tighter than those available in the literature. - Agent-level optimal LQG control of dynamically decoupled systems with processing delays (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 5980–5985, Dec 2020. (Jeju Island, Republic of Korea, virtual)abstract
We consider the problem of controlling a set of dynamically decoupled plants where the plants' subcontrollers communicate with each other according to a fixed and known network topology. We assume the communication to be instantaneous but there is a fixed processing delay associated with incoming transmissions. We provide explicit closed-form expressions for the optimal decentralized controller under these communication constraints and using standard LQG assumptions for the plants and cost function. Although this problem is convex, it is challenging due to the irrationality of continuous-time delays and the decentralized information-sharing pattern. We show that the optimal subcontrollers each have an observer-regulator architecture containing LTI and FIR blocks and we characterize the signals that subcontrollers should transmit to each other across the network. - Direct synthesis of iterative algorithms with bounds on achievable worst-case convergence rate (doi, arXiv)
L. Lessard and P. Seiler. American Control Conference (ACC), 119–125, Jul 2020. (Denver, CO, virtual)abstract
Iterative first-order methods such as gradient descent and its variants are widely used for solving optimization and machine learning problems. There has been recent interest in analytic or numerically efficient methods for computing worst-case performance bounds for such algorithms, for example over the class of strongly convex loss functions. A popular approach is to assume the algorithm has a fixed size (fixed dimension, or memory) and that its structure is parameterized by one or two hyperparameters, for example a learning rate and a momentum parameter. Then, a Lyapunov function is sought to certify robust stability and subsequent optimization can be performed to find optimal hyperparameter tunings. In the present work, we instead fix the constraints that characterize the loss function and apply techniques from robust control synthesis to directly search over algorithms. This approach yields stronger results than those previously available, since the bounds produced hold over algorithms with an arbitrary, but finite, amount of memory rather than just holding for algorithms with a prescribed structure. - Online data poisoning attacks (url, arXiv)
X. Zhang, X. Zhu, and L. Lessard. Conference on Learning for Dynamics and Control (L4DC), 201–210, Jun 2020. (Berkeley, CA, virtual)abstract
We study data poisoning attacks in the online setting where training items arrive sequentially, and the attacker may perturb the current item to manipulate online learning. Importantly, the attacker has no knowledge of future training items nor the data generating distribution. We formulate online data poisoning attack as a stochastic optimal control problem, and solve it with model predictive control and deep reinforcement learning. We also upper bound the suboptimality suffered by the attacker for not knowing the data generating distribution. Experiments validate our control approach in generating near-optimal attacks on both supervised and unsupervised learning tasks. - Unified necessary and sufficient conditions for the robust stability of interconnected sector-bounded systems (doi, arXiv)
S. Cyrus and L. Lessard. IEEE Conference on Decision and Control (CDC), 7690–7695, Dec 2019. (Nice, France)abstract
Classical conditions for ensuring the robust stability of a linear system in feedback with a sector-bounded nonlinearity include the circle, small gain, passivity, and conicity theorems. In this work, we present a similar stability condition, but expressed in terms of operators on a semi-inner product space. This increased generality leads to a clean result that can be specialized in a variety of ways. First, we show how to recover both sufficient and necessary-and-sufficient versions of the aforementioned classical results. Second, we show that suitably choosing the semi-inner product space leads to a new necessary and sufficient condition for exponential stability with a given convergence rate. Finally, in the spirit of classical robust stability analysis, we provide linear matrix inequalities that allow for the efficient verification of the conditions of our theorem. - Integral quadratic constraints: Exact convergence rates and worst-case trajectories (doi, arXiv)
B. Van Scoy and L. Lessard. IEEE Conference on Decision and Control (CDC), 7677–7682, Dec 2019. (Nice, France)abstract
We consider a linear time-invariant system in discrete time where the state and input signals satisfy a set of integral quadratic constraints (IQCs). Analogous to the autonomous linear systems case, we define a new notion of spectral radius that exactly characterizes stability of this system. In particular, (i) when the spectral radius is less than one, we show that the system is asymptotically stable for all trajectories that satisfy the IQCs, and (ii) when the spectral radius is equal to one, we construct an unstable trajectory that satisfies the IQCs. Furthermore, we connect our new definition of the spectral radius to the existing literature on IQCs. - Explicit agent-level optimal cooperative controllers for dynamically decoupled systems with output feedback (doi, arXiv)
M. Kashyap and L. Lessard. IEEE Conference on Decision and Control (CDC), 8254–8259, Dec 2019. (Nice, France)abstract
We consider a dynamically decoupled network of agents each with a local output-feedback controller. We assume each agent is a node in a directed acyclic graph and the controllers share information along the edges in order to cooperatively optimize a global objective. We develop explicit state-space formulations for the jointly optimal networked controllers that highlight the role of the graph structure. Specifically, we provide generically minimal agent-level implementations of the local controllers along with intuitive interpretations of their states and the information that should be transmitted between them. - A distributed optimization algorithm over time-varying graphs with efficient gradient evaluations (doi, arXiv)
B. Van Scoy and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 357–362, Sep 2019. (Chicago, IL)abstract
We propose an algorithm for distributed optimization over time-varying communication networks. Our algorithm uses an optimized ratio between the number of rounds of communication and gradient evaluations to achieve fast convergence. The iterates converge to the global optimizer at the same rate as centralized gradient descent when measured in terms of the number of gradient evaluations while using the minimum number of communications to do so. Furthermore, the iterates converge at a near-optimal rate when measured in terms of the number of communication rounds. We compare our algorithm with several other known algorithms on a distributed target localization problem. - A canonical form for first-order distributed optimization algorithms (doi, arXiv)
A. Sundararajan, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 4075–4080, Jul 2019. (Philadelphia, PA)abstract
We consider the distributed optimization problem in which a network of agents aims to minimize the average of local functions. To solve this problem, several algorithms have recently been proposed wherein agents perform various combinations of communication with neighbors, local gradient computations, and updates to local state variables. In this paper, we present a canonical form that characterizes any first-order distributed algorithm that can be implemented using a single round of communication and gradient computation per iteration, and where each agent stores up to two state variables. The canonical form features a minimal set of parameters that are both unique and expressive enough to capture any distributed algorithm in this class. The generic nature of our canonical form enables the systematic analysis and design of distributed optimization algorithms. - An optimal control approach to sequential machine teaching (url, arXiv)
L. Lessard, X. Zhang, and X. Zhu. International Conference on Artificial Intelligence and Statistics (AISTATS), 2495–2503, Apr 2019. (Naha, Japan)abstract
Given a sequential learning algorithm and a target model, sequential machine teaching aims to find the shortest training sequence to drive the learning algorithm to the target model. We present the first principled way to find such shortest training sequences. Our key insight is to formulate sequential machine teaching as a time-optimal control problem. This allows us to solve sequential teaching by leveraging key theoretical and computational tools developed over the past 60 years in the optimal control community. Specifically, we study the Pontryagin Maximum Principle, which yields a necessary condition for optimality of a training sequence. We present analytic, structural, and numerical implications of this approach on a case study with a least-squares loss function and gradient descent learner. We compute optimal training sequences for this problem, and although the sequences seem circuitous, we find that they can vastly outperform the best available heuristics for generating training sequences. - Dissipativity theory for accelerating stochastic variance reduction: A unified analysis of SVRG and Katyusha using semidefinite programs (url, arXiv)
B. Hu, S. Wright, and L. Lessard. International Conference on Machine Learning (ICML), 2038–2047, Jul 2018. (Stockholm, Sweden)abstract
Techniques for reducing the variance of gradient estimates used in stochastic programming algorithms for convex finite-sum problems have received a great deal of attention in recent years. By leveraging dissipativity theory from control, we provide a new perspective on two important variance-reduction algorithms: SVRG and its direct accelerated variant Katyusha. Our perspective provides a physically intuitive understanding of the behavior of SVRG-like methods via a principle of energy conservation. The tools discussed here allow us to automate the convergence analysis of SVRG-like methods by capturing their essential properties in small semidefinite programs amenable to standard analysis and computational techniques. Our approach recovers existing convergence results for SVRG and Katyusha and generalizes the theory to alternative parameter choices. We also discuss how our approach complements the linear coupling technique. Our combination of perspectives leads to a better understanding of accelerated variance-reduced stochastic methods for finite-sum problems. - Lyapunov functions for first-order methods: Tight automated convergence guarantees (url, arXiv)
A. Taylor, B. Van Scoy, and L. Lessard. International Conference on Machine Learning (ICML), 4897–4906, Jul 2018. (Stockholm, Sweden)abstract
We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori & Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)). - A robust accelerated optimization algorithm for strongly convex functions (doi, arXiv)
S. Cyrus, B. Hu, B. Van Scoy, and L. Lessard. American Control Conference (ACC), 1376–1381, Jun 2018. (Miwaukee, WI)abstract
This work proposes an accelerated first-order algorithm we call the Robust Momentum Method for optimizing smooth strongly convex functions. The algorithm has a single scalar parameter that can be tuned to trade off robustness to gradient noise versus worst-case convergence rate. At one extreme, the algorithm is faster than Nesterov's Fast Gradient Method by a constant factor but more fragile to noise. At the other extreme, the algorithm reduces to the Gradient Method and is very robust to noise. The algorithm design technique is inspired by methods from classical control theory and the resulting algorithm has a simple analytical form. Algorithm performance is verified on a series of numerical simulations in both noise-free and relative gradient noise cases. - Robust convergence analysis of distributed optimization algorithms (doi)
A. Sundararajan, B. Hu, and L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 1206–1212, Oct 2017. (Monticello IL)abstract
We present a unified framework for analyzing the convergence of distributed optimization algorithms by formulating a semidefinite program (SDP) which can be efficiently solved to bound the linear rate of convergence. Two different SDP formulations are considered. First, we formulate an SDP that depends explicitly on the gossip matrix of the network graph. This result provides bounds that depend explicitly on the graph topology, but the SDP dimension scales with the size of the graph. Second, we formulate an SDP that depends implicitly on the gossip matrix via its spectral gap. This result provides coarser bounds, but yields a small SDP that is independent of graph size. Our approach improves upon existing bounds for the algorithms we analyzed, and numerical simulations reveal that our bounds are likely tight. The efficient and automated nature of our analysis makes it a powerful tool for algorithm selection and tuning, and for the discovery of new algorithms as well. - Dissipativity theory for Nesterov’s accelerated method (url, arXiv)
B. Hu and L. Lessard. International Conference on Machine Learning (ICML), 1549–1557, Aug 2017. (Sydney, Australia)abstract
In this paper, we adapt the control theoretic concept of dissipativity theory to provide a natural understanding of Nesterov’s accelerated method. Our theory ties rigorous convergence rate analysis to the physically intuitive notion of energy dissipation. Moreover, dissipativity allows one to efficiently construct Lyapunov functions (either numerically or analytically) by solving a small semidefinite program. Using novel supply rate functions, we show how to recover known rate bounds for Nesterov’s method and we generalize the approach to certify both linear and sublinear rates in a variety of settings. Finally, we link the continuous-time version of dissipativity to recent works on algorithm analysis that use discretizations of ordinary differential equations. - Control interpretations for first-order optimization methods (doi, arXiv)
B. Hu and L. Lessard. American Control Conference (ACC), 3114–3119, May 2017. (Seattle, WA)abstract
First-order iterative optimization methods play a fundamental role in large scale optimization and machine learning. This paper presents control interpretations for such optimization methods. First, we give loop-shaping interpretations for several existing optimization methods and show that they are composed of basic control elements such as PID and lag compensators. Next, we apply the small gain theorem to draw a connection between the convergence rate analysis of optimization methods and the input-output gain computations of certain complementary sensitivity functions. These connections suggest that standard classical control synthesis tools may be brought to bear on the design of optimization algorithms. - Exponential convergence bounds using integral quadratic constraints (doi, arXiv)
R. Boczar, L. Lessard, and B. Recht. IEEE Conference on Decision and Control (CDC), 7516–7521, Dec 2015. (Osaka, Japan)abstract
The theory of integral quadratic constraints (IQCs) allows verification of stability and gain-bound properties of systems containing nonlinear or uncertain elements. Gain bounds often imply exponential stability, but it can be challenging to compute useful numerical bounds on the exponential decay rate. In this work, we present a modification of the classical IQC results of Megretski and Rantzer that leads to a tractable computational procedure for finding exponential rate certificates. We demonstrate the effectiveness of our method via a numerical example. - A general analysis of the convergence of ADMM (url, arXiv)
R. Nishihara, L. Lessard, B. Recht, A. Packard, and M.I. Jordan. International Conference on Machine Learning (ICML), 343–352, Jul 2015. (Lille, France)abstract
We provide a new proof of the linear convergence of the alternating direction method of multipliers (ADMM) when one of the objective terms is strongly convex. Our proof is based on a framework for analyzing optimization algorithms introduced in Lessard et al. (2014), reducing algorithm convergence to verifying the stability of a dynamical system. This approach generalizes a number of existing results and obviates any assumptions about specific choices of algorithm parameters. On a numerical example, we demonstrate that minimizing the derived bound on the convergence rate provides a practical approach to selecting algorithm parameters for particular ADMM instances. We complement our upper bound by constructing a nearly-matching lower bound on the worst-case rate of convergence. - Structural results for partially nested LQG systems over graphs (doi)
A. Nayyar and L. Lessard. American Control Conference (ACC), 5457–5464, Jul 2015. (Chicago, IL)abstract
We identify a broad class of decentralized output-feedback LQG systems for which the optimal control strategies have a simple and intuitive estimation structure. We consider cases for which the coupling of dynamics among subsystems and the inter-controller communication are characterized by the same directed graph. For the class of graphs known as multitrees, we show that each controller need only estimate the states of the subsystems it affects (its descendants) as well as the subsystems it observes (its ancestors). The optimal control action for each controller is a linear function of the estimate it computes and the estimates computed by its ancestors. Moreover, all state estimates may be updated recursively, much like a Kalman filter. - Performance certification of interconnected nonlinear systems using ADMM (doi)
C. Meissen, L. Lessard, M. Arcak, and A. Packard. IEEE Conference on Decision and Control (CDC), 5131–5136, Dec 2014. (Los Angeles, CA)abstract
We present a compositional performance certification method for interconnected systems, using dissipativity properties of the subsystems along with the interconnection structure. To select the most relevant dissipativity properties, we formulate a large-scale optimization problem, and employ the alternating direction method of multipliers (ADMM) for its solution. The dissipativity properties are allowed to depend on an unknown equilibrium, enabling us to certify performance without explicit knowledge of the equilibrium for the interconnection. The effectiveness of the algorithm is demonstrated on two examples, including a model of vehicle platoons. - State-space solution to a minimum-entropy H-infinity optimal control problem with a nested information constraint (doi, arXiv)
L. Lessard. IEEE Conference on Decision and Control (CDC), 4026–4031, Dec 2014. (Los Angeles, CA)abstract
State-space formulas are derived for the minimum-entropy H-infinity controller when the plant and controller are constrained to be block-lower-triangular. Such a controller exists if and only if: the corresponding unstructured problem has a solution, a certain pair of coupled algebraic Riccati equations admits a mutually stabilizing fixed point, and a pair of spectral radius conditions is met. The controller's observer-based structure is also discussed, and a simple numerical approach for solving the coupled Riccati equations is presented. - Performance certification of interconnected systems using decomposition techniques (doi)
C. Meissen, L. Lessard, and A. Packard. American Control Conference (ACC), 5030–5036, Jun 2014. (Portland, OR)abstract
We propose a unified framework for the certification of stability and input-output performance of interconnected dynamical systems. In our approach, we seek local dissipativity certificates for each subsystem such that when they are combined, the performance of the entire interconnected system is certified. We also demonstrate, by the use of numerical simulations, that the Alternating Direction Method of Multipliers (ADMM) is a promising computational approach for solving such problems. - Structural results and explicit solution for two-player LQG systems on a finite time horizon (doi, arXiv)
L. Lessard and A. Nayyar. IEEE Conference on Decision and Control (CDC), 6542–6549, Dec 2013. (Florence, Italy)abstract
It is well-known that linear dynamical systems with Gaussian noise and quadratic cost (LQG) satisfy a separation principle. Finding the optimal controller amounts to solving separate dual problems; one for control and one for estimation. For the discrete-time finite-horizon case, each problem is a simple forward or backward recursion. In this paper, we consider a generalization of the LQG problem in which there are two controllers. Each controller is responsible for one of two system inputs, but has access to different subsets of the available measurements. Our paper has three main contributions. First, we prove a fundamental structural result: sufficient statistics for the controllers can be expressed as conditional means of the global state. Second, we give explicit state-space formulae for the optimal controller. These formulae are reminiscent of the classical LQG solution with dual forward and backward recursions, but with the important difference that they are intricately coupled. Lastly, we show how these recursions can be solved efficiently, with computational complexity comparable to that of the centralized problem. - A separation principle for decentralized state-feedback optimal control (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 528–534, Oct 2013. (Monticello, IL)abstract
A cooperative control problem is considered in which dynamically decoupled subsystems must control their own states through state feedback in order to optimize a global quadratic cost. The states of the subsystems are coupled only through the cost function and correlated external disturbances. The architecture is truly decentralized; no communication between subsystems or their controllers is permitted. The main result of this paper is that the optimal decentralized controller satisfies a new separation principle that is strikingly similar to the celebrated result from centralized optimal control theory, but does not appear to follow from it. Roughly speaking, the optimal decentralized control strategy for each subsystem is the product of a static control gain and a global state estimate, and each can be separately computed. - On structured realizability and stabilizability of linear systems (doi, arXiv)
L. Lessard, M. Kristalny, and A. Rantzer. American Control Conference (ACC), 5694–5700, Jun 2013. (Washington, D.C.)abstract
We study the notion of structured realizability for linear systems dened over graphs. A stabilizable and detectable realization is structured if the state-space matrices inherit the sparsity pattern of the adjacency matrix of the associated graph. In this paper, we demonstrate that not every structured transfer matrix has a structured realization and we reveal the practical meaning of this fact. We also uncover a close connection between the structured realizability of a plant and whether the plant can be stabilized by a structured controller. In particular, we show that a structured stabilizing controller can only exist when the plant admits a structured realization. Finally, we give a parameterization of all structured stabilizing controllers and show that they always have structured realizations. - Decentralized LQG control of systems with a broadcast architecture (doi)
L. Lessard. IEEE Conference on Decision and Control (CDC), 6241–6246, Dec 2012. (Maui, HI)abstract
In this paper, we consider dynamical subsystems interconnected in a broadcast architecture. In the broadcast-out case, the root node can affect several leaf nodes, but the leaf nodes do not affect any other nodes. Each subsystem is locally controlled via output feedback, and the controllers can communicate according to a structure that parallels the dynamic coupling between subsystems. Explicit state-space realizations for the optimal controllers are derived using a spectral factorization approach. An interpretation of the controller states is also provided in terms of optimal state estimators. We also address the dual broadcast-in case, where there is a single leaf node affected by multiple root nodes. - Optimal control of a fully decentralized quadratic regulator (doi)
L. Lessard. Allerton Conference on Communication, Control, and Computing (Allerton), 48–54, Oct 2012. (Monticello, IL)abstract
In this paper, we consider a fully decentralized control problem with two dynamically decoupled agents. The objective is to design a state-feedback controller for each agent such that a global quadratic cost is minimized. No communication, explicit or implicit, is permitted between the agents. However, the performance of the agents is coupled via the cost function as well as the process noise. We provide an explicit state-space construction of the optimal controllers, showing that the optimal controllers are dynamic, where the number of states depends on the joint covariance matrix of the process noise. The key step is a novel decomposition of the noise covariance matrix, which allows the convex program associated with the controller synthesis to be split into simpler problems and thereby solved. - Optimal state-feedback control under sparsity and delay constraints (doi)
A. Lamperski and L. Lessard. IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), 204–209, Sep 2012. (Santa Barbara, CA)abstract
This paper presents the solution to a general decentralized state-feedback problem, in which the plant and controller must satisfy the same combination of delay constraints and sparsity constraints. The control problem is decomposed into independent subproblems, which are solved by dynamic programming. In special cases with only sparsity or only delay constraints, the controller reduces to existing solutions. - Optimal controller synthesis for the decentralized two-player problem with output feedback (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 6314–6321, Jun 2012. (Montréal, Canada)abstract
In this paper, we present a controller synthesis algorithm for a decentralized control problem. We consider an architecture in which there are two interconnected linear subsystems. Both controllers seek to optimize a global quadratic cost, despite having access to different subsets of the available measurements. Many special cases of this problem have previously been solved, most notably the state-feedback case. The generalization to output-feedback is nontrivial, as the classical separation principle does not hold. Herein, we present the first explicit state-space realization for an optimal controller for the general two-player problem. - A state-space solution to the two-player decentralized optimal control problem (doi)
L. Lessard and S. Lall. Allerton Conference on Communication, Control, and Computing (Allerton), 1559–1564, Sep 2011. (Monticello, IL)abstract
In this paper, we present an explicit state-space solution to the two-player decentralized optimal control problem. In this problem, there are two interconnected linear systems that seek to optimize a global quadratic cost. Both controllers perform output feedback, but they have access to different subsets of the available measurements. The optimal controller, which was not previously known, has a state dimension equal to twice the state dimension of the original system. - Quadratic invariance is necessary and sufficient for convexity (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5360–5362, Jun 2011. (San Francisco, CA)abstract
In decentralized control problems, a standard approach is to specify the set of allowable decentralized controllers as a closed subspace of linear operators. This then induces a corresponding set of of Youla parameters. Previous work has shown that quadratic invariance of the controller set implies that the the set of Youla parameters is convex. In this short note, we prove the converse. We thereby show that the only decentralized control problems for which the set of Youla parameters is convex are those which are quadratically invariant. - An algebraic framework for quadratic invariance (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 2698–2703, Dec 2010. (Atlanta, GA)abstract
In this paper, we present a general algebraic framework for analysing decentralized control systems. We consider systems defined by linear fractional functions over a commutative ring. This provides a general algebraic formulation and proof of the main results of quadratic invariance, as well as naturally covering rational multivariable systems, systems with delays, and multidimensional systems. The approach extends to the extended class of internally quadratically invariant systems. - Internal quadratic invariance and decentralized control (doi)
L. Lessard and S. Lall. American Control Conference (ACC), 5596–5601, Jun 2010. (Baltimore, MD)abstract
For decentralized control systems with quadratically invariant information constraints, the set of achievable closed-loop maps is affine, and thus the associated minimum-norm controller synthesis problem is amenable to a convex programming approach. In this paper, we show that a strictly broader class of problems we call internally quadratically invariant, also yields an affine set of achievable closed-loop maps. We treat systems represented by rational as well as proper rational transfer functions and present an illustrative example. - Reduction of decentralized control problems to tractable representations (doi)
L. Lessard and S. Lall. IEEE Conference on Decision and Control (CDC), 1621–1626, Dec 2009. (Shanghai, China)abstract
For decentralized control problems with quadratically invariant information constraints, the optimal controller may be found efficiently. In this paper, we show that there are systems which are not quadratically invariant but reduce to systems that are. We call the requisite property internal quadratic invariance. We present an associated reduction procedure, and illustrate our method with examples.
Doctoral Thesis
- Tractability of complex control systems (url)
L. Lessard. Ph.D. Thesis, Stanford University, Aug 2011.abstract
This thesis is divided into two main parts. In the first part, we consider the problem of efficiently computing wavefront estimates for use in adaptive optics hardware on ground-based telescopes. Our contribution is a warm-started single-iteration multigrid algorithm that performs as well as conventional vector-matrix-multiplication methods, but at a fraction of the computational cost. We used numerical simulations to compare our algorithm to a variety of other published methods, and validated our findings at the Palomar Observatory. In the second part, we consider feedback control subject to an information constraint. Using a novel algebraic framework, we are able to prove many structural results, including a new convexity result, in a natural and purely algebraic way. We also develop a new condition we call internal quadratic invariance, under which the controller synthesis can be cast as a convex optimization problem. This describes the most general class of tractable decentralized control problems known to date. Both parts of the thesis fit into the broader question of tractability of complex systems. In the first part we look at a practical example which is difficult because of the large number of sensors and actuators. In the second part, we look at decentralized control, which is difficult because of the non-classical information constraint.