INFORMS’19 in Seattle

I attended the 2019 INFORMS Annual Meeting in Seattle, WA on Oct 20-23, 2019. INFORMS is the leading international association for Operations Research & Analytics professionals. It’s outside my wheelhouse, but I thought it would be nice to attend since Operations Research does touch on optimization, modeling, and data-driven decision-making, which are topics that interest me.

This was my first time attending an INFORMS conference and I was blown away by the sheer size and scope. There were well over 6,000 attendees and the conference had 100 simultaneous tracks over 4 full days. Compare this to CDC (the flagship annual controls conference), which is about 1/4 of the size.

Continue reading

Allerton’19

I attended the 57th annual Allerton Conference on Communication, Control, and Computing in Monticello, IL. At the conference, I presented an invited talk in a session organized by Bin Hu, my former postdoc who is now at UIUC — so it was nice to see Bin again!

My talk was largely about the following recent paper (arxiv link) by my student Akhil Sundararajan, my postdoc Bryan Van Scoy, and me. The manuscript should get published at some point in 2020 once we make our final revisions. The paper describes how gradient-based distributed optimization algorithms can be parameterized and analyzed (in terms of their worst-case convergence rate) using simple semidefinite programs. In the paper, we study the case where the underlying graph is possibly time-varying (with uniformly bounded spectral gap), and we characterize the worst-case performance under any adversarially-varying graph. The benefit of this approach is that it enables rapid analysis, which means it also enables efficient design! Our paper also presents an “optimal” algorithm design (optimal in the sense that it has the fastest worst-case convergence rate) and a comparison between this algorithm and existing algorithms from the literature. Finally, we show that our approach likely produces tight bounds because we can construct worst-case sequences of graphs and functions that achieve the bound.

NecSys’19 in Chicago


I attended the 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys) in Chicago on September 16-17, 2019. NecSys is a small international single-track conference that tends to attract optimization/controls/distributed systems researchers. The last time I attended NecSys was in 2012 (in Santa Barbara) while I was still a postdoc, and I have fond memories of it. It was nice for NecSys to be local again this year, as it meant that I could bring most of my research group along!

At the conference, Bryan presented a poster on his recent distributed optimization work (arxiv link). In this work, Bryan showed that near-optimal gradient-based distributed optimization algorithms can be designed by alternating between simple gradient steps and gossip steps. In the paper, Bryan shows how to tune the ratio of gradient to gossip steps based on the spectral gap of the adjacency matrix of the underlying communication graph (a measure of how “connected” the graph is) in order to achieve the fastest possible convergence rate.

On a personal note, I had the opportunity at NecSys to reconnect with my former PhD advisor Sanjay Lall and my former postdoc advisor Ben Recht (first photo). The fact that NecSys was local also led to a little road trip back to Madison with my students Mruganka, Saman, and Akhil! We stopped for dinner in “Little India”, which is a neighborhood filled with Indian and Pakistani restaurants and shops in a suburb north of Chicago. Akhil guided us to some traditional Chaat (second photo) and it was delicious!

ACC’19 in Philadelphia



I attended the 2019 American Control Conference in Philadelphia, Pennsylvania. The conference was held in downtown Philadelphia, which is filled with beautiful historical buildings, such as the Union League building, pictured on the right.

This was a conference of firsts. Together with my former postdoc Bin Hu, who is now faculty at UIUC, I co-organized my first ACC workshop! Here is a link to the workshop website. It was a full-day workshop on the interplay between optimization, control, and machine learning. Our morning session focused on how tools from controls, differential equations, and dynamical systems could be used to better analyze and design first-order optimization algorithms, which are a key tool in machine learning. Our afternoon session focused on how ideas from machine learning, reinforcement learning, and complexity theory could be used to address large-scale and computationally challenging control problems. We were lucky to have many great speakers, and our workshop was the most popular one at ACC this year — check out the full room on the right! If you’re interested in these topics, the workshop website linked contains info for all the talks that took place (authors, titles, and slides!).

In addition to the workshop, my student Akhil Sundararajan and postdoc Bryan Van Scoy joined me at ACC and Akhil presented a paper by the three of us titled: “A canonical form for first-order distributed optimization algorithms”. In distributed optimization, the goal is for a number of agents (computer servers, robots, mobile phones, etc.) to jointly solve an optimization problem. The problem is global, but the data collection and processing happens locally. The challenge is to design an algorithm that mediates how each agent processes its local data and what is communicated with the other agents so that each agent eventually learns the solution to the global problem. There have been many such algorithms proposed in recent years, but little insight on what makes one algorithm better than another, or how to go about actually designing such an algorithm. To make matters more confusing, there are many equivalent sets of equations that can be used to describe the same algorithm. So it might happen (and it has!) that two researchers invent the same algorithm without realizing it. Our paper makes an effort to classify distributed algorithms, by providing a canonical form that uses as few parameters as possible yet parameterizes a broad class of algorithms (including many existing published algorithms). Our unifying parameterization is a first step toward a more systematic and principled analysis and design of distributed optimization algorithms.

My student Akhil, who is also first author on the paper, presented our work. As a side note, this was Akhil’s first conference presentation, and he did a fantastic job! (see photo on the right). If you’re interested in seeing Akhil’s slides, you can download them here.

L4DC’19 at MIT

I had the pleasure to attend the inaugural Conference on Learning for Dynamics and Control at MIT. The goal of this new conference is to bring together researchers at the interface between machine learning, control, optimization, and related areas, with a focus on addressing scientific and application challenges in real-time physical processes modeled by dynamical or control systems. The conference format is still in flux, but this year’s version was a single-track format with diverse high-profile speakers presenting on a range of topics including: model-predictive control, deep learning, simulating dynamical systems, robotics, traffic systems, and more. There was also two excellent poster sessions. Overall, the conference was a huge success. Over 400 people attended, and all on very short notice! Only about 3 months elapsed between when the idea of this conference was conceived and when the conference actually occurred.

My two highlights of the conference were: (1) Getting to see all the Berkeley folks I knew through my postdoc but haven’t had much of a chance to interact with since then. Many any of them are now moving on to exciting careers in academia and industry. (2) Having the opportunity to meet Alexandre Megretski for the first time and chat about robust control and IQCs. Megretski wrote some of the seminal work in these areas and it was an honor to meet him in person.

Kudos to the organizers (Ali Jadbabaie, George Pappas, Pablo Parrilo, Ben Recht, and Melanie Zellinger) for putting together such an amazing conference on such short notice! I’m looking forward to attending again next year!

8th Midwest Workshop on Control and Game Theory

I recently attended the 8th Midwest Workshop on Control and Game Theory hosted by the University of Washington in St. Louis. The two-day workshop brought together faculty, postdocs, and grad students in the fields of controls, optimization, game theory, and economics. This was my first time visiting Wash U, and it’s a beautiful campus! The buildings are all built of sandstone, and it reminds me of Queen’s University in Kingston, Canada.

I quite like smaller workshop formats such as this one, so much less hectic than the large conferences I attend like CDC or ICML. It’s also nice to meet and network with other researchers in the Midwest!

At the workshop, I have a 30-minute talk on using a robust controls perspective to analyzing and designing optimization algorithms. The video of my talk is available on YouTube.

CDC’18 in Miami Beach

I attended the 2018 Conference on Decision and Control in Miami Beach, Florida. A welcome change of weather from the looming winter in Madison! The photo above was taken from the terrace of the conference venue.

At the conference, I took part in the Workshop on the Intersections of Machine Learning and Parameter Estimation in Control organized by Travis Gibson, Joseph Gaudio, and Anuradha Annaswamy of MIT. This general field at the intersection of machine learning, optimization, and control seems to be gaining momentum. Indeed, there was another workshop on very similar topics that was held concurrently and it was a shame I couldn’t attend both!

I’ll point out two highlights of the conference. Besides attending sessions on robust control, distributed optimization, decentralized control, and machine learning, I also strayed outside my comfort zone and attended a tutorial session on “Control-Theoretic Methods for Biological Networks”. A growing group of controls researchers have taken interest in biological systems, and I have to say I was both impressed at the progress in this field and humbled at how much there is left to do. Biological systems are incredibly complex and it seems the only way to advance the ball (and to be taken seriously) in this area is to work at the problem from both sides: performing biological experiments in a wet-lab environment and doing some serious mathematical modeling and analysis. I thought Hana El-Samad (UCSF) and Eduardo Sontag (Northeastern) both gave very insightful presentations.


The second highlight was a special session commemorating the life and contributions of Bruce Francis to the field of robust control. Bruce passed away in March of this year and it was nice to see so many people (standing room only) come to celebrate his life and accomplishments at CDC. Bruce’s (arguably) biggest contribution was in developing a state-space solution for the H-infinity control problem. The famous “DGKF” paper is among the most highly-cited papers in the field. The other three co-authors (Doyle, Glover, and Khargonekar) were all in attendance and shared some heartfelt words. The first photo on the right was taken during the CDC special session; John Doyle was recounting some of the initial (mixed) reactions to the DGKF paper, including some choice quotes (see the slide!). The first (and last) time I met Bruce Francis was at the Caltech CDS20 workshop five years ago. Incidentally, all four authors of DGKF were present at that event (see second photo on the right — authors are in the DGKF order!). Although Bruce and I had just met, we bonded over the fact that we are both Canadian, and we both enjoy playing Frisbee! Bruce was also a Professor at the University of Toronto, which is where I did my undergrad. Bruce was soft-spoken, kind, and humble — sentiments echoed by many of his colleagues at CDC this year. He will be missed.

This year’s Bode lecture was delivered by Mark Spong of UT Dallas. The topic was robotics, an area that is not really in my wheelhouse, but I thought Mark did a wonderful job of explaining some basic tools and challenges from the field. What amazed me from the presentation was how simple passivity-based control can be used to create very natural-looking movement patterns that are robust to disturbances. Mark showed examples of a self-righting inverted pendulum and various walking robots, all of which were designed using energy conservation principles. If you’re looking for a well-delivered introductory lecture on robotics, I highly recommend this one! The lecture was recorded and should be posted online here soon.

New preprint: Machine teaching!

Hot off the presses, a new paper by me, Xuezhou Zhang, and Xiaojin (Jerry) Zhu. The paper is titled “An Optimal Control Approach to Sequential Machine Teaching” and is now available on arXiv.

The paper is about “machine teaching”, which can be thought of as the dual problem to “machine learning”. The general paradigm is that we are trying to learn a model (described by a set of parameters) that characterizes some process. We observe data samples drawn from this process and the learning algorithm uses the data to refine its model parameters in an iterative fashion. Machine learning is the forward problem: using the given data to update the parameters. Different choices of algorithms can lead to different learning rates, for example. Machine teaching is the scenario where you know the model you’re trying to learn and you know the learner’s algorithm. Your task is to select the sequence of data samples that leads to the fastest possible learning.

Machine teaching comes up when investigating issues of security and robustness of machine learning algorithms. For example: how susceptible is our algorithm to a “data poisoning attack”, where false data is injected to steer our algorithm off course? Machine teaching also has potential use in designing better curricula for classroom teaching tasks.

In this paper, we apply optimal control theory to the simple case of an SGD learner with a squared loss and training examples that are constrained to satisfy a norm bound. We prove that there always exists an optimal trajectory through parameter space that lies in a 2D subspace, regardless of the ambient dimension of the parameter space. In other words, optimal teaching trajectories can be simple even when there are millions of parameters to learn. Also, we show through numerical experiments that the optimal teaching strategy can be quite counter-intuitive in many cases — optimal trajectories often take a circuitous route through parameter space and yet can perform arbitrarily better than obvious heuristics such as a “one-step greedy approach” or a “straight line” approach.

Princeton Day of Optimization 2018

I attended the inaugural Princeton Day of Optimization hosted by the Princeton Operations Research and Financial Engineering (ORFE). The theme of the day was “The intersection of optimization, control, and machine learning,” and the event did not disappoint! There were six keynote talks on a variety of topics related to the theme and a poster session over lunch showcasing the works of many of the graduate student attendees. The event was organized by my colleague Amir Ali Ahmadi and I must say he did a fantastic job. Putting together an event like this is a tremendous amount of work and everything happened smoothly. The Princeton Day of Optimization will be held every two years and I’m looking forward to the next installment! This was my first time seeing the Princeton campus and I took the opportunity to walk around and admire some of the beautiful historic buildings on the campus. Pictured right is the Princeton University Chapel.

New preprint: A canonical form for distributed optimization algorithms

We have just posted a preprint titled: “A Canonical Form for First-Order Distributed Optimization Algorithms”. It is available here and on arXiv. This is joint work with my student Akhil Sundararajan, my postdoc Bryan Van Scoy, and myself. Our work provides a canonical form for distributed optimization algorithms that will enable more streamlined analysis and design.

There has been much recent interest in the analysis and design of distributed optimization algorithms. In this context, “distributed” means that the computation happens on different computers across a network. Each computing node may have access to different local information. The goal is for each node to arrive at the solution of a global optimization problem that involves everybody’s data, whilst each node only performs local computation and communicates only with nearby nodes. All of this should happen without ever needing to store all the data in one place. This paradigm makes sense in cases where the individual nodes are remote and have limited computing capability, or we are trying to conserve power by transmitting and computing as little as possible. Also, we may not want a single node to gather and store all the available information for privacy or security reasons.

In the past 5 years, other researchers have proposed a variety of solutions to this problem. Proposed methods involve local computation, local memory updates, and communication with nearby nodes, in some prescribed order. Because there is a number of different ways to perform these actions, there is a combinatorial explosion of possible designs, analyses, and outcomes.

Our paper provides a way of streamlining research efforts in this area: we showed that in fact, there aren’t that many different possible distributed optimization algorithms. We were able to parameterize a broad sub-family of algorithms using only 5 scalar parameters. In doing so, we were able to neatly categorize existing methods and in some cases, show that different established algorithms were in fact the same algorithm!

Our hope is that this work will help streamline research efforts in distributed optimization in moving toward a more principled understanding of analysis and design.