Many problems in control, engineering, management, and machine learning can be cast as optimization problems: minimizing an objective function subject to constraints. An optimization algorithm is an iterative procedure that generates a sequence of candidate points. The algorithm is convergent if this sequence converges to a local optimum from any initial condition.

Minimizing the sum of a quadratic and an L1 norm. All executions converge to the unique optimal solution (black circle).
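As a concrete instance of this kind of problem, the following is a minimal sketch of Douglas-Rachford splitting applied to 0.5 x'Qx - c'x + lam * ||x||_1. The function names, the stepsize `gamma`, and the iteration count are illustrative assumptions, not the settings used to produce the figure.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def douglas_rachford_l1(Q, c, lam, gamma=1.0, iters=500):
    """Minimize 0.5 x'Qx - c'x + lam * ||x||_1 (Q symmetric positive definite).

    gamma and iters are hypothetical defaults chosen for illustration.
    """
    n = len(c)
    z = np.zeros(n)
    M = gamma * Q + np.eye(n)  # matrix appearing in the prox of the quadratic
    for _ in range(iters):
        x = np.linalg.solve(M, gamma * c + z)        # prox of the quadratic at z
        y = soft_threshold(2 * x - z, gamma * lam)   # prox of the L1 term at the reflection
        z = z + y - x                                # update of the governing sequence
    return x

# Small example with a non-diagonal quadratic.
Q_demo = np.array([[2.0, 0.5], [0.5, 1.0]])
c_demo = np.array([1.0, -3.0])
x_demo = douglas_rachford_l1(Q_demo, c_demo, lam=0.5)
```

When Q is the identity, the minimizer is available in closed form as `soft_threshold(c, lam)`, which gives a quick sanity check of the iteration.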

Optimization algorithms can be understood through a system-theoretic lens as the interconnection of a linear system and a set of memoryless nonlinearities (algodynamics/system theory of algorithms). The linear system governs the memory and stepsize rules used by the algorithm, and the nonlinearities are the gradient/subgradient/operator oracles defining the optimization problem.

Algorithms to minimize the sum of two functions. Left: Projected Gradient Descent algorithm. Right: Douglas-Rachford.
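The interconnection view can be sketched in a few lines of code. The example below is an illustration under stated assumptions: it uses gradient descent and the heavy-ball method with a single gradient oracle (simpler than the two-oracle algorithms in the figure), and the helper names `run_algorithm`, `gradient_descent_system`, and `heavy_ball_system` are hypothetical.

```python
import numpy as np

def run_algorithm(A, B, C, oracle, xi0, iters):
    """Simulate xi+ = A xi + B u, y = C xi, u = oracle(y)."""
    xi = np.asarray(xi0, dtype=float)
    for _ in range(iters):
        y = C @ xi          # point at which the oracle is queried
        u = oracle(y)       # memoryless nonlinearity (here: a gradient)
        xi = A @ xi + B @ u # linear system carrying memory and stepsizes
    return C @ xi

def gradient_descent_system(n, alpha):
    # x+ = x - alpha * grad f(x): no memory beyond the current iterate.
    I = np.eye(n)
    return I, -alpha * I, I

def heavy_ball_system(n, alpha, beta):
    # x+ = x + beta * (x - x_prev) - alpha * grad f(x), state = (x, x_prev).
    I, Z = np.eye(n), np.zeros((n, n))
    A = np.block([[(1 + beta) * I, -beta * I], [I, Z]])
    B = np.vstack([-alpha * I, Z])
    C = np.hstack([I, Z])
    return A, B, C

# Illustrative quadratic with curvature between m = 1 and L = 10.
Q = np.diag([1.0, 10.0])
grad = lambda y: Q @ y
x0 = np.array([1.0, -1.0])

A, B, C = gradient_descent_system(2, alpha=2 / 11)
x_gd = run_algorithm(A, B, C, grad, x0, iters=50)

alpha_hb = 4 / (np.sqrt(10.0) + 1.0) ** 2
beta_hb = ((np.sqrt(10.0) - 1.0) / (np.sqrt(10.0) + 1.0)) ** 2
A, B, C = heavy_ball_system(2, alpha_hb, beta_hb)
x_hb = run_algorithm(A, B, C, grad, np.concatenate([x0, x0]), iters=200)
```

Only the matrices (A, B, C) change between algorithms; the oracle is untouched, which is what makes the linear part a natural design object.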

Image denoising is classically accomplished by solving an optimization problem. The objective is the sum of three terms: a least-squares data-fidelity term, a nonsmooth total-variation regularization term that penalizes pixel fluctuation, and a constraint that all RGB pixel values lie between 0 and 255.

Left: Clean image. Center: Noisy image. Right: Denoised image by Douglas-Rachford.

Algorithms may operate in dynamic environments in which the oracles are not immediately accessible to the optimizer. Dynamical network effects include time delays, crosstalk, channel memory, switching, and noise corruption. Desired algorithmic structures can also be modeled as components of a network (algodynamics), such as the primal-dual updates arising in distributed optimization. Introducing these dynamical components can degrade or even break nominally convergent algorithms.

Delays are introduced in reading the image data. Columns: increasing delays, Rows: increasing number of iterations. Douglas-Rachford optimally tuned for the no-delay case is broken by delays.
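The same effect can be reproduced on a toy problem. The sketch below is an assumption-laden illustration rather than the image experiment: it uses plain gradient descent on a two-dimensional quadratic instead of Douglas-Rachford, and the function name `delayed_gradient_descent` is hypothetical. The stepsize 2/(m+L) is the standard optimal choice for the delay-free case.

```python
import numpy as np

def delayed_gradient_descent(Q, x0, alpha, delay, iters):
    """Run x+ = x - alpha * Q x_stale, where x_stale is `delay` steps old."""
    # history[-1] is the current iterate; history[0] is the oldest one kept.
    history = [np.array(x0, dtype=float) for _ in range(delay + 1)]
    for _ in range(iters):
        stale = history[-1 - delay]              # iterate seen by the gradient oracle
        x_next = history[-1] - alpha * (Q @ stale)
        history.append(x_next)
        history.pop(0)
    return history[-1]

Q = np.diag([1.0, 10.0])        # curvature between m = 1 and L = 10
alpha = 2.0 / (1.0 + 10.0)      # tuned for the no-delay case
x0 = np.array([1.0, 1.0])

x_no_delay = delayed_gradient_descent(Q, x0, alpha, delay=0, iters=200)
x_one_delay = delayed_gradient_descent(Q, x0, alpha, delay=1, iters=200)

print("no delay  :", np.linalg.norm(x_no_delay))
print("delay of 1:", np.linalg.norm(x_one_delay))
```

With a single step of delay, the stiff direction obeys x+ = x - (20/11) x_prev, whose characteristic roots have modulus sqrt(20/11) > 1, so the iterates grow instead of contracting.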

Our investigation targets the following questions:

When does a composite optimization algorithm converge under dynamical network effects?
What is an upper bound on the convergence rate of a given algorithm?
How can we design optimization algorithms in a principled way, without hand-tuning?

Structure

We first provide conditions for an optimization algorithm to converge, by forming a link from convergence to the control-theoretic concepts of Regulation Theory and the Internal Model Principle.

Any well-posed algorithm that converges for problems with unique optimal solutions and subgradient vectors must satisfy two fundamental properties. The first is Robust Stability: if the optimal solution and the subgradient vectors at optimality are all 0, then the state must converge to 0 regardless of the initial condition. The second is satisfaction of a Regulator Equation: consensus between oracles is asymptotically achieved even if the state converges to a nonzero quantity. We show that the Regulator Equation depends only on the network and is independent of the precise oracles being used. In the language of control, these conditions constitute a structured robust regulation result obtained using arguments from nominal regulation. The Robust Stability and Regulator Equation requirements are necessary and sufficient for convergence of linear time-invariant algorithms (e.g. gradient descent with fixed stepsizes), and are sufficient for switched systems.

Satisfaction of Robust Stability and the Regulator Equation implies that convergent algorithms must follow an Internal Model Principle: there exists a factorization of the algorithm into a Model system that depends only on the network and a Core subsystem that carries the algorithm parameters. For algorithms with no network dynamics, the Model depends only on the number of oracles and the dimension of each oracle.

Factorization of Douglas-Rachford into Core (bottom-left) and Model (bottom-right)

Analysis and Synthesis

The system-theoretic model of optimization algorithms allows for their analysis and synthesis using methods from robust control.

The oracles can be abstracted into uncertainties constrained by known properties, such as strong convexity and smoothness. Input-output sequences arising from the operators satisfy a family of Integral Quadratic Constraints/dissipation relations. These relations cover the operator sequences: a linear system that is convergent with respect to any possible uncertainty satisfying these relations is therefore convergent when interconnected with the operators.
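The simplest such relation is the sector constraint satisfied by the gradient of an m-strongly convex, L-smooth function with minimizer y*: (u - m e)'(L e - u) >= 0, where e = y - y* and u is the gradient at y. The sketch below checks it numerically on a test function chosen for illustration (an assumption, not taken from the work described here): f(y) = sum_i [ (m/2) y_i^2 + (L - m) log cosh(y_i) ], whose Hessian entries lie between m and L.

```python
import numpy as np

def sector_values(grad, y_star, m, L, samples):
    """Evaluate (u - m e)'(L e - u) at each sample, with e = y - y_star.

    Assumes grad(y_star) = 0, i.e. y_star is the unconstrained minimizer.
    """
    values = []
    for y in samples:
        e = y - y_star
        u = grad(y)
        values.append(float((u - m * e) @ (L * e - u)))
    return np.array(values)

m, L = 1.0, 10.0
# Gradient of the illustrative test function; its minimizer is the origin.
grad = lambda y: m * y + (L - m) * np.tanh(y)

rng = np.random.default_rng(0)
samples = rng.normal(scale=3.0, size=(1000, 4))
values = sector_values(grad, np.zeros(4), m, L, samples)
print("smallest sector value:", values.min())
```

Any sequence of oracle inputs and outputs produces nonnegative values here, so a linear system that is stable against every signal pair satisfying the inequality is in particular stable against this gradient.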

The analysis problem involves first checking the Regulator Equation, and then finding relations satisfied by the operator sequences that certify the smallest possible exponential convergence rate for the interconnection. The synthesis problem additionally finds a controller such that the interconnection of the controller and the network forms a convergent optimization algorithm. Synthesis is performed by alternating between searching for the controller and searching for the relations. An internal model is supplied to ensure that the Regulator Equation is satisfied for any controller. The analysis and individual synthesis tasks are all convex programs posed in terms of Linear Matrix Inequalities.
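To make the analysis step concrete, here is a deliberately small sketch of rate analysis by bisection for gradient descent with stepsize alpha. It is a simplification under stated assumptions: only the sector constraint is used (not a richer family of relations), the Lyapunov matrix is normalized to the scalar 1, and the single multiplier `lam` is found by a ternary search over an assumed bracket `[0, lam_hi]` instead of a semidefinite programming solver. All function names are hypothetical.

```python
import numpy as np

def lmi_max_eig(rho, lam, alpha, m, L):
    # Largest eigenvalue of the 2x2 analysis LMI for x+ = x - alpha * u.
    lyapunov = np.array([[1.0 - rho**2, -alpha], [-alpha, alpha**2]])
    sector = np.array([[-2.0 * m * L, m + L], [m + L, -2.0]])
    return np.linalg.eigvalsh(lyapunov + lam * sector).max()

def rate_is_certified(rho, alpha, m, L, lam_hi=10.0, steps=200, tol=1e-10):
    # The largest eigenvalue is convex in lam, so ternary search finds its minimum.
    lo, hi = 0.0, lam_hi
    for _ in range(steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if lmi_max_eig(rho, a, alpha, m, L) < lmi_max_eig(rho, b, alpha, m, L):
            hi = b
        else:
            lo = a
    return lmi_max_eig(rho, 0.5 * (lo + hi), alpha, m, L) <= tol

def best_certified_rate(alpha, m, L, tol=1e-6):
    # Bisection on rho: feasibility of the LMI is monotone in rho.
    lo, hi = 0.0, 1.0
    if not rate_is_certified(hi, alpha, m, L):
        return None  # no convergence certificate found
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rate_is_certified(mid, alpha, m, L):
            hi = mid
        else:
            lo = mid
    return hi

m, L = 1.0, 10.0
rate = best_certified_rate(2.0 / (m + L), m, L)
print("certified rate:", rate, "reference (L - m)/(L + m):", (L - m) / (L + m))
```

For realistic algorithms and networks the inner feasibility check becomes a matrix-valued semidefinite program, but the outer bisection on the rate keeps the same shape.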

Relevant Publications:

Preprints

Structure, Analysis, and Synthesis of First-Order Algorithms

Miller, Jared, Scherer, Carsten, Jakob, Fabian, and Iannelli, Andrea

Preprint 2026


Optimization algorithms can be interpreted through the lens of dynamical systems as the interconnection of linear systems and a set of subgradient nonlinearities. This dynamical systems formulation allows for the analysis and synthesis of optimization algorithms by solving robust control problems. In this work, we use the celebrated internal model principle in control theory to structurally factorize convergent composite optimization algorithms into suitable network-dependent internal models and core subcontrollers. As the key benefit, we reveal that this permits us to synthesize optimization algorithms even if information is transmitted over networks featuring dynamical phenomena such as time delays, channel memory, or crosstalk. Design of these algorithms is achieved under bisection in the exponential convergence rate either through a nonconvex local search or by alternation of convex semidefinite programs. We demonstrate factorization of existing optimization algorithms and the automated synthesis of new optimization algorithms in the networked setting.
@preprint{miller2026structureanalysissynthesisfirstorder,
  title = {Structure, Analysis, and Synthesis of First-Order Algorithms},
  author = {Miller, Jared and Scherer, Carsten and Jakob, Fabian and Iannelli, Andrea},
  year = {2026},
  arxiv = {2603.24795},
  tag = {opt},
  selected = {true},
  pdf = {structure_optimization.pdf},
  slides = {Presentation__Structure__Analysis__and_Synthesis_of_Optimization_Algorithms__Oxford.pdf},
  archiveprefix = {arXiv},
  primaryclass = {math.OC},
  code = {https://github.com/Jarmill/constrained_opt},
  url = {https://arxiv.org/abs/2603.24795},
  bibtex_show = {true}
}

Analysis and Synthesis of Switched Optimization Algorithms

Miller, Jared, Jakob, Fabian, Scherer, Carsten, and Iannelli, Andrea

Preprint 2025


Deployment of optimization algorithms on networked systems faces challenges associated with time delays and corruptions. One particular instance is the presence of time-varying delays arising from factors such as packet drops and irregular sampling. Fixed time delays can destabilize gradient descent algorithms, and this degradation is exacerbated by time-varying delays. This work concentrates on the analysis and creation of discrete-time optimization algorithms with certified exponential convergence rates that are robust against switched uncertainties between the optimizer and the gradient oracle. These optimization algorithms are implemented by switch-scheduled output-feedback controllers. Rate variation and sawtooth behavior (packet drops) in time-varying delays can be imposed by constraining the switching sequences. Analysis is accomplished by bisection in the convergence rate to find Zames-Falb filter coefficients. Synthesis is performed by alternating between a filter coefficient search for a fixed controller, and a controller search for fixed multipliers.