Monday, September 18 |
07:30 - 08:45 |
Breakfast (Restaurant at your assigned hotel) |
08:45 - 09:00 |
Introduction and Welcome (Conference Room San Felipe) |
09:00 - 09:35 |
Hedy Attouch: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \alpha \leq 3 ↓ (Based on joint work with Z. Chbani and H. Riahi)
In a Hilbert space setting \mathcal{H}, given \Phi: \mathcal H \to \mathbb R a convex continuously differentiable function, and \alpha a positive parameter, we first study the asymptotic behavior of the inertial system with Asymptotic Vanishing Damping
\mbox{(AVD)}_{\alpha} \qquad \ddot{x}(t) + \frac{\alpha}{t} \dot{x}(t) + \nabla \Phi(x(t)) = 0,
and then the associated inertial algorithms.
Depending on the value of \alpha with respect to 3, and based on new Lyapunov functions, we give a complete picture of the convergence properties as t \to + \infty of the trajectories generated by \mbox{(AVD)}_{\alpha}.
As shown by Su-Boyd-Candès, the case \alpha = 3 corresponds to a continuous version of the accelerated gradient method of Nesterov, with the convergence rate
\Phi (x(t))-\min \Phi = \mathcal O (t^{-2}).
Our main result concerns the subcritical case \alpha \leq 3, where we show that
\Phi (x(t))-\min \Phi = \mathcal O (t^{-\frac{2}{3}\alpha}).
When \alpha > 3, we recover the recent results of May and Attouch-Chbani-Peypouquet-Redont concerning the weak convergence of the trajectories, and the convergence of the values at the rate o\left(t^{-2}\right).
This overall picture shows a continuous variation of the rate of convergence of the values \Phi(x(t))-\min_\mathcal H \Phi= \mathcal O (t^{-p(\alpha)}) with respect to \alpha >0: the coefficient p(\alpha) increases linearly up to 2 when \alpha goes from 0 to 3, then displays a plateau.
We also consider the convergence of trajectories in the critical case \alpha = 3, with a positive response in some particular cases.
Then we consider structured convex minimization problems of the form \min \left\lbrace \Theta:= \Phi + \Psi \right\rbrace, with \Phi smooth and \Psi nonsmooth. As a major property, the Lyapunov analysis of the continuous dynamics serves as a guideline for the study of the associated forward-backward inertial algorithms. We obtain similar convergence rates for the sequence of iterates (x_k): for \alpha < 3 we have \Theta(x_k)-\min \Theta = \mathcal O(k^{-p}) for all p < \frac{2\alpha}{3}; for \alpha = 3, \Theta(x_k)-\min \Theta = \mathcal O(k^{-2}) (FISTA, Beck-Teboulle, 2009); and for \alpha > 3, \Theta(x_k)-\min \Theta = o(k^{-2}) (A-Peypouquet, 2016).
We conclude this study by showing that the results are robust with respect to external perturbations.
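As a minimal sketch of the inertial forward-backward scheme discussed above, the following applies it to a lasso-type toy problem with \Phi smooth quadratic and \Psi the \ell_1 norm; the extrapolation coefficient (k-1)/(k+\alpha-1) is one common discretization choice, and the problem data below are invented for illustration.

```python
import numpy as np

def inertial_forward_backward(grad_Phi, prox_Psi, x0, step, alpha, iters=500):
    # Inertial forward-backward sketch: extrapolate with coefficient
    # (k-1)/(k+alpha-1), then one gradient step on Phi and one prox step on Psi.
    x_prev, x = x0.copy(), x0.copy()
    for k in range(1, iters + 1):
        y = x + (k - 1) / (k + alpha - 1) * (x - x_prev)  # inertial extrapolation
        x_prev, x = x, prox_Psi(y - step * grad_Phi(y), step)
    return x

# Toy instance (invented data): Phi(x) = 0.5*||Ax - b||^2, Psi = lam*||.||_1.
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((40, 100)), rng.standard_normal(40), 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2          # step <= 1/L with L = ||A||_2^2
grad_Phi = lambda x: A.T @ (A @ x - b)
prox_Psi = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - s * lam, 0.0)  # soft-threshold
x_sol = inertial_forward_backward(grad_Phi, prox_Psi, np.zeros(100), step, alpha=3.0)
```

With alpha=3.0 this reduces to a FISTA-like scheme; other values of alpha probe the subcritical and supercritical regimes described above.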
- [1] H. Attouch, Z. Chbani, J. Peypouquet, P. Redont, Fast convergence of inertial dynamics and algorithms with asymptotic vanishing damping, to appear in Math. Program. DOI: 10.1007/s10107-016-0992-8.
- [2] H. Attouch, J. Peypouquet, The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than \frac{1}{k^2}, SIAM J. Optim., 26 (2016), No. 3, pp. 1824-1834.
- [3] A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), No. 1, pp. 183-202.
- [4] A. Chambolle, C. Dossal, On the convergence of the iterates of the “Fast Iterative Shrinkage/Thresholding Algorithm”, Journal of Optimization Theory and Applications, 166 (2015), pp. 968-982.
- [5] Y. Nesterov, Introductory lectures on convex optimization: A basic course, Applied Optimization, vol. 87, Kluwer Academic Publishers, Boston, MA, 2004.
(Conference Room San Felipe) |
09:35 - 10:10 |
James Burke: Iteratively re-weighted least squares and ADMM methods for solving affine inclusions ↓ We describe two matrix-free methods for solving
large-scale affine inclusion problems on the product (or intersection) of convex sets.
The first approach is a novel iterative re-weighting algorithm (IRWA) that iteratively minimizes quadratic models of relaxed subproblems while automatically updating a relaxation vector. The second approach is based on alternating direction augmented Lagrangian (ADAL) technology. The main computational costs of each algorithm are the repeated minimizations of convex quadratic functions which can be performed matrix-free. Both algorithms are globally convergent under loose assumptions, and each requires at most O(1/\varepsilon^2) iterations to reach \varepsilon-optimality of the objective function.
Numerical experiments show that both algorithms efficiently find inexact solutions.
However, in certain cases, these experiments indicate that IRWA can be significantly more efficient than ADAL. (Conference Room San Felipe) |
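IRWA itself is tied to the authors' relaxation framework, so rather than guess at its details, here is a minimal sketch of the classical re-weighting idea it builds on, applied to \ell_1 regression; the function irls_l1 and its smoothing parameter eps are our own illustrative choices, not the talk's algorithm.

```python
import numpy as np

def irls_l1(A, b, iters=50, eps=1e-6):
    # Generic iteratively re-weighted least squares for min ||Ax - b||_1:
    # each pass solves a weighted least-squares problem whose weights
    # down-weight rows with large residuals (eps avoids division by zero).
    x = np.linalg.lstsq(A, b, rcond=None)[0]          # plain LS as a start
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(A @ x - b), eps)  # re-weighting step
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x
```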
10:10 - 10:45 |
Aris Daniilidis: On the Glaeser-Whitney extension problem ↓ We give a simple proof and explicit formulae for the C^{1,1}-convex extension problem
studied by D. Azagra and C. Mudarra. As an application, we obtain an easy constructive proof
for the Glaeser-Whitney problem of C^{1,1} extensions on a Hilbert space.
This is a joint work with M. Haddou, O. Ley and E. Le Gruyer. (Conference Room San Felipe) |
10:45 - 11:15 |
Coffee Break (Conference Room San Felipe) |
11:15 - 11:50 |
Yura Malitsky: Golden Ratio Algorithms for Variational Inequalities ↓ We present several novel methods for solving general (pseudo-)monotone variational inequalities. The first method uses a fixed stepsize and is similar to the proximal reflected gradient method: it also requires only one operator evaluation and one prox-operator per iteration. However, its dynamic extension has a notable distinction: in every iteration it chooses a stepsize based on local information about the operator, without running any linesearch procedure. Thus, the iteration cost of this method is almost the same as that of the fixed-stepsize version, but it converges without the Lipschitz assumption on the operator. We further discuss possible generalizations of the methods, in particular for solving large-scale nonlinear saddle point problems. Some numerical experiments are reported. (Conference Room San Felipe) |
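A minimal sketch of a fixed-stepsize golden-ratio iteration of the kind described above, applied to an invented monotone affine variational inequality over the unit ball; the stepsize rule lam = \varphi/(2L) and all test data are assumptions for this illustration.

```python
import numpy as np

PHI = (1 + np.sqrt(5)) / 2  # golden ratio

def graal(F, proj, x0, lam, iters=1000):
    # Fixed-stepsize sketch: one operator value and one projection
    # (the prox of an indicator) per iteration; x_bar averages with ratio PHI.
    x, x_bar = x0.copy(), x0.copy()
    for _ in range(iters):
        x_bar = ((PHI - 1) * x + x_bar) / PHI   # golden-ratio averaging
        x = proj(x_bar - lam * F(x))            # forward step + projection
    return x

# Toy monotone VI (invented data): F(x) = Mx + q over the unit ball.
rng = np.random.default_rng(1)
S = rng.standard_normal((20, 20))
M = S - S.T + 0.1 * np.eye(20)                  # skew part + small identity: monotone
q = rng.standard_normal(20)
F = lambda x: M @ x + q
proj = lambda v: v / max(1.0, np.linalg.norm(v))  # projection onto the unit ball
x_sol = graal(F, proj, np.zeros(20), lam=PHI / (2 * np.linalg.norm(M, 2)))
```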
11:50 - 12:25 |
Robert Csetnek: ADMM for monotone operators: convergence analysis and rates ↓ We propose a unifying scheme for several algorithms from the
literature for solving monotone inclusion problems
involving compositions with linear continuous operators in infinite
dimensional Hilbert spaces. We show that a number of primal-dual
algorithms for monotone inclusions and also the classical ADMM numerical
scheme for convex optimization problems, along with some of its
variants, can be embedded in this unifying scheme. While in the first
part of the talk convergence results for the iterates are reported, the
second part is devoted to the derivation of convergence rates obtained
by combining variable metric techniques with strategies based on a
suitable choice of dynamical step sizes. The talk is based on a joint
work with Radu Bot. (Conference Room San Felipe) |
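As a concrete reminder of the classical ADMM scheme that the unifying analysis above embeds, here is a minimal scaled-form sketch for a lasso toy problem; the updates are the standard ones, while the data, rho, and iteration count are invented for illustration.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=200):
    # Classical scaled-form ADMM for min 0.5*||Ax - b||^2 + lam*||z||_1
    # subject to x - z = 0: linear-solve x-update, soft-threshold z-update,
    # and dual ascent u += x - z.
    n = A.shape[1]
    x = z = u = np.zeros(n)
    Q = np.linalg.inv(A.T @ A + rho * np.eye(n))  # cache the x-update solve
    Atb = A.T @ b
    for _ in range(iters):
        x = Q @ (Atb + rho * (z - u))                                  # x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # z-update
        u = u + x - z                                                  # dual update
    return z
```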
12:25 - 13:00 |
Scott Lindstrom: Douglas-Rachford Method for Non-Convex Feasibility Problems ↓ The Douglas-Rachford method has been employed successfully to solve a variety of non-convex feasibility problems. In particular, it shows surprising stability when applied to finding the intersections of hypersurfaces. We prove local convergence in a generalization of a case prototypical of the phase retrieval problem. In so doing, we also discover phenomena which may inhibit convergence. Finally, we illustrate an application to solving boundary value problems for ordinary differential equations.
This talk includes discoveries from three closely related works:
1. With Brailey Sims, Matthew Skerritt. ''Computing Intersections of Implicitly Specified Plane Curves.'' To appear in Journal of Nonlinear and Convex Analysis.
2. With Jonathan M. Borwein, Brailey Sims, Anna Schneider, Matthew Skerritt. ''Dynamics of the Douglas-Rachford Method for Ellipses and p-Spheres.'' Submitted to Set-Valued and Variational Analysis.
3. With Bishnu Lamichhane and Brailey Sims. ''Application of Projection Algorithms to Differential Equations: Boundary Value Problems,'' in preparation with plans to submit to ANZIAM Journal. (Conference Room San Felipe) |
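A minimal sketch of the Douglas-Rachford iteration for a two-set feasibility problem, run on an invented 2-D caricature of phase retrieval (a circle intersected with a line); the projections and starting point below are illustrative assumptions.

```python
import numpy as np

def douglas_rachford(PA, PB, x0, iters=100):
    # DR iteration x <- x + PB(2*PA(x) - x) - PA(x) for finding a point
    # in A ∩ B; the shadow sequence PA(x_k) is the natural candidate solution.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        a = PA(x)
        x = x + PB(2 * a - x) - a
    return PA(x)

# Invented 2-D instance: unit circle A (non-convex), line B: y = 0.5.
PA = lambda x: x / np.linalg.norm(x)   # projection onto the circle (x != 0)
PB = lambda x: np.array([x[0], 0.5])   # projection onto the line
print(douglas_rachford(PA, PB, np.array([1.0, 1.0])))
```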
13:20 - 13:30 |
Group Photo (Hotel Hacienda Los Laureles) |
13:30 - 15:00 |
Lunch (Restaurant Hotel Hacienda Los Laureles) |
15:00 - 15:35 |
Asen Dontchev: The Inverse Function Theorems of Lawrence Graves ↓ The classical inverse/implicit function theorems revolve around solving an
equation in terms of a parameter and tell us
when the solution mapping associated with this equation
is a (differentiable) function.
Already in 1927, Hildebrandt and Graves observed that one can put differentiability aside
and still obtain that the solution mapping is Lipschitz continuous.
The idea has evolved in subsequent
extensions, the best known of which are various reincarnations of the
Lyusternik-Graves theorem. In the last several decades
it has been widely accepted that in order to derive estimates for the solution mapping
and put them in use for proving convergence of algorithms,
it is sufficient to differentiate what you can and leave the rest as is, hoping that the resulting problem
is easier to handle. More sophisticated results have
been obtained by employing various forms of metric regularity
of mappings acting in metric spaces, aiming at
applications to numerical analysis. (Conference Room San Felipe) |
15:35 - 16:10 |
Elena Resmerita: Reconstruction of positive solutions for ill-posed problems ↓ This talk reviews several classes of methods for reconstructing positive solutions of inverse problems mostly by means of the Shannon entropy. While the literature on the finite dimensional setting is very rich, we focus on methods for problems in infinite dimensional Banach spaces and point out challenges raised in this context. (Conference Room San Felipe) |
16:10 - 16:30 |
Coffee Break (Conference Room San Felipe) |
16:30 - 17:05 |
Anthony Man-Cho So: A Unified Approach to Error Bounds for Structured Convex Optimization Problems ↓ In this talk, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates fairly general constrained minimization problems, as well as various regularized loss minimization formulations in machine learning, signal processing, and statistics. Our framework not only allows us to recover a number of existing error bound results in a unified and transparent manner but also leads to new error bounds for problems whose objective functions do not necessarily have a polyhedral epigraph. As an illustration, we show that a class of nuclear-norm regularized loss minimization problems admits an error bound under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. If time permits, we discuss the algorithmic implications of our results, particularly on the convergence rate analysis of a family of successive quadratic approximation methods for solving the aforementioned class of problems. (Conference Room San Felipe) |
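As a hedged illustration of the kind of bound involved (our generic rendering, not necessarily the exact formulation of the talk): for an objective f + g with f smooth convex and g closed proper convex, a Luo-Tseng-type error bound asserts that \text{dist}(x, \mathcal{X}^*) \leq \kappa \, \|x - \text{prox}_{g}(x - \nabla f(x))\| for all x in a suitable test set, where \mathcal{X}^* is the solution set and the quantity on the right is the proximal residual.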
17:05 - 17:40 |
Panos Patrinos: Accelerated Douglas-Rachford splitting and ADMM for structured nonconvex optimization (joint work with Andreas Themelis, Lorenzo Stella) ↓ Although originally designed and analyzed for solving convex problems, the Alternating Direction Method of Multipliers (ADMM) and its close relatives, Douglas-Rachford Splitting (DRS) and Peaceman-Rachford Splitting (PRS), have been observed to perform remarkably well when applied to certain classes of structured nonconvex optimization problems. However, partial global convergence results in the nonconvex setting have only recently emerged. The goal of this work is two-fold: First, we show how the Douglas-Rachford Envelope (DRE), introduced by the authors in 2014, can be employed to devise global convergence guarantees for ADMM, DRS, PRS applied to nonconvex problems under less restrictive conditions, larger prox stepsizes and over-relaxation parameters than what was previously known. Furthermore, exploiting properties of the DRE, we propose a new globally convergent algorithmic scheme that greatly accelerates convergence. For instance, the new scheme can accommodate quasi-Newton directions such as (L-)BFGS. In fact, under mild assumptions, the algorithm is shown to converge superlinearly by performing exactly the same kind of computations as in ADMM, DRS, PRS. (Conference Room San Felipe) |
17:40 - 18:15 |
Genaro Lopez: Modulus of regularity and rate of convergence for a Fejér monotone sequence ↓ Many problems in applied mathematics can be brought into the following format:
Let (X,d) be a metric space and F:X\to \mathbb{R} be a function: find a zero z\in X of F.
This statement covers many equilibrium,
fixed point, and minimization problems.
Numerical methods, e.g. based on suitable iterative techniques,
usually yield sequences (x_n) in X of approximate zeros, i.e.
|F(x_n)|< 1/n. Based on extra assumptions (e.g. the compactness of
X, the Fejér monotonicity of (x_n) and the continuity of F)
one then shows that (x_n)
converges to an actual zero z of F.
An obvious question then concerns the speed
of the convergence of (x_n) towards z and whether there is an effective
rate of convergence.
Even though sometimes left implicit, the effectivity of iterative
procedures in the case of unique zeros rests on the existence of an effective so-called modulus of uniqueness, see [Kohlenbach].
In this talk, we are concerned with a generalization of the concept of
"modulus of uniqueness", called "modulus of regularity" which is applicable also in the non-unique case.
While the concept of a modulus of regularity
has been used in various special situations before (see e.g. [Anderson] and the literature cited there),
we develop it here
as a general tool towards a unified treatment of a number of concepts studied in convex optimization such as
metric subregularity, Hölder regularity, weak sharp minima, etc., which can be seen as instances of the concept
of regularity w.r.t. \text{zer}\;F for suitable choices of F.
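One plausible formalization, on a suitable set B \subseteq X: a function \phi:(0,\infty)\to(0,\infty) is a modulus of regularity for F w.r.t. \text{zer}\;F on B if, for all \varepsilon>0 and all x\in B, |F(x)| < \phi(\varepsilon) implies \text{dist}(x, \text{zer}\;F) < \varepsilon.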
This talk is based on joint work, still in progress, with A. Nicolae and U. Kohlenbach.
[Anderson] R.M. Anderson, "Almost" implies "Near", Trans. Amer. Math. Soc., 296 (1986), pp. 229-237.
[Kohlenbach] U. Kohlenbach, Effective moduli from ineffective uniqueness proofs. An unwinding of de La Vallée Poussin's proof for Chebycheff approximation, Ann. Pure Appl. Logic, 64 (1993), pp. 27-94. (Conference Room San Felipe) |
19:00 - 21:00 |
Dinner (Restaurant Hotel Hacienda Los Laureles) |