Convex Optimization: Alternating Direction Method of Multipliers (ADMM)


Original post: http://blog.csdn.net/shanglianlm/article/details/46808793

I have recently become interested in the Alternating Direction Method of Multipliers (ADMM) from convex optimization, and I plan to write a series of posts about it.

Convex Optimization: ADMM (Alternating Direction Method of Multipliers) Series, Part 3: ADMM

3- Alternating Direction Method of Multipliers

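As described in the previous posts, ADMM is an algorithm intended to blend the decomposability of dual ascent with the superior convergence properties of the method of multipliers.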

3-1 Algorithm

Consider the following optimization problem:

$$\min \; f(x) + g(z) \quad \text{s.t.} \quad Ax + Bz = c \tag{3.1}$$

As in the method of multipliers, we form the augmented Lagrangian:

$$L_\rho(x, z, \lambda) = f(x) + g(z) + \lambda^T(Ax + Bz - c) + \frac{\rho}{2}\|Ax + Bz - c\|_2^2$$

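The ADMM iterations are then:

$$x^{k+1} = \operatorname*{argmin}_x L_\rho(x, z^k, \lambda^k) \tag{3.2}$$

$$z^{k+1} = \operatorname*{argmin}_z L_\rho(x^{k+1}, z, \lambda^k) \tag{3.3}$$

$$\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} + Bz^{k+1} - c) \tag{3.4}$$

where $\rho > 0$ is the augmented Lagrangian parameter.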

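For comparison, the method of multipliers applied to (3.1) takes the form

$$(x^{k+1}, z^{k+1}) = \operatorname*{argmin}_{x, z} L_\rho(x, z, \lambda^k)$$

$$\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} + Bz^{k+1} - c)$$

which requires minimizing jointly over the two primal variables. ADMM instead updates x and z one after the other, which is what allows the problem to be decomposed when f or g is separable.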
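
As a minimal sketch of iterations (3.2)-(3.4), consider a toy instance that is not from the original post: f(x) = (1/2)||x - a||^2 and g(z) = (1/2)||z - b||^2 with A = I, B = -I, c = 0, so the constraint is x = z and the solution is (a + b)/2. Both subproblems then have closed-form minimizers. The function name `admm_consensus` and the vectors `a` and `b` are illustrative assumptions.

```python
import numpy as np

def admm_consensus(a, b, rho=1.0, iters=100):
    """Unscaled ADMM for min 0.5*||x-a||^2 + 0.5*||z-b||^2 s.t. x - z = 0.

    Toy problem chosen for illustration (A = I, B = -I, c = 0).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(a)
    z = np.zeros_like(a)
    lam = np.zeros_like(a)  # dual variable lambda
    for _ in range(iters):
        # (3.2): minimize L_rho over x with z and lambda fixed.
        # Stationarity: (x - a) + lam + rho*(x - z) = 0.
        x = (a - lam + rho * z) / (1.0 + rho)
        # (3.3): minimize L_rho over z using the new x.
        # Stationarity: (z - b) - lam - rho*(x - z) = 0.
        z = (b + lam + rho * x) / (1.0 + rho)
        # (3.4): dual update driven by the residual x - z.
        lam = lam + rho * (x - z)
    return x, z, lam

x, z, lam = admm_consensus([1.0, 4.0], [3.0, 0.0])
print(x, z)  # both should be close to (a + b) / 2 = [2, 2]
```

The z-update uses the freshly computed x rather than the previous one; that ordering is what distinguishes ADMM from a joint or fully parallel update.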

3-1-1 Scaled Form

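Define the residual $r = Ax + Bz - c$. Combining the linear and quadratic terms of the augmented Lagrangian gives

$$\lambda^T r + \frac{\rho}{2}\|r\|_2^2 = \frac{\rho}{2}\left\|r + \tfrac{1}{\rho}\lambda\right\|_2^2 - \frac{1}{2\rho}\|\lambda\|_2^2 = \frac{\rho}{2}\|r + u\|_2^2 - \frac{\rho}{2}\|u\|_2^2$$

where $u = (1/\rho)\lambda$ is the scaled dual variable. ADMM can therefore be written as

$$x^{k+1} = \operatorname*{argmin}_x \left( f(x) + \frac{\rho}{2}\|Ax + Bz^k - c + u^k\|_2^2 \right)$$

$$z^{k+1} = \operatorname*{argmin}_z \left( g(z) + \frac{\rho}{2}\|Ax^{k+1} + Bz - c + u^k\|_2^2 \right)$$

$$u^{k+1} = u^k + Ax^{k+1} + Bz^{k+1} - c$$

Defining the residual at iteration $k$ as $r^k = Ax^k + Bz^k - c$, we have

$$u^k = u^0 + \sum_{j=1}^{k} r^j$$

so the scaled dual variable is the running sum of the residuals.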
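
The scaled form, in which the dual variable is u = λ/ρ and is updated by adding the residual, can be sketched on the lasso problem. This is a standard example that is not part of the original post: f(x) = (1/2)||Dx - b||^2 and g(z) = gamma*||z||_1 with A = I, B = -I, c = 0. The data matrix is called `D` here to avoid clashing with the constraint matrix A in (3.1); `admm_lasso`, `soft_threshold`, and `gamma` are illustrative names.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise minimizer of kappa*|z| + 0.5*(z - v)^2."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(D, b, gamma, rho=1.0, iters=500):
    """Scaled-form ADMM sketch for
    min 0.5*||D x - b||^2 + gamma*||z||_1  s.t.  x - z = 0."""
    n = D.shape[1]
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable u = lambda / rho
    M = D.T @ D + rho * np.eye(n)  # x-update matrix, fixed across iterations
    Dtb = D.T @ b
    for _ in range(iters):
        # x-update: minimize 0.5*||D x - b||^2 + (rho/2)*||x - z + u||^2.
        x = np.linalg.solve(M, Dtb + rho * (z - u))
        # z-update: minimize gamma*||z||_1 + (rho/2)*||x - z + u||^2.
        z = soft_threshold(x + u, gamma / rho)
        # u-update: add the current residual r = x - z.
        u = u + x - z
    return x, z, u

# With D = I the lasso solution is soft_threshold(b, gamma), a quick sanity check.
b = np.array([3.0, 0.5, -2.0])
x, z, u = admm_lasso(np.eye(3), b, gamma=1.0)
print(z)
```

Solving the linear system from scratch in every iteration keeps the sketch short; since `M` never changes, caching a factorization of it would be the cheaper choice for larger problems.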

3-2 Convergence

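Assume that f and g are closed, proper, and convex, and that the unaugmented Lagrangian $L_0$ has a saddle point. Then the ADMM iterates satisfy:

Residual convergence: $r^k \to 0$ as $k \to \infty$, so the iterates approach feasibility;
Objective convergence: $f(x^k) + g(z^k) \to p^\star$, the optimal value;
Dual variable convergence: $\lambda^k \to \lambda^\star$, a dual optimal point.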

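In practice, converging to high accuracy can require many iterations;
but a few tens of iterations are often enough to reach modest accuracy (similar to the conjugate gradient method);
ADMM can be combined with other algorithms to produce a high-accuracy solution.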

3-3 Optimality Conditions and Stopping Criterion

3-3-1 Optimality Conditions

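The necessary and sufficient optimality conditions for the ADMM problem (3.1) are primal feasibility

$$Ax^\star + Bz^\star - c = 0$$

and dual feasibility

$$0 \in \partial f(x^\star) + A^T\lambda^\star, \qquad 0 \in \partial g(z^\star) + B^T\lambda^\star$$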
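
A short derivation, sketched here under the assumption that f and g are closed, proper, and convex, shows how the ADMM iterates relate to the dual feasibility conditions $0 \in \partial f(x^\star) + A^T\lambda^\star$ and $0 \in \partial g(z^\star) + B^T\lambda^\star$, and where the dual residual used in the stopping criterion comes from. Because $z^{k+1}$ minimizes $L_\rho(x^{k+1}, z, \lambda^k)$,

$$0 \in \partial g(z^{k+1}) + B^T\lambda^k + \rho B^T(Ax^{k+1} + Bz^{k+1} - c) = \partial g(z^{k+1}) + B^T\lambda^{k+1}$$

so the condition on g holds at every iteration. Because $x^{k+1}$ minimizes $L_\rho(x, z^k, \lambda^k)$,

$$0 \in \partial f(x^{k+1}) + A^T\lambda^k + \rho A^T(Ax^{k+1} + Bz^k - c) = \partial f(x^{k+1}) + A^T\lambda^{k+1} + \rho A^T B(z^k - z^{k+1})$$

or equivalently

$$\rho A^T B(z^{k+1} - z^k) \in \partial f(x^{k+1}) + A^T\lambda^{k+1}$$

The left-hand side is the dual residual $s^{k+1}$: it measures how far the iterates are from satisfying the condition on f, just as $r^{k+1}$ measures the violation of primal feasibility.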

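3-3-2 Stopping Criterion

The iteration stops when the norms of both residuals are small:

Primal residual: $r^{k+1} = Ax^{k+1} + Bz^{k+1} - c$, with $\|r^{k+1}\|_2 \le \epsilon^{\text{primal}}$
Dual residual: $s^{k+1} = \rho A^T B(z^{k+1} - z^k)$, with $\|s^{k+1}\|_2 \le \epsilon^{\text{dual}}$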
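
A minimal sketch of this check in NumPy, assuming dense arrays and tolerances supplied by the caller. The function names `admm_residuals` and `should_stop` and their argument names are illustrative, not from the original post.

```python
import numpy as np

def admm_residuals(A, B, c, x, z, z_prev, rho):
    """Return the 2-norms of the primal residual r and the dual residual s."""
    r = A @ x + B @ z - c               # primal residual r^{k+1}
    s = rho * (A.T @ (B @ (z - z_prev)))  # dual residual s^{k+1}
    return np.linalg.norm(r), np.linalg.norm(s)

def should_stop(r_norm, s_norm, eps_primal, eps_dual):
    """Stop only when both residual norms are within tolerance."""
    return bool(r_norm <= eps_primal and s_norm <= eps_dual)

# Example with the constraint x - z = 0 (A = I, B = -I, c = 0).
A, B, c = np.eye(2), -np.eye(2), np.zeros(2)
x = np.array([1.0, 2.0])
z = np.array([1.0, 2.0])
z_prev = np.array([1.0, 1.0])
r_norm, s_norm = admm_residuals(A, B, c, x, z, z_prev, rho=2.0)
print(r_norm, s_norm, should_stop(r_norm, s_norm, 1e-4, 1e-4))
```

In this example x and z agree, so the primal residual is zero, but z moved since the last iteration, so the dual residual is not and the check fails. Both conditions are needed: a feasible iterate is not necessarily optimal.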

3-4 Extensions and Variations

3-4-1 Varying Penalty Parameter

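To make performance less dependent on the initial choice of the penalty parameter, a different $\rho^k$ can be used at each iteration. A simple scheme [96, 169] is

$$\rho^{k+1} = \begin{cases} \tau^{\text{incr}}\rho^k & \text{if } \|r^k\|_2 > \mu\|s^k\|_2 \\ \rho^k/\tau^{\text{decr}} & \text{if } \|s^k\|_2 > \mu\|r^k\|_2 \\ \rho^k & \text{otherwise} \end{cases}$$

where $\mu > 1$, $\tau^{\text{incr}} > 1$, and $\tau^{\text{decr}} > 1$ are parameters; typical choices are $\mu = 10$ and $\tau^{\text{incr}} = \tau^{\text{decr}} = 2$. The idea is to keep the primal and dual residual norms within a factor of $\mu$ of each other.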

[96] B. S. He, H. Yang, and S. L. Wang, “Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities,” Journal of Optimization Theory and Applications, vol. 106, no. 2, pp. 337–356, 2000.
[169] S. L. Wang and L. Z. Liao, “Decomposition method with a variable parameter for a class of monotone variational inequality problems,” Journal of Optimization Theory and Applications, vol. 109, no. 2, pp. 415–429, 2001.
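
One widely used rule of this kind, described in [1] with reference to [96, 169], increases ρ when the primal residual norm is much larger than the dual residual norm and decreases it in the opposite case. The sketch below assumes that rule. The function names and the defaults `mu=10` and `tau=2` are illustrative choices, not values given in the original post.

```python
def update_rho(rho, r_norm, s_norm, mu=10.0, tau_incr=2.0, tau_decr=2.0):
    """Residual-balancing update for the penalty parameter.

    A larger rho penalizes constraint violation more and tends to shrink the
    primal residual; a smaller rho tends to shrink the dual residual.
    """
    if r_norm > mu * s_norm:
        return rho * tau_incr
    if s_norm > mu * r_norm:
        return rho / tau_decr
    return rho

def rescale_scaled_dual(u, rho_old, rho_new):
    """Keep lambda = rho * u unchanged when rho changes (scaled form only)."""
    return u * (rho_old / rho_new)

print(update_rho(1.0, r_norm=100.0, s_norm=1.0))  # primal residual dominates
print(update_rho(1.0, r_norm=1.0, s_norm=100.0))  # dual residual dominates
```

When the scaled form is used, the scaled dual variable u = λ/ρ has to be rescaled after every change of ρ, which is what the second helper does; in the unscaled form λ is left as it is.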

3-4-2 More General Augmenting Terms

Replace the quadratic term $(\rho/2)\|r\|_2^2$ with $(1/2)r^T P r$, where $P$ is a symmetric positive definite matrix.

3-4-3 Over-relaxation

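In the z- and λ-updates, the quantity $Ax^{k+1}$ is replaced with

$$\alpha^k Ax^{k+1} - (1 - \alpha^k)(Bz^k - c)$$

where $\alpha^k \in (0, 2)$ is a relaxation parameter; $\alpha^k > 1$ is called over-relaxation and $\alpha^k < 1$ under-relaxation. This scheme is analyzed in [63], and experiments in [59, 64] suggest that over-relaxation with $\alpha^k \in [1.5, 1.8]$ can improve convergence.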

[63] J. Eckstein and D. P. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Mathematical Programming, vol. 55, pp. 293–318, 1992.
[64] J. Eckstein and M. C. Ferris, “Operator-splitting methods for monotone affine variational inequalities, with a parallel application to optimal control,” INFORMS Journal on Computing, vol. 10, pp. 218–235, 1998.
[59] J. Eckstein, “Parallel alternating direction multiplier decomposition of convex programs,” Journal of Optimization Theory and Applications, vol. 80, no. 1, pp. 39–62, 1994.
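
Over-relaxation replaces Ax^{k+1} in the z- and dual updates with alpha*Ax^{k+1} - (1 - alpha)*(Bz^k - c) for a relaxation parameter alpha in (0, 2). A minimal sketch on a toy problem that is not from the original post: f(x) = (1/2)||x - a||^2 and g(z) = (1/2)||z - b||^2 with A = I, B = -I, c = 0, written in scaled form with u = λ/ρ. The function name and the default `alpha=1.6` are illustrative assumptions.

```python
import numpy as np

def admm_consensus_relaxed(a, b, rho=1.0, alpha=1.6, iters=100):
    """Scaled ADMM with relaxation for
    min 0.5*||x-a||^2 + 0.5*||z-b||^2  s.t.  x - z = 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(a)
    z = np.zeros_like(a)
    u = np.zeros_like(a)  # scaled dual variable u = lambda / rho
    for _ in range(iters):
        # x-update: minimize 0.5*||x - a||^2 + (rho/2)*||x - z + u||^2.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Relaxation step. With A = I, B = -I, c = 0:
        # alpha*A x - (1 - alpha)*(B z - c) = alpha*x + (1 - alpha)*z.
        x_hat = alpha * x + (1.0 - alpha) * z
        # z-update and u-update use x_hat in place of x.
        z = (b + rho * (x_hat + u)) / (1.0 + rho)
        u = u + x_hat - z
    return x, z, u

x, z, u = admm_consensus_relaxed([1.0, 4.0], [3.0, 0.0])
print(x, z)  # both should be close to (a + b) / 2 = [2, 2]
```

Setting `alpha=1.0` makes `x_hat` equal to `x` and recovers the standard iteration, so the relaxed variant can be compared against the plain one by changing a single argument.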

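3-4-4 Inexact Minimization

ADMM still converges when the x- and z-minimization steps are carried out only approximately, provided the suboptimality of these inexact steps satisfies an appropriate condition, such as being summable.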

[63] J. Eckstein and D. P. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Mathematical Programming, vol. 55, pp. 293–318, 1992.
[89] E. G. Gol’shtein and N. V. Tret’yakov, “Modified Lagrangians in convex programming and their generalizations,” Point-to-Set Maps and Mathematical Programming, pp. 86–97, 1979.

3-4-5 Update Ordering

The x-, z-, and λ-update steps can be performed in a different order or multiple times.

[146] A. Ruszczyński, “An augmented Lagrangian decomposition method for block diagonal linear programming problems,” Operations Research Letters, vol. 8, no. 5, pp. 287–294, 1989.

References and further reading:
[1] Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
[2] Convex optimization lecture notes
[3] A Note on the Alternating Direction Method of Multipliers

0 0