An Introduction to Modern Methods for Large Sparse Matrix Computation


--  

    There are two earlier posts: one on matrix iteration and one on sparse matrices, written as a beginner's notes. Recently I have come back to chew on this material again, still following two books:

    Iterative Methods for Sparse Linear Systems, Y. Saad.

    Lectures on Computational Numerical Analysis of Partial Differential Equations, J. M. McDonough, University of Kentucky.

    The first was the main recommendation of the instructor in an HPC course; the author of the second has written several excellent sets of lecture notes, e.g. on CFD and on turbulence.

    

1. Classical Iterative Methods

*    Jacobi, Gauss-Seidel, SOR family:

     - to solve Au=b,  decompose A = D - L - U

D: diagonal part of A; L and U: the strictly lower and strictly upper triangular parts (so that A = D - L - U)

        u^(n+1) = G u^n + k

     - for convergence, compare with the fixed-point iteration form: the error e^n = u^n - u satisfies e^(n+1) = G e^n, so by the contraction mapping principle the iteration converges exactly when the spectral radius of G is less than 1,

        namely  rho(G) < 1     

  These splittings are most compactly written through the splitting (preconditioning) matrix M, with iteration matrix G = I - M^(-1) A:

  Jacobi   M_ja  = D

  G-S      M_gs  = D - L

  SOR      M_sor = (D - wL)/w,  w is the relaxation parameter

  SSOR     M_ssor = (D - wL) D^(-1) (D - wU) / (w(2-w))
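  A minimal dense-NumPy sketch of these splittings (matrix sizes, w, and tolerances are illustrative only; a real sparse solver would never form M or M^(-1) explicitly). The update u^(n+1) = u^n + M^(-1)(b - A u^n) is exactly u^(n+1) = G u^n + k with G = I - M^(-1) A:

```python
import numpy as np

def splitting_matrix(A, method, w=1.5):
    """Return the splitting matrix M for the decomposition A = D - L - U."""
    D = np.diag(np.diag(A))
    L = -np.tril(A, -1)          # strictly lower triangular part
    U = -np.triu(A, 1)           # strictly upper triangular part
    if method == "jacobi":
        return D
    if method == "gs":
        return D - L
    if method == "sor":
        return (D - w * L) / w
    if method == "ssor":
        return (D - w * L) @ np.linalg.inv(D) @ (D - w * U) / (w * (2 - w))
    raise ValueError(method)

def iterate(A, b, method, w=1.5, tol=1e-8, maxit=2000):
    M = splitting_matrix(A, method, w)
    u = np.zeros_like(b)
    for n in range(maxit):
        r = b - A @ u
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return u, n
        u = u + np.linalg.solve(M, r)        # u^(n+1) = u^n + M^(-1) r^n
    return u, maxit

# 1D Poisson test matrix: tridiag(-1, 2, -1)
N = 20
A = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
b = np.ones(N)
for m in ("jacobi", "gs", "sor", "ssor"):
    G = np.eye(N) - np.linalg.solve(splitting_matrix(A, m), A)   # G = I - M^(-1) A
    u, its = iterate(A, b, m)
    print(f"{m:6s}  rho(G) = {np.abs(np.linalg.eigvals(G)).max():.4f}   iterations = {its}")
```

  The printed spectral radii sit just below 1 for this matrix, which is why the iteration counts are so large: this is the slow convergence discussed in the summary below.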

   

 *    ADI

      decompose A = H + V + S

H: x-direction term; V: y/z-direction term; S: zeroth-order (non-derivative) term

      (rI + H1) u* = (rI - V1) u^n + b;

      (rI + V1) u^(n+1) = (rI - H1) u* + b;

      where typically H1 = H + S/2 and V1 = V + S/2 (the zeroth-order term split between the two sweeps), and r > 0 is the ADI acceleration parameter.
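  A minimal Peaceman-Rachford ADI sketch for a 2D Poisson matrix A = H + V (so S = 0 and H1 = H, V1 = V here); the grid size n, the single-parameter choice of r, and the sweep count are illustrative assumptions, not part of the text:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 32
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc")   # 1D second difference
I1 = sp.identity(n, format="csc")
H = sp.kron(I1, T, format="csc")      # x-direction term
V = sp.kron(T, I1, format="csc")      # y-direction term
A = H + V
b = np.ones(n * n)

r = 2.0 * np.sin(np.pi / (n + 1))     # one reasonable single-parameter choice
I2 = sp.identity(n * n, format="csc")
solve_H = spla.factorized((r * I2 + H).tocsc())   # pre-factor the two half-sweep operators
solve_V = spla.factorized((r * I2 + V).tocsc())

u = np.zeros(n * n)
for k in range(200):
    u_star = solve_H((r * I2 - V) @ u + b)        # (rI + H) u*      = (rI - V) u^n + b
    u = solve_V((r * I2 - H) @ u_star + b)        # (rI + V) u^(n+1) = (rI - H) u* + b
    if np.linalg.norm(b - A @ u) < 1e-8 * np.linalg.norm(b):
        print("ADI converged after", k + 1, "double sweeps")
        break
```

  Production ADI codes cycle through a sequence of r values rather than using a single fixed parameter.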

 

*  ILU 

    in this case A is sparse (e.g. banded), and an exact LU factorization would be too expensive because of fill-in; instead, find a nearby sparse matrix M that admits a cheap (incomplete) LU factorization

    M = A + E ,  E is the error matrix

Summary: classical iterative methods such as Jacobi, GS, SOR, ADI and ILU converge slowly. They are therefore mostly used as preconditioners for the modern iterative algorithms below, and only rarely on their own.

 The purpose of a preconditioner is to bring the condition number of the preconditioned matrix (the ratio of its largest to smallest singular value) as close to 1 as possible; the closer to 1, the better. The optimal preconditioner would therefore turn A into the identity matrix.
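A minimal sketch of exactly this use of a classical method as a preconditioner: SciPy's spilu computes an incomplete LU factorization M = LU ≈ A (i.e. M = A + E), whose inverse action is then handed to GMRES. The test matrix, drop_tol and fill_factor values are illustrative assumptions:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 32
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsc()   # 2D Poisson matrix
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)          # M = A + E, E = dropped fill-in
M = spla.LinearOperator(A.shape, matvec=ilu.solve)          # the action of M^(-1)

counts = {"no preconditioner": 0, "ILU preconditioner": 0}
def counter(key):
    def cb(pr_norm):                 # called once per inner GMRES iteration
        counts[key] += 1
    return cb

x0, info0 = spla.gmres(A, b, restart=50, callback=counter("no preconditioner"),
                       callback_type="pr_norm")
x1, info1 = spla.gmres(A, b, M=M, restart=50, callback=counter("ILU preconditioner"),
                       callback_type="pr_norm")
print(counts)   # the ILU-preconditioned run needs far fewer GMRES iterations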


2. Accelerated Iterative Methods

Classical iterations converge slowly because each step damps the (relative) error on only part of the degrees of freedom. The basic idea of the acceleration methods is to reduce, at every step, the error over all degrees of freedom, i.e. the error of the n-th approximate solution. These acceleration methods are in fact projection methods that seek the approximate solution in a suitable subspace; they include CG, GMRES and the other Krylov subspace (KSP) methods.
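Spelled out in the standard projection framework of Saad's book (not specific to this post), the idea is:

    find  u^n in u^0 + K_n   such that   b - A u^n  is orthogonal to  L_n,
    with  K_n = span{ r^0, A r^0, ..., A^(n-1) r^0 }   (a Krylov subspace)

    CG:     L_n = K_n       (A symmetric positive definite)
    GMRES:  L_n = A K_n     (equivalent to minimizing ||b - A u^n|| over u^0 + K_n)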


  * CG, KSP

    See the previous two posts.
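    As a quick reminder of the black-box usage (all names and sizes here are illustrative; see those posts for the algorithm itself), CG on an SPD sparse system via SciPy:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 64
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()   # SPD 2D Poisson
b = np.ones(A.shape[0])

iters = []
u, info = spla.cg(A, b, callback=lambda xk: iters.append(None))   # count CG iterations
print("CG info =", info, " iterations =", len(iters),
      " ||b - Au|| =", np.linalg.norm(b - A @ u))
```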

 

    * Multi-Grid

    1  using relaxation-type methods as smoothing operators on fine grids; 

    2  recursive application of coarse-grid corrections

    3  nested iteration

 

  <img src="http://latex.codecogs.com/gif.latex?  A_h u_h = b_h" />

  <img src="http://latex.codecogs.com/gif.latex?    u_h = u_h^n + e_h^n " />   (e_h^n is the error of the n-th iterate)

  <img src="http://latex.codecogs.com/gif.latex?    A_h e_h^n = r_h^n " />   --> defect equation, with residual r_h^n = b_h - A_h u_h^n

  Solving the defect equation exactly requires the same amount of work as the original equation; however, if e_h is smooth enough, it can be solved on a coarser grid and still yield a useful correction to u_h, namely:

  <img src="http://latex.codecogs.com/gif.latex?    A_{2h} e_{2h}^n = r_{2h}^n " />

 then,    <img src="http://latex.codecogs.com/gif.latex?     u_h^{n+1} = u_h^n + e_{2h}^n " />

  there are two grid transfers: first, restriction from the fine to the coarse grid    <img src="http://latex.codecogs.com/gif.latex?    r_{2h} = R_h^{2h} r_h " />

  and second, prolongation (interpolation) from the coarse back to the fine grid <img src="http://latex.codecogs.com/gif.latex?   e_h = P_{2h}^h e_{2h} " />

   Written as a fixed-point iteration, the two-grid scheme converges when

   <img src="http://latex.codecogs.com/gif.latex?   \rho ( I - P_{2h}^h A_{2h}^{-1} R_h^{2h} A_h ) < 1 " />

   If the system has a very large number of DoF (say more than about 10^6), using more grid levels helps, and so does full multigrid (FMG) with nested iteration.
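   A minimal two-grid sketch for the 1D Poisson problem A_h u_h = b_h, in dense NumPy purely for readability; the smoother (weighted Jacobi), the full-weighting restriction R_h^{2h}, the linear-interpolation prolongation P_{2h}^h, and all sizes and sweep counts are illustrative choices:

```python
import numpy as np

def poisson(m):                         # A = tridiag(-1, 2, -1) / h^2 on m interior points
    h = 1.0 / (m + 1)
    return (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

def smooth(A, u, b, nu=3, w=2.0 / 3.0):  # nu sweeps of weighted Jacobi relaxation
    Dinv = 1.0 / np.diag(A)
    for _ in range(nu):
        u = u + w * Dinv * (b - A @ u)
    return u

def restrict(r):                        # full weighting: fine (2m+1 points) -> coarse (m)
    return 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])

def prolong(e, n_fine):                 # linear interpolation: coarse (m) -> fine (2m+1)
    ef = np.zeros(n_fine)
    ef[1:-1:2] = e                      # coincident points
    ef[0:-2:2] += 0.5 * e
    ef[2::2] += 0.5 * e
    return ef

def two_grid(A, b, u, n):
    u = smooth(A, u, b)                                # pre-smoothing on the fine grid
    r2h = restrict(b - A @ u)                          # restrict the residual
    e2h = np.linalg.solve(poisson((n - 1) // 2), r2h)  # solve the coarse defect equation
    u = u + prolong(e2h, n)                            # prolong the correction, update u
    return smooth(A, u, b)                             # post-smoothing

n = 63
A, b, u = poisson(n), np.ones(n), np.zeros(n)
for cycle in range(8):
    u = two_grid(A, b, u, n)
    print("two-grid cycle", cycle + 1, " residual =", np.linalg.norm(b - A @ u))
```

   The residual drops by a roughly constant factor per cycle, independent of n, which is the point of the method; a full multigrid code simply applies this correction recursively on a hierarchy of grids.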


     * Domain Decomposition

      1  make an initial guess for the boundary values on B13, and solve the PDE in subdomain S23 = S2 U S3;

      2  use the computed result to provide boundary values on B23, then solve the PDE in subdomain S13 = S1 U S3, which in turn yields improved boundary values for B13;

      3  test for convergence, or return to step 1 (a minimal sketch follows below).

      (Here S3 is the overlap region between the two overlapping subdomains S13 and S23, and B13, B23 are their interior interface boundaries; this is the classical alternating Schwarz iteration.)
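      A minimal 1D alternating Schwarz sketch of steps 1-3; the geometry is made up for the example (grid points 1..n on (0,1), B13 placed at index i1 and B23 at index i2, so S3 is the strip between them):

```python
import numpy as np

n, h = 99, 1.0 / 100.0
f = np.ones(n + 2)                     # right-hand side (entries 0 and n+1 unused)
u = np.zeros(n + 2)                    # global iterate, with u[0] = u[n+1] = 0

def poisson(m):                        # tridiag(-1, 2, -1) / h^2 on m unknowns
    return (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2

i1, i2 = 40, 60                        # B13 at index i1, B23 at index i2 (overlap S3 between them)

for sweep in range(20):
    # step 1: current guess on B13 supplies the boundary value; solve in S23 (points i1+1..n)
    rhs = f[i1 + 1 : n + 1].copy()
    rhs[0] += u[i1] / h**2
    u[i1 + 1 : n + 1] = np.linalg.solve(poisson(n - i1), rhs)

    # step 2: that solve provides u on B23; solve in S13 (points 1..i2-1),
    #         which in turn yields an improved value on B13
    rhs = f[1:i2].copy()
    rhs[-1] += u[i2] / h**2
    u[1:i2] = np.linalg.solve(poisson(i2 - 1), rhs)

    # step 3: monitor the global residual (convergence test)
    res = np.linalg.norm(f[1:n + 1] - (2 * u[1:n + 1] - u[0:n] - u[2:n + 2]) / h**2)
    print("Schwarz sweep", sweep + 1, " residual =", res)
```

      In a parallel code each subdomain solve would live on its own MPI rank, with only the interface values exchanged per sweep.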

     

总结: MG and DD both address a serious problem shared by the classical methods and the KSP methods: the convergence rate deteriorates as the number of degrees of freedom grows. Of course, MG and DD also exploit more information about the particular case, which makes them somewhat more complicated than the previous methods. Another advantage of DD is that it coincides naturally with the MPI domain-decomposition approach to parallelism, which makes it very promising.

   
