Kettle转换(Trans)执行流程分析

来源:互联网 发布:mac视频播放器换音频 编辑:程序博客网 时间:2024/04/29 14:52

1. Kettle转换执行流程

Kettle转换执行流程体现在Trans类的execute()方法,代码如下所示:

public void execute( String[] arguments ) throws KettleException {    prepareExecution( arguments );    startThreads();}

1.1 prepareExecution流程分析

prepareExecution方法完成的主要工作是:

  • 设置转换参数、变量
  • 处理步骤之间数据传递是分发还是复制
  • 初始化设置转换日志表、步骤日志表、性能记录表、添加一系列TransListener
  • 构造StepMetaDataCombi对象集合,该对象负责传递Step,StepMeta以及StepData对象给RunThread
  • 调用每个Step的init方法进行步骤初始化,确保所有Step的初始化都顺利进行,出错就提前结束所有步骤
  • 步骤监控的初始化

1.2 startThreads流程分析

startThreads方法完成的主要工作是:

  • TransListener和StepListener构造
  • 当listener准备好之后就根据转换类型启动线程跑转换里的步骤,通过构造RunThread对象来执行步骤
  • 当所有步骤完成会触发StepListener的stepFinished方法里面的fireTransFinishedListeners(),通知转换,步骤已经全部执行结束,转换可以结束了,设置相关参数并清理现场

核心StepListener构造片段代码如下所示:

StepListener stepListener = new StepListener() {   .....    public void stepFinished( Trans trans, StepMeta stepMeta, StepInterface step ) {      synchronized ( Trans.this ) {        nrOfFinishedSteps++;        if ( nrOfFinishedSteps >= steps.size() ) {          // Set the finished flag          //          setFinished( true );          // Grab the performance statistics one last time (if enabled)          //          addStepPerformanceSnapShot();          try {            fireTransFinishedListeners();          } catch ( Exception e ) {            step.setErrors( step.getErrors() + 1L );            log.logError( getName()              + " : " + BaseMessages.getString( PKG, "Trans.Log.UnexpectedErrorAtTransformationEnd" ), e );          }        }        // If a step fails with an error, we want to kill/stop the others        // too...        //        if ( step.getErrors() > 0 ) {          log.logMinimal( BaseMessages.getString( PKG, "Trans.Log.TransformationDetectedErrors" ) );          log.logMinimal( BaseMessages.getString(            PKG, "Trans.Log.TransformationIsKillingTheOtherSteps" ) );          killAllNoWait();        }      }    }  };  // Make sure this is called first!  //  if ( sid.step instanceof BaseStep ) {    ( (BaseStep) sid.step ).getStepListeners().add( 0, stepListener );  } else {    sid.step.addStepListener( stepListener );  }

fireTransFinishedListeners方法代码如下所示:

/**   * Make attempt to fire all registered listeners if possible.   *   * @throws KettleException   *           if any errors occur during notification   */protected void fireTransFinishedListeners() throws KettleException {    // PDI-5229 sync added    synchronized ( transListeners ) {      if ( transListeners.size() == 0 ) {        return;      }      //prevent Exception from one listener to block others execution      List<KettleException> badGuys = new ArrayList<KettleException>( transListeners.size() );      for ( TransListener transListener : transListeners ) {        try {          transListener.transFinished( this );        } catch ( KettleException e ) {          badGuys.add( e );        }      }      // Signal for the the waitUntilFinished blocker...      transFinishedBlockingQueue.add( new Object() );      if ( !badGuys.isEmpty() ) {        //FIFO        throw new KettleException( badGuys.get( 0 ) );      }    }}

fireTransFinishedListeners方法负责通知所有的TransListener转换已经结束,通过调用

transListener.transFinished(this)

来完成通知。这个方法还负责解除waitUntilFinished方法调用的阻塞状态。waitUntilFinished方法在execute方法执行后调用可以等待转换结束或者出错才返回。这是因为该方法利用了一个阻塞队列(BlockingQueue)transFinishedBlockingQueue的poll方法来进行阻塞,而只有当上面讲到的fireTransFinishedListeners方法触发了,才会执行

// Signal for the the waitUntilFinished blocker...transFinishedBlockingQueue.add( new Object() );

来解除阻塞队列的阻塞状态。

1.3 RunThread.run()流程分析

一般情况下转换里的每个步骤隔离到单独的线程执行,步骤执行的逻辑代码表现在类RunThread(实现了Runnable接口)的run方法里面,核心逻辑很简单,调用step的processRow方法直到没有输入数据要处理表示该步骤已经结束,结束的时候会调用step.dispose清理现场资源,最后调用step.markStop()标记步骤已经停止,在markStop方法里面会回调所有StepListener的stepFinished方法(实现在BaseStep类里面),因此在Trans类的startThreads方法里面构造的核心StepListener方法将被调用。

RunThread的run方法代码如下所示:

public void run() {    try {      step.setRunning( true );      step.getLogChannel().snap( Metrics.METRIC_STEP_EXECUTION_START );      if ( log.isDetailed() ) {        log.logDetailed( BaseMessages.getString( "System.Log.StartingToRun" ) );      }      // Wait      while ( step.processRow( meta, data ) ) {        if ( step.isStopped() ) {          break;        }      }    } catch ( Throwable t ) {      try {        // check for OOME        if ( t instanceof OutOfMemoryError ) {          // Handle this different with as less overhead as possible to get an error message in the log.          // Otherwise it crashes likely with another OOME in Me$$ages.getString() and does not log          // nor call the setErrors() and stopAll() below.          log.logError( "UnexpectedError: ", t );        } else {          t.printStackTrace();          log.logError( BaseMessages.getString( "System.Log.UnexpectedError" ), t );        }        String logChannelId = log.getLogChannelId();        LoggingObjectInterface loggingObject = LoggingRegistry.getInstance().getLoggingObject( logChannelId );        String parentLogChannelId = loggingObject.getParent().getLogChannelId();        List<String> logChannelChildren = LoggingRegistry.getInstance().getLogChannelChildren( parentLogChannelId );        int childIndex = Const.indexOfString( log.getLogChannelId(), logChannelChildren );        System.out.println( "child index = "          + childIndex + ", logging object : " + loggingObject.toString() + " parent=" + parentLogChannelId );        KettleLogStore.getAppender().getBuffer( "2bcc6b3f-c660-4a8b-8b17-89e8cbd5b29b", false );        // baseStep.logError(Const.getStackTracker(t));      } catch ( OutOfMemoryError e ) {        e.printStackTrace();      } finally {        step.setErrors( 1 );        step.stopAll();      }    } finally {      step.dispose( meta, data );      step.getLogChannel().snap( Metrics.METRIC_STEP_EXECUTION_STOP );      try {        long li = step.getLinesInput();        long lo = step.getLinesOutput();        long lr = step.getLinesRead();        long lw = step.getLinesWritten();        long lu = step.getLinesUpdated();        long lj = step.getLinesRejected();        long e = step.getErrors();        if ( li > 0 || lo > 0 || lr > 0 || lw > 0 || lu > 0 || lj > 0 || e > 0 ) {          log.logBasic( BaseMessages.getString( PKG, "BaseStep.Log.SummaryInfo", String.valueOf( li ),            String.valueOf( lo ), String.valueOf( lr ), String.valueOf( lw ),            String.valueOf( lu ), String.valueOf( e + lj ) ) );        } else {          log.logDetailed( BaseMessages.getString( PKG, "BaseStep.Log.SummaryInfo", String.valueOf( li ),            String.valueOf( lo ), String.valueOf( lr ), String.valueOf( lw ),            String.valueOf( lu ), String.valueOf( e + lj ) ) );        }      } catch ( Throwable t ) {        //        // it's likely an OOME, so we don't want to introduce overhead by using BaseMessages.getString(), see above        //        log.logError( "UnexpectedError: " + Const.getStackTracker( t ) );      } finally {        step.markStop();      }    }}
0 0
原创粉丝点击