Resolving a STAF RC 21 Error, and Some Reflections

Source: Internet  Editor: 程序博客网  Time: 2024/06/18 03:43


Our current project uses STAF to pass commands to a guest VM. Once the environment was set up, tests on the host side frequently failed with return code 21.


The STAF API Return Codes reference explains RC 21 as follows:

21: STAF Not Running. This indicates that STAFProc is not running on the local machine with the same STAF_INSTANCE_NAME (and/or the same STAF_TEMP_DIR if on a Unix machine).

Notes:

  1. If the STAF_INSTANCE_NAME environment variable is not set, it defaults to "STAF".
  2. On Unix, if the STAF_TEMP_DIR environment variable is not set, it defaults to "/tmp". This environment variable is not used on Windows.
  3. This error can also occur when submitting a request using the local IPC interface on a Unix machine if the socket file that the local interface uses has been inadvertently deleted.
  4. On Windows, with User Account Controls (UAC) enabled, if STAFProc.exe is being run as an Administrator, this error will occur if a STAF service request is not also run as an Administrator (e.g. from an "Administrator: Command Prompt") or if programs that submit STAF service requests using STAF APIs for Java, C/C++, Perl, Python, or Tcl are not run as an Administrator. See section "5.1.2 Running STAFProc on Windows with User Account Controls (UAC) Enabled" in the STAF User's Guide for more information.
  5. More information on this error may be displayed if you set special environment variable STAF_DEBUG_21=1 and resubmit your STAF service request.
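
As a rough illustration of notes 1, 2 and 5, the Python sketch below works out which instance name and temp directory a client would fall back to, and re-runs the ping with STAF_DEBUG_21=1. The helper names are made up for this example, and it assumes the `staf` command is on PATH; it is a sketch, not part of STAF itself.

```python
import os
import shutil
import subprocess


def effective_staf_settings(env=None):
    """Return the values a STAF client falls back to (notes 1 and 2)."""
    env = os.environ if env is None else env
    return {
        # Note 1: defaults to "STAF" when unset.
        "STAF_INSTANCE_NAME": env.get("STAF_INSTANCE_NAME", "STAF"),
        # Note 2: defaults to "/tmp" on Unix; not used on Windows.
        "STAF_TEMP_DIR": env.get("STAF_TEMP_DIR", "/tmp"),
    }


def ping_with_debug(env=None):
    """Re-run `staf local ping ping` with STAF_DEBUG_21=1 (note 5).

    Returns None when the staf command cannot be found on PATH.
    """
    run_env = dict(os.environ if env is None else env)
    run_env["STAF_DEBUG_21"] = "1"
    if shutil.which("staf", path=run_env.get("PATH")) is None:
        return None
    return subprocess.run(
        ["staf", "local", "ping", "ping"],
        env=run_env,
        capture_output=True,
        text=True,
    )
```

Comparing the output of `effective_staf_settings()` in the shell that started STAFProc and in the shell that submits requests is a quick way to spot a mismatch between the two.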

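The notes above outline the areas where the problem might lie.

At first we suspected the STAF_INSTANCE_NAME environment variable,

so we edited STAFEnv.sh under /usr/local/staf to force STAF_INSTANCE_NAME=staf. Testing showed that this made no difference whatsoever.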

Next, following material found through Google, we re-ran startSTAFProc.sh and STAFEnv.sh to initialize STAF's environment variables.

We then tested with staf local ping ping, which still reported: Error registering with STAF, RC: 21

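At this point we seemed to have hit a dead end, but online sources suggested that after initializing STAF and importing its environment variables, you should run tail nohup.out to check the result of the initialization.

So we re-initialized the environment variables and ran tail nohup.out, and this time the problem showed itself: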

Error starting ssl interface. Error code: 10. Reason : STAFConnectionProviderStart: Error binding server socket, bind() RC=98
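
The number after bind() RC= is an operating-system errno value. A small Python sketch, with helper names invented for this example, can pull it out of a log line and translate it. On Linux, errno 98 is EADDRINUSE ("Address already in use"), which means something is still holding the port; the numeric value differs on other platforms.

```python
import errno
import os
import re

# Matches the "bind() RC=<number>" fragment seen in nohup.out.
BIND_RC = re.compile(r"bind\(\) RC=(\d+)")


def extract_bind_rc(log_line):
    """Return the bind() return code from a log line, or None if absent."""
    match = BIND_RC.search(log_line)
    return int(match.group(1)) if match else None


def describe_errno(rc):
    """Translate an errno number into its symbolic name and message."""
    name = errno.errorcode.get(rc, "UNKNOWN")
    return f"{name}: {os.strerror(rc)}"
```

Calling `describe_errno(extract_bind_rc(line))` on the machine that produced the log gives the platform's own meaning for the code.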

Regarding this bind() RC=98, the document Frequently Asked Questions About STAF V3, STAX, and STAF Services says:

This error can occur on Unix if STAFProc has not been shutdown correctly. The error will be displayed when you attempt to restart STAFProc. You should always shutdown STAF by submitting a SHUTDOWN request to the STAF SHUTDOWN service. For example:

STAF local SHUTDOWN SHUTDOWN

If you don't submit a SHUTDOWN request and instead kill the STAFProc process, then STAFProc is not given the opportunity to kill other processes that it started and to perform clean-up activities such as deleting its temporary files.

It then gives the following solution:

1,

Go to the /tmp directory, and delete the temporary files that STAF created for the instance of STAFProc that you're trying to start. You'll need to be logged on as a user that has permission to remove these files. The temporary STAF files are named:

  • XXXX.tmp
  • DataDir_*XXXX.tmp
  • STAFIPC_XXXX
  • All files beginning with STAFIPC_XXXXJSTAF. Note that there could be one of these files for each STAF JVM that was created when registering STAF Java services.
2,
Also, type ps or ps -ea and determine if there are any processes that STAF started which are still running or any java executables that STAF started for its Java services. If there are any, you'll need to stop these processes in order to restart STAFProc. You can do this by typing kill xxx where xxx is the PID for the process.


3,

Restart STAFProc
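
Step 1 above can be sketched as a dry run in Python that only lists the candidate files and deletes nothing. The function name is made up, and the sketch assumes that XXXX in the file names stands for the STAF instance name (default "STAF"); check the listing by hand before removing anything.

```python
from pathlib import Path


def stale_staf_files(tmp_dir="/tmp", instance="STAF"):
    """List leftover STAF temp files for one instance (dry run only).

    Assumption: XXXX in the FAQ's file names is the instance name.
    """
    patterns = [
        f"{instance}.tmp",
        f"DataDir_*{instance}.tmp",
        f"STAFIPC_{instance}",
        f"STAFIPC_{instance}JSTAF*",  # one per STAF JVM
    ]
    found = set()
    for pattern in patterns:
        found.update(Path(tmp_dir).glob(pattern))
    return sorted(found)


if __name__ == "__main__":
    for path in stale_staf_files():
        print(path)
```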


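In short, after finishing a round of STAF operations you should run STAF local SHUTDOWN SHUTDOWN so that the background process is shut down promptly.

The FAQ then offers three remedies: one, clean up STAF's temporary files under /tmp; two, find the STAF processes directly and kill them; three, restart the service.

I tried remedies one and two, and two worked best: find the PID and kill it.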
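
For reference, the "find the PID and kill it" step can be sketched in Python as below. The function names are invented for this example, and the parser assumes the usual four-column `ps -ea` layout (PID, TTY, TIME, CMD). Java processes are not matched by default, because `ps -ea` alone cannot tell which ones STAF started.

```python
import os
import signal


def find_pids(ps_output, names=("STAFProc",)):
    """Return PIDs from `ps -ea` output whose command name is in names."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Header and malformed lines fail the isdigit() check.
        if len(fields) >= 4 and fields[0].isdigit() and fields[3] in names:
            pids.append(int(fields[0]))
    return pids


def terminate(pids):
    """Send SIGTERM, the same signal a plain `kill xxx` sends."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
```

A typical use would be to feed it the output of `subprocess.run(["ps", "-ea"], capture_output=True, text=True).stdout`, review the PIDs, and only then call `terminate`.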


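That is a brief record of how we solved this STAF ping problem, which had troubled us for a long time. The write-up is a bit messy, but as long as you can roughly follow it, that's fine, haha.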


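Lessons learned this time:

1, Sometimes even the advice of experienced experts can only serve as a reference; you have to understand the architecture thoroughly and form your own view.

2, Solving a problem sometimes comes down to a flash of insight, but the many attempts and the extensive searching that come before that flash are the real foundation of success.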


btw, if the error code RC is 16, you should suspect a firewall problem, especially the Windows firewall.
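
One way to test the firewall theory is a plain TCP connection attempt from the host to the guest, sketched below in Python. The function name is made up, and the default port of 6500 is an assumption about STAF's TCP interface; use whatever port your STAF.cfg actually configures.

```python
import socket


def can_connect(host, port=6500, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: 6500 is the port of STAF's TCP interface on the target.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: possibly a firewall.
        return False
```

If this returns False while STAFProc is known to be running on the target, a firewall between the two machines is a likely suspect.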


