cgroup resource isolation: listening for memory OOM events (oom notifier)
Source: Internet | Editor: 程序博客网 | Time: 2024/05/15 23:50
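CGROUP OOM control
cgroups are a popular and widely used resource isolation mechanism; Docker and Hadoop, among others, use cgroups for resource isolation. When memory is isolated with the cgroup v1 memory controller and a process in the cgroup runs out of memory (OOM), the kernel can either kill the process or leave it alive. The default is to kill it as soon as the OOM occurs. With the OOM killer disabled, tasks that hit the limit are paused until memory becomes available. The OOM killer can be disabled by running the following in the cgroup's directory: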
echo 1 > memory.oom_control
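To confirm that the setting took effect, the same file can be read back. The sketch below assumes the cgroup v1 layout of memory.oom_control, one "name value" pair per line (for example oom_kill_disable and under_oom). The helper name oom_control_field is hypothetical and not part of any library; it only parses text that has already been read from the file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a kernel or library API): looks up "<key> <value>"
 * in text read from memory.oom_control. Assumes one "name value" pair per line.
 * Returns the value, or -1 if the key is not present. */
static long oom_control_field(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        long value;

        /* Match the key only at the start of a line, followed by a space,
         * so that a shorter key cannot match a longer field name. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ' ' &&
            sscanf(line + klen, "%ld", &value) == 1)
            return value;

        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

For the text "oom_kill_disable 1\nunder_oom 0\n", looking up oom_kill_disable yields 1, meaning the OOM killer is disabled for that cgroup.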
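Capturing OOM events
Once the OOM killer has killed a process, however, it is hard to find any log of the OOM. For this case cgroups provide a way to listen for OOM events, along with a reference implementation in C: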
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/eventfd.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Print the failing step together with errno, then exit. */
static inline void die(const char *msg)
{
    fprintf(stderr, "error: %s: %s(%d)\n", msg, strerror(errno), errno);
    exit(EXIT_FAILURE);
}

static inline void usage(void)
{
    fprintf(stderr, "usage: oom_eventfd_test <cgroup.event_control> <memory.oom_control>\n");
    exit(EXIT_FAILURE);
}

#define BUFSIZE 256

int main(int argc, char *argv[])
{
    char buf[BUFSIZE];
    int efd, cfd, ofd, wb;
    uint64_t u;

    if (argc != 3)
        usage();

    /* eventfd through which the kernel delivers OOM notifications */
    if ((efd = eventfd(0, 0)) == -1)
        die("eventfd");

    if ((cfd = open(argv[1], O_WRONLY)) == -1)
        die("cgroup.event_control");

    if ((ofd = open(argv[2], O_RDONLY)) == -1)
        die("memory.oom_control");

    /* Register by writing "<eventfd> <fd of memory.oom_control>" */
    if ((wb = snprintf(buf, BUFSIZE, "%d %d", efd, ofd)) >= BUFSIZE)
        die("buffer too small");

    if (write(cfd, buf, wb) == -1)
        die("write cgroup.event_control");

    if (close(cfd) == -1)
        die("close cgroup.event_control");

    /* Each read blocks until an OOM event is reported */
    for (;;) {
        if (read(efd, &u, sizeof(uint64_t)) != sizeof(uint64_t))
            die("read eventfd");
        printf("mem_cgroup oom event received\n");
    }

    return 0;
}
For details, see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-memory.html#ex-OOM-control-notifications
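The listener's read loop relies on how an eventfd behaves: each notification adds to a 64-bit counter, and a read returns the accumulated count in 8 bytes and resets it to zero. The sketch below shows only that mechanism, with the program itself standing in for the kernel as the writer. The helper name eventfd_roundtrip is hypothetical, and EFD_NONBLOCK is used here (unlike in the listener) so that a read on an empty counter fails instead of blocking.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: posts n notifications to a fresh eventfd, then reads
 * once. Returns the counter value seen by that read, or 0 on any error
 * (including an empty counter, because the fd is non-blocking). */
static uint64_t eventfd_roundtrip(unsigned n)
{
    uint64_t one = 1, seen = 0;
    int efd = eventfd(0, EFD_NONBLOCK);

    if (efd == -1)
        return 0;

    /* Each write adds its 8-byte value to the eventfd counter. */
    for (unsigned i = 0; i < n; i++) {
        if (write(efd, &one, sizeof one) != (ssize_t)sizeof one) {
            close(efd);
            return 0;
        }
    }

    /* A single read returns the accumulated count and resets it to zero. */
    if (read(efd, &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = 0;

    close(efd);
    return seen;
}
```

One consequence for the listener: if several events are signalled between two reads, they arrive as a single read whose value is greater than 1, so the number of printed messages can be lower than the number of events.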
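Capturing OOM kill events in Java
To capture OOM kill events from Java, have Java call the C code. Several JNI helper libraries make it convenient to call C. My project uses Maven, with this library: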
<dependency>
    <groupId>org.fusesource.hawtjni</groupId>
    <artifactId>hawtjni-runtime</artifactId>
    <version>1.9</version>
</dependency>
Then write a native method of your own that corresponds to the C function:
package ji;

import org.fusesource.hawtjni.runtime.JniArg;
import org.fusesource.hawtjni.runtime.JniClass;
import org.fusesource.hawtjni.runtime.JniMethod;
import org.fusesource.hawtjni.runtime.Library;

/**
 * Created by ji on 17-5-18.
 */
@JniClass
public class OomNotifierNative {

    private static final Library LIBRARY = new Library("native-oom-notifier", OomNotifierNative.class);

    static {
        LIBRARY.load();
    }

    // ptr:  path to cgroup.event_control
    // ptr2: path to memory.oom_control
    @JniMethod(cast = "char *")
    public static final native long oom_event_listener(
            @JniArg(cast = "char *") String ptr,
            @JniArg(cast = "char *") String ptr2);
}
The corresponding C function is as follows:
#include "notifier.h"
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/eventfd.h>
#include <errno.h>
#include <string.h>

/* Note: exit() ends the whole process, which here includes the calling JVM. */
static inline void die(const char *msg)
{
    fprintf(stderr, "error: %s: %s(%d)\n", msg, strerror(errno), errno);
    exit(EXIT_FAILURE);
}

static inline void usage(void)
{
    fprintf(stderr, "usage: oom_eventfd_test <cgroup.event_control> <memory.oom_control>\n");
    exit(EXIT_FAILURE);
}

#define BUFSIZE 256

/* Blocks until one event arrives.
 * Returns 1 for an OOM event, 2 if the cgroup no longer exists. */
int oom_event_listener(char *event_ctrl, char *oom_ctrl)
{
    char buf[BUFSIZE];
    int efd, cfd, ofd, wb;
    uint64_t u;

    if ((efd = eventfd(0, 0)) == -1)
        die("eventfd");
    if ((cfd = open(event_ctrl, O_WRONLY)) == -1)
        die("cgroup.event_control");
    if ((ofd = open(oom_ctrl, O_RDONLY)) == -1)
        die("memory.oom_control");
    if ((wb = snprintf(buf, BUFSIZE, "%d %d", efd, ofd)) >= BUFSIZE)
        die("buffer too small");
    if (write(cfd, buf, wb) == -1)
        die("write cgroup.event_control");
    if (close(cfd) == -1)
        die("close cgroup.event_control");

    if (read(efd, &u, sizeof(uint64_t)) != sizeof(uint64_t))
        die("read eventfd");

    /* Every call opens new descriptors, so release them before returning. */
    close(ofd);
    close(efd);

    /* The eventfd is also signalled when the cgroup is removed. */
    if (access(event_ctrl, F_OK) == -1) {
        printf("group not exists\n");
        return 2;
    }
    printf("mem_cgroup oom event received\n");
    return 1;
}
When packaging with Maven, I used the following plugin:
<plugin>
    <groupId>org.fusesource.hawtjni</groupId>
    <artifactId>maven-hawtjni-plugin</artifactId>
    <version>1.9</version>
</plugin>
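One thing to keep in mind with the JNI listener: its die() helper calls exit(), which ends the whole process, JVM included. As a possible alternative, the sketch below performs the same registration steps but reports failures to the caller as a negative errno value. The function name oom_register and the fds output array are hypothetical, not part of the original code or of HawtJNI.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical variant of the registration steps that never calls exit().
 * On success returns 0 and stores the eventfd in fds[0] and the
 * memory.oom_control descriptor in fds[1]; the caller closes both when done.
 * On failure returns -errno and leaves no descriptor open. */
static int oom_register(const char *event_ctrl, const char *oom_ctrl, int fds[2])
{
    char buf[64];
    int efd = -1, cfd = -1, ofd = -1, len, err;

    if ((efd = eventfd(0, 0)) == -1)
        goto fail;
    if ((cfd = open(event_ctrl, O_WRONLY)) == -1)
        goto fail;
    if ((ofd = open(oom_ctrl, O_RDONLY)) == -1)
        goto fail;

    /* Same registration string as the original: "<eventfd> <oom_control fd>" */
    len = snprintf(buf, sizeof buf, "%d %d", efd, ofd);
    if (write(cfd, buf, (size_t)len) == -1)
        goto fail;

    close(cfd);
    fds[0] = efd;
    fds[1] = ofd;
    return 0;

fail:
    err = errno;            /* save before close() can overwrite it */
    if (ofd != -1) close(ofd);
    if (cfd != -1) close(cfd);
    if (efd != -1) close(efd);
    return -err;
}
```

With this shape, a missing cgroup path comes back as -ENOENT, which the Java side could turn into an exception instead of losing the JVM.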