Debugging segfault errors in Linux development
Source: Internet  Editor: 程序博客网  Date: 2024/05/21 10:57
Debugging segfaults from logs to gdb
So this week, after a version upgrade of GraphicsMagick, we got some segfaults on our servers. Nothing terrible: about twelve segfaults in a 24-hour period. The only information was a line in /var/log/kernel.log.
There were no core dumps, since ulimit -c is set to zero. What can be done to at least get an idea of what is happening?
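As a side note, the core-file limit that ulimit -c reports can also be read from inside a process. The following is a minimal sketch using Python's standard resource module; the helper name core_dumps_enabled is made up for illustration, and the result only describes the process that runs it (and whatever it inherits from its parent shell or service manager).

```python
import resource


def core_dumps_enabled() -> bool:
    """Return True if the soft core-file size limit is non-zero.

    Hypothetical helper for illustration: a soft limit of 0 is what
    `ulimit -c` shows when core dumps are disabled.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unlimited soft limit (resource.RLIM_INFINITY) also counts as enabled.
    return soft != 0


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print(f"soft={soft} hard={hard} enabled={core_dumps_enabled()}")
```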
Luckily, I build the packages for our internal use, so I had the build directory available with the unstripped binaries. With that, it is straightforward to use the GNU Debugger (gdb) to find out what is going on.
First, notice that the segfault is happening in a shared library, which is a complication in itself. When a segfault happens in the code of the main (non-shared) binary, the ip (instruction pointer) value points directly to the instruction in the binary. In this case it points into a shared library that is dynamically linked into the gm binary, so the address depends on where the library was loaded.
To find the instruction, subtract the library's load address given in the segfault message (the 7fd1379b9000 part after the library's name) from the ip.
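The subtraction can be sketched in a few lines of Python. This assumes the usual x86 kernel message shape, "... ip IP sp SP error N in LIB[BASE+SIZE]". In the sample line only the load address 7fd1379b9000 comes from the text above. The pid, fault address, ip, sp, mapping size and library file name are hypothetical values chosen for illustration.

```python
import re

# Assumed message shape: "... ip <ip> sp <sp> error <n> in <lib>[<base>+<size>]"
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<lib>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def library_offset(log_line: str):
    """Return (library name, offset of the faulting instruction in it)."""
    m = SEGFAULT_RE.search(log_line)
    if m is None:
        raise ValueError("not a segfault line in the expected format")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base  # ip minus the library's load address
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m["lib"], offset


# Hypothetical log line: only the base address 7fd1379b9000 is from the article.
SAMPLE = (
    "gm[12345]: segfault at 7fd13b000010 ip 00007fd137bd0a3b "
    "sp 00007fff5a2b1c40 error 4 in libGraphicsMagick.so.3[7fd1379b9000+29b000]"
)

if __name__ == "__main__":
    lib, offset = library_offset(SAMPLE)
    print(lib, hex(offset))  # libGraphicsMagick.so.3 0x217a3b
```

The range check against the mapping size is a cheap sanity test: if the ip does not fall inside the reported mapping, the line was probably parsed incorrectly.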
Finally, using gdb you can check what is happening at that address in the library, provided you have an unstripped object (you can get one from the -dbg packages on Debian/Ubuntu).
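One way to run that lookup non-interactively is to let gdb execute a couple of commands in batch mode. The sketch below only builds the command line; the library path and the offset are hypothetical placeholders, and the commands used (info symbol and info line) are ordinary gdb commands rather than the exact session from the original investigation.

```python
import shlex


def gdb_lookup_command(library_path: str, offset: int):
    """Build a gdb batch invocation that resolves an offset in a library.

    When gdb opens a shared object without running a process, addresses
    are relative to the file, so the computed offset can be used directly.
    """
    addr = hex(offset)
    return [
        "gdb", "-batch",
        "-ex", f"info symbol {addr}",   # which function contains the address
        "-ex", f"info line *{addr}",    # source file and line, if debug info exists
        library_path,
    ]


# Hypothetical path to the unstripped library and a hypothetical offset.
cmd = gdb_lookup_command("./build/libGraphicsMagick.so", 0x217A3B)

if __name__ == "__main__":
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where gdb and the unstripped library are available.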
There's the culprit. You can also find some information in the stripped library using nm. Remember that nm will not show any symbols for a stripped shared library unless it is used with the -D option, which lists the dynamic symbols.
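To relate an offset to the nm -D listing, one can look for the closest exported symbol at or below it. The sketch below parses nm-style text and does that search. The sample listing and its addresses are invented for illustration and are not the real symbol table of the library.

```python
import bisect


def parse_nm(nm_output: str):
    """Parse `nm -D` style text into a sorted list of (address, name)."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined ones ("U name") carry no address and are skipped.
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    return sorted(symbols)


def nearest_symbol(symbols, offset: int):
    """Return (name, distance) of the closest symbol at or below offset."""
    addresses = [addr for addr, _name in symbols]
    i = bisect.bisect_right(addresses, offset) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, offset - addr


# Invented listing: the addresses are hypothetical.
SAMPLE_NM = """\
0000000000216f20 T RegisterPNGImage
0000000000217f80 T UnregisterPNGImage
00000000002181c0 T RegisterPNMImage
                 U png_create_write_struct
"""

if __name__ == "__main__":
    print(nearest_symbol(parse_nm(SAMPLE_NM), 0x217A3B))
```

Keep in mind that the nearest exported symbol is not necessarily the function that contains the address: static functions do not appear in the dynamic symbol table, so the offset may fall into one of them.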
In the nm output there are some PNG-related symbols around the address 0x21xxxx. If you check the code for GraphicsMagick PNG support, you will see that WritePNGImage is part of the RegisterPNGImage code.
In this case I correlated the logs and found that the request that caused the segfault completed without problems and the PNG image was generated correctly. My conclusion is that the segfault is happening in some non-crucial part of those functions, but there is not much more I can do to pinpoint the problem exactly.
gdb, nm and ldd are powerful tools when debugging or trying to do a postmortem on a segfault. It would be easier to find out exactly what is going on with a core dump and maybe more information.
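Note: the error code in a segfault message:
The error number consists of 5 bits, from high to low bit4, bit3, bit2, bit1 and bit0, so its value ranges from 0 to 31.
bit4: 0 has no meaning; 1 means the fault occurred during an instruction fetch
bit3: 0 has no meaning; 1 means a reserved bit in a paging data structure was set
bit2: 1 means the faulting access came from a user-mode program; 0 means it came from kernel mode
bit1: 1 means the fault was caused by a write; 0 means it was caused by a read
bit0: 1 means there was not enough permission to access the address (protection violation); 0 means no page at all maps the address, i.e. the address is invalid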
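The bit layout above can be turned into a small decoder. This is a minimal sketch; the function name and the wording of the returned fields are my own. As far as I know, the x86 kernel prints the error value in hexadecimal, so a logged "error 14" should be parsed with base 16 before decoding.

```python
def decode_segfault_error(code: int) -> dict:
    """Decode the five low bits of the segfault error code."""
    return {
        # bit0: 1 = protection violation, 0 = no page mapped at the address
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit1: 1 = write access, 0 = read access
        "access": "write" if code & 0x2 else "read",
        # bit2: 1 = user mode, 0 = kernel mode
        "mode": "user" if code & 0x4 else "kernel",
        # bit3: 1 = a reserved bit was set in a paging structure
        "reserved_bit_set": bool(code & 0x8),
        # bit4: 1 = the fault happened during an instruction fetch
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    # "error 4": a user-mode read of an address with no page mapped
    print(decode_segfault_error(int("4", 16)))
```

For example, error 6 decodes to a user-mode write to an unmapped address, which is the typical signature of writing through a NULL or dangling pointer.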