Detecting Driver Deadlocks with Driver Verifier

Source: Internet  Posted by: 游戏核心算法  Editor: 程序博客网  Time: 2024/06/05 22:52

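    Recently I have been writing a character driver for practice. Its read and write dispatch routines handle IRPs asynchronously through StartIo. During testing I found that after the application issued a few ReadFile requests, the driver stopped responding. Because the driver processes I/O requests asynchronously, I suspected a deadlock. I tried WinDbg's !locks command to look for it, but the output was empty. Out of options, it occurred to me that the verifier tool might be able to detect potential deadlocks in the driver.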

    From the command line, verifier.exe's deadlock detection can be enabled in two ways (for convenience I chose the second):

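My driver is named SampleChar.sys.

1. Takes effect after a reboot:
verifier /flags 0x20 /driver SampleChar.sys

2. Takes effect immediately:
verifier /volatile /flags 0x20 /adddriver SampleChar.sys

Note: /flags 0x20 selects deadlock detection, which is Bit 5 (0x20) of verifier.exe's options.

    Enabling this option does not by itself reveal potential deadlocks in the driver. A test program still has to exercise the driver's code paths.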
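The /flags value is a bitmask, so 0x20 is simply bit 5 set. The following is a minimal sketch in portable user-mode C, not driver code. The enum and helper names are invented for illustration, and only a few of the option bits are listed.

```c
/* Illustrative names; only the numeric bit values come from verifier.exe. */
enum VerifierFlag {
    VF_SPECIAL_POOL       = 1u << 0,  /* 0x01 */
    VF_FORCE_IRQL_CHECK   = 1u << 1,  /* 0x02 */
    VF_POOL_TRACKING      = 1u << 3,  /* 0x08 */
    VF_IO_VERIFICATION    = 1u << 4,  /* 0x10 */
    VF_DEADLOCK_DETECTION = 1u << 5   /* 0x20, the bit used in this article */
};

/* Returns 1 when the deadlock-detection bit is set in a /flags value. */
int has_deadlock_detection(unsigned int flags)
{
    return (flags & VF_DEADLOCK_DETECTION) != 0;
}

/* Builds a /flags value from a list of bit positions (0-based). */
unsigned int flags_from_bits(const int *bits, int count)
{
    unsigned int flags = 0;
    for (int i = 0; i < count; i++)
        flags |= 1u << bits[i];
    return flags;
}
```

Passing the single bit position 5 to flags_from_bits yields 0x20. Combining bits works the same way: positions 1 and 5 together give 0x22.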

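    Below is the code fragment that verifier flagged. The intent: when the application calls ReadFile, the driver calls IoStartPacket to insert the IRP into the device queue and returns asynchronously. StartIo then completes the remaining work.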
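The queueing described here can be pictured with a toy model. This is a user-mode sketch under simplifying assumptions, not the I/O manager's implementation: ToyDevice and the toy_* functions are hypothetical names, requests are plain integers, and there is no locking or IRQL.

```c
#define TOY_MAX_PENDING 8

/* Toy stand-in for a device object with a StartIo queue. */
typedef struct {
    int busy;                      /* nonzero while StartIo owns a request */
    int current;                   /* id of the request last given to StartIo */
    int pending[TOY_MAX_PENDING];  /* queued request ids, oldest first */
    int pending_count;             /* number of queued ids */
} ToyDevice;

/* Stand-in for the driver's StartIo routine: just records the request. */
static void toy_start_io(ToyDevice *dev, int request)
{
    dev->current = request;
}

/* Models IoStartPacket: returns 1 if the request went straight to StartIo,
 * 0 if it was queued behind a busy device, -1 if the toy queue is full. */
int toy_start_packet(ToyDevice *dev, int request)
{
    if (!dev->busy) {
        dev->busy = 1;
        toy_start_io(dev, request);
        return 1;
    }
    if (dev->pending_count == TOY_MAX_PENDING)
        return -1;
    dev->pending[dev->pending_count++] = request;
    return 0;
}

/* Models IoStartNextPacket: hands the oldest queued request to StartIo,
 * or marks the device idle when nothing is queued. */
void toy_start_next_packet(ToyDevice *dev)
{
    if (dev->pending_count == 0) {
        dev->busy = 0;
        dev->current = 0;
        return;
    }
    int next = dev->pending[0];
    for (int i = 1; i < dev->pending_count; i++)
        dev->pending[i - 1] = dev->pending[i];
    dev->pending_count--;
    toy_start_io(dev, next);
}
```

In this model the first request on an idle device reaches StartIo at once and later ones wait in the queue. They only move forward when toy_start_next_packet is called, so a path that finishes a request without that call leaves busy set and everything behind it queued.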

#pragma code_seg()
void SampleStartIo(PDEVICE_OBJECT devObj, PIRP irp)
{
    KEVENT workEvt, completeEvt;
    KIRQL origIrql;
    NTSTATUS status = STATUS_SUCCESS;
    unsigned long readLen;
    SampleCharDevContext* devCtx = (SampleCharDevContext*)devObj->DeviceExtension;
    IO_STACK_LOCATION* curStack = IoGetCurrentIrpStackLocation(irp);
    LARGE_INTEGER waitTime = RtlConvertLongToLargeInteger(-10*1000*1000*3);

    KeInitializeEvent(&workEvt, SynchronizationEvent, FALSE);
    KeInitializeEvent(&completeEvt, NotificationEvent, FALSE);

    KeWaitForSingleObject(&workEvt, Executive, KernelMode, FALSE, &waitTime);

    if (curStack->Parameters.Read.Length > 4096)
    {
        status = irp->IoStatus.Status = STATUS_BUFFER_OVERFLOW;
        irp->IoStatus.Information = 0;

        IoCompleteRequest(irp, IO_NO_INCREMENT);
        IoStartNextPacket(devObj, FALSE);
        return;
    }

    KeAcquireSpinLock(&devCtx->devSpinLock, &origIrql);
    if (devCtx->buffPos != 0x00UL)
    {
        readLen = devCtx->buffPos >= curStack->Parameters.Read.Length ? curStack->Parameters.Read.Length : devCtx->buffPos;
        RtlCopyMemory(irp->AssociatedIrp.SystemBuffer, devCtx->SampleBuff, readLen);
        devCtx->buffRemained += readLen;
        devCtx->buffPos -= readLen;
    }
    else
    {
        KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);
        irp->IoStatus.Status = STATUS_SUCCESS;
        irp->IoStatus.Information = 0x00UL;
        IoCompleteRequest(irp, IO_NO_INCREMENT);
        return;
    }
    KeReleaseSpinLock(&devCtx->devSpinLock, origIrql);

    IoCopyCurrentIrpStackLocationToNext(irp);
    IoSetCompletionRoutine(irp, IrpAsyncReadCompleteRoutine, &completeEvt, TRUE, TRUE, TRUE);
    status = IoCallDriver(devCtx->lowerDev, irp);
    if (status == STATUS_PENDING)
    {
        KeWaitForSingleObject(&completeEvt, Executive, KernelMode, FALSE, NULL);
    }

    irp->IoStatus.Status = STATUS_SUCCESS;
    irp->IoStatus.Information = readLen;
    IoCompleteRequest(irp, IO_NO_INCREMENT);
    IoStartNextPacket(devObj, FALSE);
}

NTSTATUS SampleCharReadAsync(PDEVICE_OBJECT devObj, PIRP irp)
{
    IoMarkIrpPending(irp);
    IoStartPacket(devObj, irp, NULL, NULL);
    return STATUS_PENDING;
}
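The waitTime value in the listing may look opaque. Kernel wait timeouts are expressed in 100-nanosecond units, and a negative value means a relative interval, so -10*1000*1000*3 is a three-second relative timeout. A small portable sketch of that arithmetic follows; the helper names are made up for illustration and are not WDK functions.

```c
#include <stdint.h>

#define UNITS_PER_SECOND      10000000LL  /* 100 ns units in one second */
#define UNITS_PER_MILLISECOND 10000LL     /* 100 ns units in one millisecond */

/* Relative timeout for a kernel wait: a negative count of 100 ns units. */
int64_t relative_timeout_from_seconds(int64_t seconds)
{
    return -(seconds * UNITS_PER_SECOND);
}

/* Converts a relative (negative) timeout back to milliseconds. */
int64_t relative_timeout_to_ms(int64_t timeout)
{
    return (-timeout) / UNITS_PER_MILLISECOND;
}
```

relative_timeout_from_seconds(3) gives -30000000, the same number as the expression in the listing, and converting it back gives 3000 ms.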

    As soon as the test program runs, bugcheck 0xC4 is triggered:

kd> g
*** Fatal System Error: 0x000000c4
                       (0x00000122,0x00000002,0xA2047BA8,0xA2047BC8)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

WinDbg's !analyze -v command reports the cause (only the important parts are shown here):

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)  ----> C4 is the bugcheck raised by Driver Verifier
Arguments:
Arg1: 00000122, Waiting at DISPATCH_LEVEL, with a timeout different than zero.  ----> Arg1 (0x122) is the error code to look up in WinDbg Help
Arg2: 00000002, IRQL value.
Arg3: a2047ba8, Object to wait on.
Arg4: a2047bc8, Address of the time out value.

FAULTING_SOURCE_CODE:
   244: KeInitializeEvent(&completeEvt, NotificationEvent, FALSE);
   245:
   246: KeWaitForSingleObject(&workEvt, Executive, KernelMode, FALSE, &waitTime);
   247:
>  248: if (curStack->Parameters.Read.Length > 4096)  ----> location of the code that caused the bugcheck
   249: {
   250: status = irp->IoStatus.Status = STATUS_BUFFER_OVERFLOW;
   251: irp->IoStatus.Information = 0;
   252:
   253: IoCompleteRequest(irp, IO_NO_INCREMENT);

    This output is already enough to locate the cause. For reference, WinDbg Help explains error code 0x122 as follows:

The thread waits at DISPATCH_LEVEL and Timeout value is not equal to zero (0). If the Timeout != 0, the callers of KeWaitForSingleObject or KeWaitForMultipleObjects must run at IRQL <= APC_LEVEL.
So a wait function was called at too high an IRQL. This is the offending line:
KeWaitForSingleObject(&workEvt, Executive, KernelMode, FALSE, &waitTime);
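The rule behind error 0x122 can be written as a small predicate. This is a user-mode sketch for illustration only: wait_allowed and irql_name are hypothetical helpers, and the three IRQL values are the ones relevant here (Arg2 of the bugcheck was 2).

```c
/* IRQL values relevant to this bugcheck. */
enum Irql { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Maps the Arg2 value of the bugcheck to a readable name. */
const char *irql_name(int irql)
{
    switch (irql) {
    case PASSIVE_LEVEL:  return "PASSIVE_LEVEL";
    case APC_LEVEL:      return "APC_LEVEL";
    case DISPATCH_LEVEL: return "DISPATCH_LEVEL";
    default:             return "above DISPATCH_LEVEL";
    }
}

/* Returns 1 if a KeWaitForSingleObject-style wait is permitted.
 * has_timeout == 0 models a NULL timeout pointer (wait indefinitely). */
int wait_allowed(int irql, int has_timeout, long long timeout)
{
    if (irql <= APC_LEVEL)
        return 1;                          /* blocking waits are fine here */
    if (irql == DISPATCH_LEVEL)
        return has_timeout && timeout == 0; /* only a zero-timeout poll */
    return 0;                              /* no waits above DISPATCH_LEVEL */
}
```

With the values from the crash, wait_allowed(DISPATCH_LEVEL, 1, -30000000) returns 0. By the same rule, the later wait on completeEvt with a NULL timeout would also be rejected if it were reached at DISPATCH_LEVEL.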

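    My code follows the StartIo implementation in chapter 9 of 《Windows驱动开发详解》, which uses a timed wait so that the driver pauses in StartIo for a while before continuing. At first I did not notice that StartIo is called at IRQL == DISPATCH_LEVEL, where functions that block the thread should not be called. (In fact, without Driver Verifier the driver seemed to run acceptably, or at least it did not bugcheck.) After removing the wait, rebuilding, and reloading the driver, I ran the Driver Verifier deadlock test again and the bugcheck did not recur.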

    The original hang is still unsolved, but stumbling on a hidden bug and fixing it along the way is not bad.

Finally, the related links:

Deadlock Detection

!deadlock
