Hooking into NDIS and TDI

By: andreas

This is the first part in a series of two articles on how to hook into the NDIS and TDI layers. In this first one, we will discuss where and how to hook into the NDIS layer.

In the second, we will do the same for TDI.

First, let's take a quick look at a rather simplified view of the network stack in kernel space:
TDI
NDIS protocol layer
NDIS Intermediate layer
Miniport layer
Hardware

To be able to control data flow in NDIS, we have three potential points where we can either add a driver or hook into an existing one. First, we have the miniport layer. These are the drivers controlling the NIC hardware, which is a bit too low a level for what we want at this time. Next, we have the intermediate layer. This layer would be perfect for our purpose, since it would allow us to control the data flow to all NDIS protocol drivers. But it has a major drawback: on a default install, a driver has to be signed before it can be added to this layer.

Depending on the system and the circumstances under which we are installing this code, it might not be possible to get past this problem easily. Last, we have the protocol layer. Adding a driver to this layer would be easy; software such as WinPcap does that. However, in that case we would not be able to control what a user sees through software relying on these types of drivers, such as Ethereal. So, is there any way we can get around the driver signing issue in the intermediate layer and at the same time control data in the protocol layer and above? Yes! We can virtually add a layer between intermediate and protocol by hooking all NDIS protocol drivers and their protocol functions.

When an NDIS protocol is registered, it is recorded in a linked list. Each element in this linked list carries a pointer to a structure named NDIS_OPEN_BLOCK. This structure holds all function pointers registered for the protocol. The linked list elements are made out of a structure looking something like the following:
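The relationship between the list elements and the open block can be modelled in plain user-mode C. This is only a sketch of the shape described above: every name here (ModelProtocolBlock, ModelOpenBlock, the handler fields) is hypothetical, and the real NDIS layouts are undocumented and differ between OS versions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: none of these names or layouts come from NDIS. */
typedef int (*ModelHandler)(void *context, const void *packet, size_t length);

/* Plays the role of NDIS_OPEN_BLOCK: holds the protocol's function pointers. */
typedef struct ModelOpenBlock {
    void        *bindingContext;
    ModelHandler sendHandler;
    ModelHandler receiveHandler;
} ModelOpenBlock;

/* Plays the role of one element in the registered-protocol list. */
typedef struct ModelProtocolBlock {
    ModelOpenBlock            *openBlock;
    struct ModelProtocolBlock *next;
} ModelProtocolBlock;

/* Model of registration: link a new element in at the head of the list. */
void model_push_protocol(ModelProtocolBlock **head, ModelProtocolBlock *node)
{
    assert(head != NULL && node != NULL);
    node->next = *head;
    *head = node;
}

/* Walk the list and count the elements that actually carry an open block. */
size_t model_count_bound(const ModelProtocolBlock *head)
{
    size_t count = 0;
    for (const ModelProtocolBlock *p = head; p != NULL; p = p->next) {
        if (p->openBlock != NULL) {
            count++;
        }
    }
    return count;
}
```

The point of the model is only that a single pointer to any list element is enough to reach every later element and, through it, that protocol's handlers.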

 



It has been a year or so since I played with this code, so I actually cannot remember the real name of this struct. Interested readers can find it through Google. This is also reflected in the source code later on, since it relies on absolute offsets instead of the typedef'ed struct.

 

To be able to hook into all registered NDIS protocols, we need to find the first element in this linked list. This is actually returned by NdisRegisterProtocol as the NDIS_HANDLE. So, what we have to do is register a bogus NDIS protocol, save the pointer and then remove the protocol. This gives us the ability to walk through the list of registered NDIS protocols and replace existing function pointers with pointers to functions we control.

First, we register the bogus protocol to get the pointer. To make sure the registration does not fail, the protocol we register needs to have a ReceiveHandler:
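The trick can be illustrated with a small user-mode model. It is not kernel code and does not call NdisRegisterProtocol; it assumes, for illustration only, that new registrations are linked in at the head of the list and that the returned handle is simply a pointer to the new element. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a registered protocol; not an NDIS structure. */
typedef struct ModelProtocol {
    const char           *name;
    int                 (*receiveHandler)(void);  /* must be set to register */
    struct ModelProtocol *next;
} ModelProtocol;

static ModelProtocol *g_protocols = NULL;  /* list head, private to the "stack" */

/* Registration fails without a receive handler; new entries go to the head. */
ModelProtocol *model_register(ModelProtocol *p)
{
    if (p == NULL || p->receiveHandler == NULL) {
        return NULL;
    }
    p->next = g_protocols;
    g_protocols = p;
    return p;  /* the "handle" is just a pointer to the list element */
}

/* Unlink a previously registered element from the list. */
void model_deregister(ModelProtocol *handle)
{
    for (ModelProtocol **pp = &g_protocols; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == handle) {
            *pp = handle->next;
            handle->next = NULL;
            return;
        }
    }
}

int dummy_receive(void) { return 0; }

/* Register a throwaway protocol, note where the real list starts, clean up. */
ModelProtocol *model_find_first_protocol(void)
{
    ModelProtocol bogus = { "bogus", dummy_receive, NULL };
    ModelProtocol *handle = model_register(&bogus);
    assert(handle != NULL);               /* we supplied a receive handler */
    ModelProtocol *first = handle->next;  /* first real protocol, if any */
    model_deregister(handle);
    return first;
}
```

In this model the useful pointer is read through the handle before the bogus entry is removed, so nothing keeps referring to the temporary element afterwards.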

 



Once we start walking the linked list and overwriting function pointers, we need to save the old pointers to be able to call them from our functions. There are at least two ways of doing this:

 

1. Create a linked list of "hooked instances", holding the old pointers for each protocol. When our NDIS functions are called, the linked list has to be searched for the right element.

2. Allocate one instance of our functions for each protocol we hook and write the old pointer directly into the code of the function. This is slightly more work during hooking, but should be faster at run time than searching through a linked list for every packet.

When this code was written, I never thought of option number 2, but that is probably the option I would use today. So, enjoy option 1, it works well and I haven't seen any major performance hits from it.

For every element in the NDIS registered protocol linked list, I allocate one element in my own list and save all important pointers together with two context handles. The handle values are later used to find the right element for the current protocol. Relevant pointers are then overwritten to point to my versions of the send and receive functions. We also save a pointer to the NDIS_OPEN_BLOCK itself to make unhooking easy. The code walking the list and hooking into the protocol would look something like this:
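A user-mode sketch of that walk, under stated assumptions: the structures are simplified models rather than the real NDIS layouts, the two context fields stand in for the two context handles mentioned above, and every identifier is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*ModelHandler)(void *context, const void *data, size_t len);

typedef struct ModelOpenBlock {            /* model of NDIS_OPEN_BLOCK */
    void        *protocolContext;
    void        *adapterContext;
    ModelHandler sendHandler;
    ModelHandler receiveHandler;
} ModelOpenBlock;

typedef struct ModelProtocol {             /* model of one list element */
    ModelOpenBlock       *openBlock;
    struct ModelProtocol *next;
} ModelProtocol;

/* One record per hooked protocol: saved pointers plus the two lookup keys. */
typedef struct HookRecord {
    ModelOpenBlock    *openBlock;          /* kept so unhooking is a restore */
    void              *protocolContext;
    void              *adapterContext;
    ModelHandler       oldSend;
    ModelHandler       oldReceive;
    struct HookRecord *next;
} HookRecord;

/* Hook every protocol reachable from `first`; returns the list of records. */
HookRecord *model_hook_all(ModelProtocol *first, ModelHandler mySend,
                           ModelHandler myReceive)
{
    assert(mySend != NULL && myReceive != NULL);
    HookRecord *records = NULL;
    for (ModelProtocol *p = first; p != NULL; p = p->next) {
        ModelOpenBlock *ob = p->openBlock;
        if (ob == NULL) {
            continue;                      /* nothing to hook on this element */
        }
        HookRecord *rec = malloc(sizeof *rec);
        if (rec == NULL) {
            break;                         /* keep what is hooked so far */
        }
        rec->openBlock = ob;
        rec->protocolContext = ob->protocolContext;
        rec->adapterContext = ob->adapterContext;
        rec->oldSend = ob->sendHandler;
        rec->oldReceive = ob->receiveHandler;
        rec->next = records;
        records = rec;
        ob->sendHandler = mySend;          /* from here on, traffic reaches us */
        ob->receiveHandler = myReceive;
    }
    return records;
}

/* Restore every saved pointer and free the records. */
void model_unhook_all(HookRecord *records)
{
    while (records != NULL) {
        HookRecord *next = records->next;
        records->openBlock->sendHandler = records->oldSend;
        records->openBlock->receiveHandler = records->oldReceive;
        free(records);
        records = next;
    }
}

/* Trivial handlers so the sketch can be exercised. */
int demo_original(void *c, const void *d, size_t n) { (void)c; (void)d; (void)n; return 1; }
int demo_hook(void *c, const void *d, size_t n) { (void)c; (void)d; (void)n; return 2; }
```

Keeping the open block pointer in each record is what makes unhooking a straight restore, with no second walk of the protocol list.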

 



There are more functions in the NDIS_OPEN_BLOCK that might be of interest to hook, but if you only want to control network traffic, send and receive are enough. Another thing worth mentioning is that the NDIS_OPEN_BLOCK changes with OS versions. It looks different in Win2K compared to XP, mostly due to member names changing.

 

The next thing to do is to implement send and receive functions which search through the linked list to find the original function pointers and then call them if the traffic is to be passed on. If the traffic is to be altered, that is done before calling the real protocol function. If the traffic is supposed to be dropped, we can just skip calling the real function and return with the appropriate status:
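A minimal user-mode model of such a receive function follows. The lookup key, the status values and the "drop anything starting with DROP" policy are all assumptions made up for the example; a real handler would use the NDIS prototypes and status codes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*ModelHandler)(void *context, const void *data, size_t len);

/* Hypothetical status codes for the model. */
enum { MODEL_STATUS_SUCCESS = 0, MODEL_STATUS_NOT_ACCEPTED = 1 };

typedef struct HookRecord {
    void              *protocolContext;    /* lookup key */
    ModelHandler       oldReceive;         /* the protocol's real handler */
    struct HookRecord *next;
} HookRecord;

static HookRecord *g_hooks = NULL;         /* filled in while hooking */
static int g_originalCalls = 0;            /* demo bookkeeping only */

/* Linear search for the record that belongs to this protocol. */
static HookRecord *find_hook(void *protocolContext)
{
    for (HookRecord *r = g_hooks; r != NULL; r = r->next) {
        if (r->protocolContext == protocolContext) {
            return r;
        }
    }
    return NULL;
}

/* Made-up policy: drop any frame that begins with the bytes "DROP". */
static int should_drop(const void *data, size_t len)
{
    return len >= 4 && memcmp(data, "DROP", 4) == 0;
}

/* Our replacement receive handler. */
int my_receive(void *protocolContext, const void *data, size_t len)
{
    assert(data != NULL || len == 0);
    HookRecord *rec = find_hook(protocolContext);
    if (rec == NULL) {
        return MODEL_STATUS_NOT_ACCEPTED;  /* not a protocol we hooked */
    }
    if (should_drop(data, len)) {
        return MODEL_STATUS_NOT_ACCEPTED;  /* real handler never sees it */
    }
    /* Any altering of the data would happen here, before passing it on. */
    return rec->oldReceive(protocolContext, data, len);
}

/* Stand-in for the hooked protocol's own receive handler. */
int demo_original_receive(void *c, const void *d, size_t n)
{
    (void)c; (void)d; (void)n;
    g_originalCalls++;
    return MODEL_STATUS_SUCCESS;
}
```

The linear search runs once per packet, which is the cost the second option above avoids.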

 


What is left to be done? The code does not hook into NDIS protocols that are registered after NDIS has been hooked. This is left to the reader to figure out; one way to do it can be found in the win32 version of sebek.

 

Does the code work? Sure, I use it in a win32 version of knockd called sesame, which can be found at http://www.toolcrypt.org/.

This is the second and last article on how to hook into the NDIS and TDI layers. The approach we will use is slightly different from the NDIS case. However, a neat side effect is that this method can be used to hook into any device chain, for example the keyboard to sniff keystrokes. It all boils down to getting a pointer to the driver object and replacing all major functions with our own dispatch function.

To be able to fully control the TDI layer, we need access to the IRP both before and after the original driver has processed it. If we have that, we can choose what the original driver should process and we can also alter results before they are returned to user space. The "before filtering" can be accomplished in our own, new dispatch function and the "after filtering" can be accomplished in a completion routine.

First, to be able to overwrite and insert our own dispatch function, we need a pointer to the driver object we are going to hook into. An easy way to get this pointer is to call ObReferenceObjectByName with the appropriate driver name. Then we only have to save all old function pointers and overwrite the existing ones with our own. The code to do this would look something like the following:
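The save-and-overwrite step can be sketched in user-mode C. ModelDriver is a simplified, hypothetical stand-in for DRIVER_OBJECT that keeps only the major function table, and the lookup via ObReferenceObjectByName is left out entirely; the table size of 28 mirrors IRP_MJ_MAXIMUM_FUNCTION + 1.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAJOR_FUNCTIONS 28   /* mirrors IRP_MJ_MAXIMUM_FUNCTION + 1 */

typedef struct ModelIrp ModelIrp;
typedef struct ModelDriver ModelDriver;
typedef int (*ModelDispatch)(ModelDriver *driver, ModelIrp *irp);

/* In the real kernel the major function code lives in the IRP stack location. */
struct ModelIrp { unsigned majorFunction; };
struct ModelDriver { ModelDispatch majorFunction[MODEL_MAJOR_FUNCTIONS]; };

static ModelDriver  g_savedDriver;         /* role of RealTDIDriverObject */
static ModelDriver *g_hookedDriver = NULL;
static int          g_seen = 0;            /* demo bookkeeping only */

/* Save the whole table, then point every slot at our dispatch function. */
int model_hook_driver(ModelDriver *target, ModelDispatch myDispatch)
{
    assert(target != NULL && myDispatch != NULL);
    if (g_hookedDriver != NULL) {
        return -1;                         /* already hooked */
    }
    g_savedDriver = *target;
    g_hookedDriver = target;
    for (size_t i = 0; i < MODEL_MAJOR_FUNCTIONS; i++) {
        target->majorFunction[i] = myDispatch;
    }
    return 0;
}

/* Put the original table back. */
void model_unhook_driver(void)
{
    if (g_hookedDriver != NULL) {
        *g_hookedDriver = g_savedDriver;
        g_hookedDriver = NULL;
    }
}

/* Forward an IRP to whatever the driver originally had in that slot. */
int model_call_original(ModelDriver *driver, ModelIrp *irp)
{
    assert(irp != NULL && irp->majorFunction < MODEL_MAJOR_FUNCTIONS);
    ModelDispatch original = g_savedDriver.majorFunction[irp->majorFunction];
    return original != NULL ? original(driver, irp) : -1;
}

/* Our dispatch function: "before" filtering would go here, then pass on. */
int my_dispatch(ModelDriver *driver, ModelIrp *irp)
{
    g_seen++;
    return model_call_original(driver, irp);
}

/* Stand-in for the hooked driver's own dispatch routine. */
int demo_original_dispatch(ModelDriver *driver, ModelIrp *irp)
{
    (void)driver;
    return 100 + (int)irp->majorFunction;
}
```

Because the saved copy holds the complete table, the same structure serves both for forwarding and for unhooking.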



RealTDIDriverObject is a DRIVER_OBJECT where we save the original information, both to be able to call the old functions and to be able to unhook once we are done. The original driver gets all its major functions overwritten with a pointer to our own dispatch function, TDIDeviceDispatch. We now have control over the IRPs before TDI can process them. But we still have to make sure we can also control them once TDI is done with them but before they are returned to the I/O manager and user space. We will solve this in our dispatch function with the help of a completion routine. It is not as straightforward as it sounds: since we might be hooking the last entity in the chain, we can't just insert a completion routine with IoSetCompletionRoutine (see the DDK docs), because in that case it would never be called. Completion routines are set in the next IRP stack location, not the current one. If we are the last entity, there will be no next stack location in the IRP. Searching through the header files reveals IoSetCompletionRoutine to be a macro which only gets the next IRP stack location and sets the CompletionRoutine pointer together with the Control element. Following the same principle, we can set our own completion routine to regain control over the IRP with the following dispatch function:
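The core of that idea, planting a completion routine in the current stack location rather than the next one, can be modelled in user-mode C. ModelIrp and ModelStackLocation are hypothetical simplifications of the kernel structures, and the flag constants are model values standing in for the DDK's SL_INVOKE_ON_* flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model flags standing in for SL_INVOKE_ON_CANCEL / _SUCCESS / _ERROR. */
enum { MODEL_ON_CANCEL = 0x20, MODEL_ON_SUCCESS = 0x40, MODEL_ON_ERROR = 0x80 };

typedef struct ModelIrp ModelIrp;
typedef int (*ModelCompletion)(ModelIrp *irp, void *context);

typedef struct ModelStackLocation {
    ModelCompletion completionRoutine;
    void           *context;
    unsigned char   control;
} ModelStackLocation;

struct ModelIrp {
    ModelStackLocation stack[4];
    int                current;            /* index of our own stack location */
};

/* The model stores a routine pointer in a void *, so the sizes must match. */
_Static_assert(sizeof(ModelCompletion) == sizeof(void *),
               "routine pointer must fit in the context slot");

static void *routine_to_context(ModelCompletion routine)
{
    void *p = NULL;
    memcpy(&p, &routine, sizeof p);
    return p;
}

/* Placeholder for our completion routine. */
int my_completion(ModelIrp *irp, void *context)
{
    (void)irp; (void)context;
    return 0;
}

/* Plant my_completion in the CURRENT location, as if the layer above set it. */
void model_plant_completion(ModelIrp *irp)
{
    assert(irp != NULL && irp->current >= 0 && irp->current < 4);
    ModelStackLocation *sl = &irp->stack[irp->current];
    /* Remember a routine that may already be there; the old Context is lost. */
    sl->context = routine_to_context(sl->completionRoutine);
    sl->completionRoutine = my_completion;
    /* Invoke in all cases; the previous Control value is lost as well. */
    sl->control = MODEL_ON_SUCCESS | MODEL_ON_ERROR | MODEL_ON_CANCEL;
}
```

A real dispatch function would call the saved original major function after this step; the model leaves that out to keep the focus on the stack location.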


What we are actually doing is faking a scenario where the layer above set the completion routine for us. We also save a potentially already existing completion routine in the Context element of the IRP. Control is set to invoke the completion routine in all cases. There are two potential issues with this code. First, we overwrite whatever is in the Context element. Second, we never save the Control element, so we don't know when to invoke an already existing completion routine. So far, I have not seen any side effects from doing this.

The completion routine would look something like:
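A user-mode model of such a completion routine is sketched below. ModelIrp is a hypothetical simplification, the "clamp the result to 1024 bytes" rule is an invented example of after-filtering, and the previous routine is recovered from the context pointer exactly as it was stored there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ModelIrp ModelIrp;
typedef int (*ModelCompletion)(ModelIrp *irp, void *context);

/* Hypothetical, trimmed-down IRP: a status and a result length. */
struct ModelIrp {
    int           status;                  /* 0 means success in this model */
    unsigned long information;
};

_Static_assert(sizeof(ModelCompletion) == sizeof(void *),
               "routine pointer must fit in the context slot");

static int g_previousCalls = 0;            /* demo bookkeeping only */

/* Recover the routine pointer that was stashed in the context slot. */
static ModelCompletion context_to_routine(void *context)
{
    ModelCompletion routine = NULL;
    memcpy(&routine, &context, sizeof routine);
    return routine;
}

/* Runs after the original driver is done with the IRP. */
int my_completion(ModelIrp *irp, void *context)
{
    assert(irp != NULL);
    /* "After" filtering; invented rule: never report more than 1024 bytes. */
    if (irp->status == 0 && irp->information > 1024) {
        irp->information = 1024;
    }
    /* Chain to a routine that was already set before we hooked in. Its own
       context was overwritten, so it gets NULL, and it is called regardless
       of what its Control flags asked for. */
    ModelCompletion previous = context_to_routine(context);
    if (previous != NULL) {
        return previous(irp, NULL);
    }
    return 0;
}

/* Stand-in for a completion routine set by the layer above. */
int demo_previous(ModelIrp *irp, void *context)
{
    (void)irp; (void)context;
    g_previousCalls++;
    return 7;
}
```

Both known weaknesses are visible here: the chained routine loses its context and is invoked unconditionally.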


There is another way to accomplish the same result which uses a more officially supported mode of operation. It is based upon attaching to the device chain with GetDeviceObject and AttachToDevice, which will allow us to process all IRPs before the real device. Once in the dispatch function, we construct a new IRP and add a completion routine to regain control of the IRP before it is returned to the I/O system and user space.

One last important thing to mention: this code is quite untested. It seems to work as intended, but it has never been used in any major applications, so use it at your own risk. With that said, I hope you have enjoyed this little article series.

 

 
