Introduction To Reentrancy


source: http://en.wikipedia.org/wiki/Reentrancy_(computing)

In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete executing. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the reentered invocation completes, the previous invocations will resume correct execution.

This definition originates from single-threaded programming environments where the flow of control could be interrupted by a hardware interrupt and transferred to an interrupt service routine (ISR). Any subroutine used by the ISR that could potentially have been executing when the interrupt was triggered should be reentrant. Often, subroutines accessible via the operating system kernel are in fact not reentrant. Hence, interrupt service routines are limited in the actions they can perform and are usually restricted from accessing the file system or even from allocating memory.

A subroutine that is directly or indirectly recursive should be reentrant. This policy is partially enforced by structured programming languages. However, a subroutine can fail to be reentrant if it relies on a global variable remaining unchanged while that variable is modified when the subroutine is recursively invoked.

The definition of reentrancy originated in single-threaded environments and differs from that of thread-safety in multi-threaded environments. A reentrant subroutine can achieve thread-safety,^[1] but this condition alone might not be sufficient in all situations. Conversely, thread-safe code does not necessarily have to be reentrant (see below for examples).


Example

This is an example of a swap() function which fails to be reentrant (as well as thread-safe). As such, it should not have been used in the interrupt service routine isr():

int t;

void swap(int *x, int *y)
{
        t = *x;
        *x = *y;
        // hardware interrupt might invoke isr() here!
        *y = t;
}

void isr()
{
        int x = 1, y = 2;
        swap(&x, &y);
}

swap() could be made thread-safe by making t thread-local. It still fails to be reentrant, and this will continue to cause problems if isr() is called in the same context as a thread already executing swap().

The following, somewhat contrived, modification of the swap function is careful to leave the global data in a consistent state at the time it exits. It is perfectly reentrant, but not thread-safe, because it does not ensure the global data is in a consistent state during execution:

int t;

void swap(int *x, int *y)
{
        int s;

        s = t;  // save global variable
        t = *x;
        *x = *y;
        // hardware interrupt might invoke isr() here!
        *y = t;
        t = s;  // restore global variable
}

void isr()
{
        int x = 1, y = 2;
        swap(&x, &y);
}
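For comparison, here is a minimal sketch, not part of the original article, of a swap() that is both reentrant and thread-safe. It assumes the temporary is the only state the function needs, so it keeps that temporary in an automatic variable rather than in the global t:

```c
/* Sketch: the temporary lives on the stack, one copy per invocation. */
void swap(int *x, int *y)
{
    int t = *x;   /* automatic variable, private to this call */
    *x = *y;
    /* a hardware interrupt may invoke isr() here without harm */
    *y = t;
}

void isr(void)
{
    int x = 1, y = 2;
    swap(&x, &y);
}
```

Because no invocation touches data owned by another, an interrupted call resumes with its own t intact. The caller must still ensure that the objects x and y point to are not shared with the interrupting code.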

Derivation and explanation of rules

Reentrancy is not the same thing as idempotence (meaning that the function may be called more than once, yet generate exactly the same output as if it had only been called once). Generally speaking, a function produces output data based on some input data (though both are optional, in general). Shared data could be accessed by any code at any time. If data can be changed by anyone, and nobody keeps track of those changes, then those who share a datum have no guarantee that it still holds the value it held at any earlier time. The two properties are independent: an idempotent function that keeps intermediate results in a global variable is not reentrant, and a reentrant function need not be idempotent.

Data are of global (outside the scope of any function and with an indefinite extent) or local (created each time a function is called and destroyed upon exit) scope.

Local data are not shared by any routines, re-entering or not; therefore they do not affect reentrancy. Global data are shared either by any function, in which case they are called global variables, or by all invocations of the same function, in which case they are called static variables; therefore they can affect it.

Must hold no static (or global) non-constant data.

Reentrant functions can use global data to work with. For example, a reentrant interrupt service routine could grab a piece of hardware status to work with (e.g. a serial port read buffer) which is not only global, but volatile. Still, typical use of static variables and global data is not advised, in the sense that only atomic read-modify-write instructions should be used on these variables.

Must not modify its own code.

The operating system might allow a process to modify its code. There are various reasons for this (blitting graphics quickly, ignorance of OS programmers), but the fact is that the code might not be the same next time. A routine may modify itself if it resides in its own unique memory. That is, if each new invocation uses a different physical machine code location where a copy of the original code is made, it will not affect other invocations even if it then modifies itself during that particular invocation.

Must not call non-reentrant computer programs or routines.

Multiple levels of 'user/object/process priority' and/or multiprocessing usually complicate the control of reentrant code. It is important to keep track of any accesses and/or side effects that occur inside a routine designed to be reentrant.

Reentrancy is a key feature of functional programming.

Any recursive subroutines need to be reentrant.

Also, subroutines that are directly or indirectly called from an interrupt handler must be reentrant if an interrupt may need to be serviced before the previous one has been fully served.

Reentrant interrupt handler

A "reentrant interrupt handler" is an interrupt handler that re-enables interrupts early in the interrupt handler. This may reduce interrupt latency.^[2] In general, while programming interrupt service routines, it is recommended to re-enable interrupts as soon as possible in the interrupt handler. This helps to avoid losing interrupts.^[3]

Examples

In the following piece of C code, neither function f nor function g is reentrant.

int g_var = 1;

int f()
{
        g_var = g_var + 2;
        return g_var;
}

int g()
{
        return f() + 2;
}

In the above, f depends on a non-constant global variable g_var; thus, if two threads execute it and access g_var concurrently, then the result varies depending on the timing of the execution. Hence, f is not reentrant. Neither is g; it calls f, which is not reentrant.

These slightly altered versions are reentrant:

int f(int i)
{
        return i + 2;
}

int g(int i)
{
        return f(i) + 2;
}

In the following piece of C code, the function is thread-safe, but not reentrant:

int function()
{
        mutex_lock();
        /* ... function body ... */
        mutex_unlock();
}

In the above, function can be called by different threads without any problem. But if the function is used in a reentrant interrupt handler and a second interrupt arises inside the function, the second routine will hang forever. As interrupt servicing can disable other interrupts, the whole system could suffer.
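The hang described above can be sketched without a real interrupt. The following is an illustrative simulation only: the lock is a C11 atomic_flag standing in for the mutex, try_lock() and unlock() are hypothetical helper names, and the reenter parameter is an assumption added so the function can call itself the way an interrupting handler would. The lock attempt reports failure instead of blocking so that the sketch terminates.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex: a set flag means "held". */
static atomic_flag lock = ATOMIC_FLAG_INIT;

/* Returns false when the lock is already held. A real mutex_lock()
 * would block at this point instead of returning. */
static bool try_lock(void) { return !atomic_flag_test_and_set(&lock); }
static void unlock(void)   { atomic_flag_clear(&lock); }

/* Thread-safe, but not reentrant: a second entry while the lock is
 * held cannot make progress. */
bool function(int reenter)
{
    if (!try_lock())
        return false;            /* a blocking lock would hang here */

    bool inner_ok = true;
    if (reenter)
        inner_ok = function(0);  /* simulates an interrupt re-entering */

    unlock();
    return inner_ok;
}
```

A plain call succeeds, while a call that is re-entered before it releases the lock reports failure from the inner invocation. With a blocking mutex that inner invocation would never return, and neither would the outer one that holds the lock.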

Relation to thread safety

This concept is distinct from, but closely related to, thread safety. A function can be thread-safe and still not be reentrant. For example, a function could be wrapped entirely in a mutex (which avoids problems in multithreading environments), but if that function is used in a reentrant interrupt service routine, it could wait forever for the first execution to release the mutex. The key to avoiding confusion is that reentrancy refers to only one thread executing. It is a concept from the time when no multitasking operating systems existed.

Notes

^[1] Kerrisk 2010, p. 657.
^[2] "ARM System Developer's Guide" by Andrew N. Sloss, Dominic Symes, Chris Wright, John Rayfield, 2004, page 342.
^[3] "Safe and structured use of interrupts in real-time and embedded software" by John Regehr, 2006.

References

Kerrisk, Michael (2010). The Linux Programming Interface. No Starch Press.

External links

Article "Use reentrant functions for safer signal handling" by Dipak K Jha
"Writing Reentrant and Thread-Safe Code," from AIX Version 4.3 General Programming Concepts: Writing and Debugging Programs, 2nd edition, 1999.
Jack Ganssle (2001). "Introduction to Reentrancy". EE Times.
Raymond Chen (2004). The difference between thread-safety and re-entrancy. The Old New Thing.

Source: http://www.eetimes.com/discussion/beginner-s-corner/4023308/Introduction-to-Reentrancy

Introduction to Reentrancy

Jack Ganssle

3/15/2001 1:03 PM EST

Virtually every embedded system uses interrupts; many support multitasking or multithreaded operations. These sorts of applications can expect the program's control flow to change contexts at just about any time. When that interrupt comes, the current operation gets put on hold and another function or task starts running. What happens if functions and tasks share variables? Disaster surely looms if one routine corrupts another's data.

By carefully controlling how data is shared, we create reentrant functions, those that allow multiple concurrent invocations that do not interfere with each other. The word pure is sometimes used interchangeably with reentrant.

Like so many embedded concepts, reentrancy came from the mainframe era, in the days when memory was a valuable commodity. In those days compilers and other programs were often written to be reentrant, so a single copy of the tool lived in memory, yet was shared by perhaps a hundred users. Each person had his or her own data area, yet everyone running the compiler quite literally executed the identical code. As the operating system changed contexts from user to user it swapped data areas so one person's work didn't affect any other. Share the code, but not the data.

In the embedded world a routine must satisfy the following conditions to be reentrant:

1. It uses all shared variables in an atomic way, unless each is allocated to a specific instance of the function.
2. It does not call non-reentrant functions.
3. It does not use the hardware in a non-atomic way.

Quite a mouthful! Let's look at each of these in more detail.

Atomic variables

Both the first and last rules use the word atomic, which comes from the Greek word meaning indivisible. In the computer world, atomic means an operation that cannot be interrupted. Consider the assembly language instruction:

mov ax,bx

Since nothing short of a reset can stop or interrupt this instruction, it's atomic. It will start and complete without any interference from other tasks or interrupts.

The first part of Rule 1 requires the atomic use of shared variables. Suppose two functions each share the global variable foobar. Function A contains:

temp = foobar;
temp += 1;
foobar = temp;

This code is not reentrant, because foobar is used non-atomically. That is, it takes three statements to change its value, not one. The foobar handling is not indivisible; an interrupt can come between these statements and switch context to the other function, which then may also try and change foobar. Clearly there's a conflict; foobar will wind up with an incorrect value, the autopilot will crash, and hundreds of screaming people will wonder "why didn't they teach those developers about reentrancy?"
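The conflict can be reproduced deterministically in a small simulation. This is a sketch under stated assumptions, not code from the article: function_a, function_b, run_lost_update and the simulated_interrupt hook are hypothetical names, and the hook merely stands in for a context switch that happens between the statements.

```c
#include <stddef.h>

/* Shared global from the text. */
int foobar = 0;

/* Hypothetical hook standing in for an interrupt that fires mid-update. */
static void (*simulated_interrupt)(void) = NULL;

/* Function B: also modifies foobar. */
void function_b(void)
{
    foobar += 1;
}

/* Function A: the three-statement update from the text. */
void function_a(void)
{
    int temp = foobar;
    if (simulated_interrupt)
        simulated_interrupt();   /* context switch between the statements */
    temp += 1;
    foobar = temp;               /* overwrites whatever B stored */
}

/* Runs A with B interrupting it and returns the final value of foobar. */
int run_lost_update(void)
{
    foobar = 0;
    simulated_interrupt = function_b;
    function_a();
    simulated_interrupt = NULL;
    return foobar;
}
```

Two increments take place, yet run_lost_update() returns 1: Function A writes back a value computed from the stale copy it read before Function B ran, so B's update is lost.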

Suppose, instead, Function A looks like:

foobar += 1;

Now the operation is atomic, right? An interrupt cannot suspend processing with foobar in a partially changed state, so the routine is reentrant.

Except... do you really know what your C compiler generates? On an x86 processor that statement might compile to:

mov ax,[foobar]
inc ax
mov [foobar],ax

which is clearly not atomic, and so not reentrant. The atomic version, a single instruction that an interrupt cannot split, is:

inc [foobar]

The moral is to be wary of the compiler. Assume it generates atomic code and you may find "60 Minutes" knocking at your door.
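One way to avoid guessing what the compiler emits is to ask for atomicity explicitly. The sketch below is an addition that assumes a C11 toolchain providing <stdatomic.h>, which postdates the article; increment_foobar is a hypothetical name. How the operation is implemented on a given target (a single instruction, a locked sequence, or a library call) is still up to the compiler, so it is worth checking the generated code or atomic_is_lock_free() on the actual platform.

```c
#include <stdatomic.h>

/* Shared counter declared atomic so that each update is indivisible. */
static atomic_int foobar = 0;

/* Increments foobar as one atomic read-modify-write and returns the
 * new value. */
int increment_foobar(void)
{
    /* atomic_fetch_add returns the value held before the addition */
    return atomic_fetch_add(&foobar, 1) + 1;
}
```

The return value is derived from the result of the atomic operation itself rather than from a second read of foobar, so it stays correct even if another context increments the counter immediately afterwards.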

The second part of the first reentrancy rule reads "...unless each is allocated to a specific instance of the function." This is an exception to the atomic rule that skirts the issue of shared variables.

An "instance" is a path through the code. There's no reason a single function can't be called from many other places. In a multitasking environment, it's quite possible that several copies of the function may indeed be executing concurrently. (Suppose the routine is a driver that retrieves data from a queue; many different parts of the code may want queued data more or less simultaneously.) Each execution path is an "instance" of the code. Consider:

int foo;

void some_function(void) {
    foo++;
}

foo is a global variable whose scope exists outside that of the function. Even if no other routine uses foo, some_function can trash the variable if more than one instance of it runs at any time.

C and C++ can save us from this peril. Use automatic variables. That is, declare foo inside of the function. Then, each instance of the routine will use a new version of foo created from the stack, as follows:

void some_function(void) {
    int foo = 0;  /* automatic: each instance gets its own copy; initialized before use */
    foo++;
}

Another option is to dynamically allocate memory (using malloc), again so each incarnation uses a unique data area. The fundamental reentrancy problem is thus avoided, as it's impossible for multiple instances to modify a common version of the variable.
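A minimal sketch of the dynamic-allocation option, using a hypothetical reverse_copy() helper that is not from the article. It assumes the platform's malloc is itself safe to call from every context in which the function runs, which Rule 2 requires and which often does not hold inside interrupt handlers.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated reversed copy of s, or NULL if allocation
 * fails. Each invocation works in its own heap block, so concurrent
 * instances never share a buffer. The caller must free() the result. */
char *reverse_copy(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        out[i] = s[n - 1 - i];
    out[n] = '\0';
    return out;
}
```

Had the result been built in a static buffer instead, a second instance running before the first one finished would overwrite the first instance's output.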

Two more rules

The other rules are very simple. Rule 2 tells us a calling function inherits the reentrancy problems of the callee. That makes sense. If other code inside the function trashes shared variables, the system is going to crash. Using a compiled language, though, there's an insidious problem. Are you sure, really sure, that all of the runtime library functions are reentrant? Obviously, string operations and a lot of other complicated things make library calls to do the real work. An awful lot of compilers also generate runtime calls to do, for instance, long math, or even integer multiplications and divisions.

If a function must be reentrant, talk to the compiler vendor to ensure that the entire runtime package is pure. If you buy software packages (like a protocol stack) that may be called from several places, take similar precautions to ensure the purchased routines are also reentrant.

Rule 3 is a uniquely embedded caveat. Hardware looks a lot like a variable; if it takes more than a single I/O operation to handle a device, reentrancy problems can develop.

Consider Zilog's SCC serial controller. Accessing any of the device's internal registers requires two steps: first write the register's address to a port, then read or write the register from the same port, the same I/O address. If an interrupt fires between setting the port and accessing the register, another function might take over and access the device. When control returns to the first function, the register address you set will be incorrect.
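The two-step hazard can be illustrated with a simulated device. This sketch is deliberately simplified and is not the real SCC programming model: regs, selected, select_reg, write_selected and the simulated_interrupt hook are all hypothetical names, and the two steps are modeled as two helper functions rather than two accesses to one port.

```c
#include <stdint.h>
#include <stddef.h>

static uint8_t regs[16];   /* simulated internal registers */
static uint8_t selected;   /* simulated register pointer */

static void select_reg(uint8_t reg)       { selected = reg & 0x0F; }   /* step 1 */
static void write_selected(uint8_t value) { regs[selected] = value; }  /* step 2 */

/* Hypothetical hook standing in for an interrupt between the steps. */
static void (*simulated_interrupt)(void) = NULL;

void write_reg(uint8_t reg, uint8_t value)
{
    select_reg(reg);
    if (simulated_interrupt)
        simulated_interrupt();   /* interrupt fires between the two steps */
    write_selected(value);       /* lands in whichever register is selected now */
}

/* Interrupting code that uses the same device. */
static void isr_touches_reg5(void)
{
    select_reg(5);
    write_selected(0xAA);
}

/* Returns the index of the register that actually received the value
 * intended for register 3, or -1 if neither candidate did. */
int run_hazard(void)
{
    for (size_t i = 0; i < 16; i++)
        regs[i] = 0;
    simulated_interrupt = isr_touches_reg5;
    write_reg(3, 0x11);
    simulated_interrupt = NULL;
    return regs[3] == 0x11 ? 3 : (regs[5] == 0x11 ? 5 : -1);
}
```

In this simulation the value meant for register 3 ends up in register 5, because the interrupting code changed the selection before the first function performed its second step.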

Keeping code reentrant

What are our best options for eliminating non-reentrant code? The first rule of thumb is to avoid shared variables. Globals are the source of endless debugging woes and failed code. Use automatic variables or dynamically allocated memory.

Yet globals are also the fastest way to pass data around. It's not always possible to entirely eliminate them from real-time systems. So, when using a shared resource (variable or hardware) we must take a different sort of action.

The most common approach is to disable interrupts during non-reentrant code. With interrupts off, the system suddenly becomes a single-process environment. There will be no context switches. Disable interrupts, do the non-reentrant work, and then turn interrupts back on.
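The disable, work, re-enable pattern might look like the following sketch. The interrupt state here is only a simulated flag, and irq_save_and_disable() and irq_restore() are hypothetical names; on real hardware they would wrap whatever mechanism the target processor or compiler provides.

```c
#include <stdbool.h>

/* Simulated interrupt-enable flag (an assumption for illustration). */
static bool interrupts_enabled = true;

/* Disables interrupts and reports whether they were enabled before. */
static bool irq_save_and_disable(void)
{
    bool was_enabled = interrupts_enabled;
    interrupts_enabled = false;
    return was_enabled;
}

/* Puts the interrupt state back to what it was on entry. */
static void irq_restore(bool was_enabled)
{
    interrupts_enabled = was_enabled;
}

static int foobar = 0;

/* Non-atomic update made safe by running it with interrupts off. */
int increment_foobar(void)
{
    bool state = irq_save_and_disable();
    int temp = foobar;      /* no context switch can occur in here */
    temp += 1;
    foobar = temp;
    irq_restore(state);     /* restore rather than blindly re-enable */
    return temp;
}
```

Restoring the saved state, instead of unconditionally turning interrupts back on, keeps the routine usable from code that already runs with interrupts disabled: such a caller still has them disabled when the routine returns.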

Shutting interrupts down does increase system latency, reducing its ability to respond to external events in a timely manner. A kinder, gentler approach is to use a mutex (also known as binary semaphore) to indicate when a resource is busy. Mutexes are simple on-off state indicators whose processing is inherently atomic. These are often used as "in-use" flags to have tasks idle when a shared resource is not available.

Nearly every commercial real-time operating system includes mutexes. If this is your way of achieving reentrant code, by all means use an RTOS.

Recursion

No discussion of reentrancy is complete without mentioning recursion, if only because there's so much confusion between the two.

A function is recursive if it calls itself. That's a classic way to remove iteration from many sorts of algorithms. Given enough stack space, this is a perfectly valid, though tough to debug, way to write code. Since a recursive function calls itself, clearly it must be reentrant to avoid trashing its variables. So all recursive functions must be reentrant, but not all reentrant functions are recursive.

Jack G. Ganssle is a lecturer and consultant on embedded development issues, and a regular contributor to Embedded Systems Programming. Contact him at jack@ganssle.com.
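A small sketch of how a recursive function trashes its own variables when it is not reentrant. Both functions are hypothetical examples added for illustration; each is meant to sum the decimal digits of n.

```c
/* Broken: the digit is kept in a static variable shared by every level
 * of the recursion, so deeper calls overwrite it before it is used. */
int digit_sum_static(unsigned n)
{
    static unsigned digit;
    if (n == 0)
        return 0;
    digit = n % 10;
    int rest = digit_sum_static(n / 10);   /* inner calls overwrite digit */
    return rest + (int)digit;
}

/* Reentrant: each level keeps its own digit on the stack. */
int digit_sum_auto(unsigned n)
{
    if (n == 0)
        return 0;
    unsigned digit = n % 10;
    return digit_sum_auto(n / 10) + (int)digit;
}
```

For the input 123, digit_sum_auto() returns 6, while digit_sum_static() returns 3: by the time each level adds its digit, the static variable holds the leading digit 1 left there by the deepest call.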