Kernel Korner - Allocating Memory in the Kernel
Unfortunately for kernel developers, allocating memory in the kernel is not as simple as allocating memory in user space. A number of factors contribute to the complication, among them:
The kernel is limited to about 1GB of virtual and physical memory.
The kernel's memory is not pageable.
The kernel usually wants physically contiguous memory.
Often, the kernel must allocate the memory without sleeping.
Mistakes in the kernel have a much higher price than they do elsewhere.
Although easy access to an abundance of memory certainly is not a luxury to the kernel, a little understanding of the issues can go a long way toward making the process relatively painless.
The general interface for allocating memory inside of the kernel is kmalloc():
#include <linux/slab.h>
void * kmalloc(size_t size, int flags);
It should look familiar (it is pretty much the same as user space's malloc(), after all) except that it takes a second argument, flags. Let's ignore flags for a second and see what we recognize. First off, size is the same here as in malloc(): it specifies the size in bytes of the allocation. Upon successful return, kmalloc() returns a pointer to size bytes of memory. The alignment of the allocated memory is suitable for storage of and access to any type of object. As with malloc(), kmalloc() can fail, and you must check its return value against NULL. Let's look at an example:
struct falcon *p;
p = kmalloc(sizeof (struct falcon), GFP_KERNEL);
if (!p)
    /* the allocation failed - handle appropriately */
The flags field controls the behavior of memory allocation. We can divide flags into three groups: action modifiers, zone modifiers and types. Action modifiers tell the kernel how to allocate memory. They specify, for example, whether the kernel can sleep (that is, whether the call to kmalloc() can block) in order to satisfy the allocation. Zone modifiers, on the other hand, tell the kernel from where the request should be satisfied. For example, some requests may need to be satisfied from memory that hardware can access through direct memory access (DMA). Finally, type flags specify a type of allocation. They group together relevant action and zone modifiers into a single mnemonic. In general, instead of specifying multiple action and zone modifiers, you specify a single type flag.
Table 1 is a listing of the action modifiers, and Table 2 is a listing of the zone modifiers. Many different flags can be used; allocating memory in the kernel is nontrivial. It is possible to control many aspects of memory allocation in the kernel. Your code should use the type flags and not the individual action and zone modifiers. The two most common flags are GFP_ATOMIC and GFP_KERNEL. Nearly all of your kernel memory allocations should specify one of these two flags.
Table 1. Action Modifiers
Table 2. Zone Modifiers
The GFP_ATOMIC flag instructs the memory allocator never to block. Use this flag in situations where your code cannot sleep, where it must remain atomic, such as interrupt handlers, bottom halves and process context code that is holding a lock. Because the kernel cannot block the allocation and try to free up sufficient memory to satisfy the request, an allocation specifying GFP_ATOMIC has a lesser chance of succeeding than one that does not. Nonetheless, if your current context is incapable of sleeping, it is your only choice. Using GFP_ATOMIC is simple:
struct wolf *p;
p = kmalloc(sizeof (struct wolf), GFP_ATOMIC);
if (!p)
    /* error */
Conversely, the GFP_KERNEL flag specifies a normal kernel allocation. Use this flag in code executing in process context without any locks. A call to kmalloc() with this flag can sleep; thus, you must use this flag only when it is safe to do so. The kernel utilizes the ability to sleep in order to free memory, if needed. Therefore, allocations that specify this flag have a greater chance of succeeding. If insufficient memory is available, for example, the kernel can block the requesting code and swap some inactive pages to disk, shrink the in-memory caches, write out buffers and so on.
Sometimes, as when writing an ISA device driver, you need to ensure that the memory allocated is capable of undergoing DMA. For ISA devices, this is memory in the first 16MB of physical memory. To ensure that the kernel allocates from this specific memory, use the GFP_DMA flag. Generally, you would use this flag in conjunction with either GFP_ATOMIC or GFP_KERNEL; you can combine flags with a binary OR operation. For example, to instruct the kernel to allocate DMA-capable memory and to sleep if needed, do:
char *buf;
/* we want DMA-capable memory,
* and we can sleep if needed */
buf = kmalloc(BUF_LEN, GFP_DMA | GFP_KERNEL);
if (!buf)
    /* error */
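To make the 16MB ISA limit concrete, here is a small user-space sketch of the arithmetic involved. The helper isa_dma_reachable() and the constant ISA_DMA_LIMIT are hypothetical names introduced only for illustration; they are not kernel APIs, and in real code you would rely on GFP_DMA rather than checking addresses yourself.

```c
#include <stdbool.h>
#include <stdint.h>

/* The first 16MB of physical memory, as described in the text. */
#define ISA_DMA_LIMIT (16ULL * 1024 * 1024)

/*
 * Hypothetical helper (not a kernel API): returns true if the physical
 * range [phys, phys + len) lies entirely below the 16MB boundary.
 * The comparison is arranged so that phys + len cannot overflow.
 */
bool isa_dma_reachable(uint64_t phys, uint64_t len)
{
    return len <= ISA_DMA_LIMIT && phys <= ISA_DMA_LIMIT - len;
}
```

A buffer that ends exactly at the 16MB mark passes the check, while one that extends even a single byte past it does not.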
Table 3 is a listing of the type flags, and Table 4 shows to which type flag each action and zone modifier equates. The header <linux/gfp.h> defines all of the flags.
Table 3. Types
Table 4. Composition of the Type Flags
GFP_DMA is equivalent to __GFP_DMA.
Returning Memory
When you are finished accessing the memory allocated via kmalloc(), you must return it to the kernel. This job is done using kfree(), which is the counterpart to user space's free() library call. The prototype for kfree() is:
#include <linux/slab.h>
void kfree(const void *objp);
kfree()'s usage is identical to the user-space variant. Assume p is a pointer to a block of memory obtained via kmalloc(). The following statement, then, would free that block and return the memory to the kernel:
kfree(p);
As with free() in user space, calling kfree() on a block of memory that already has been freed or on a pointer that is not an address returned from kmalloc() is a bug, and it can result in memory corruption. Always balance allocations and frees to ensure that kfree() is called exactly once on the correct pointer. Calling kfree() on NULL is checked for explicitly and is safe, although it is not necessarily a sensible idea.
Let's look at the full allocation and freeing cycle:
struct sausage *s;
s = kmalloc(sizeof (struct sausage), GFP_KERNEL);
if (!s)
    return -ENOMEM;
/* ... */
kfree(s);
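When a function needs more than one allocation, balancing every kmalloc() with exactly one kfree() takes a little more care. The following is a minimal sketch of one common way to structure this, unwinding with goto labels. It is written as user-space C so it can be compiled on its own: the kmalloc(), kfree() and GFP_KERNEL definitions here are stand-ins built on malloc() and free() (with a placeholder flag value), and make_hot_dog() and struct bun are hypothetical names that do not appear in the article.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * User-space stand-ins (assumptions for illustration). In kernel code
 * these come from <linux/slab.h>; the flag value is a placeholder.
 */
#define GFP_KERNEL 0x02u
static void *kmalloc(size_t size, unsigned int flags)
{
    (void)flags;            /* the stand-in ignores the flags */
    return malloc(size);
}
static void kfree(const void *objp)
{
    free((void *)objp);
}

struct sausage { int links; };  /* hypothetical payloads */
struct bun { int seeds; };

/*
 * Returns 0 on success or -ENOMEM if either allocation fails.
 * Each successful kmalloc() is released exactly once on every path.
 */
int make_hot_dog(void)
{
    struct sausage *s;
    struct bun *b;
    int ret = -ENOMEM;

    s = kmalloc(sizeof(*s), GFP_KERNEL);
    if (!s)
        goto out;

    b = kmalloc(sizeof(*b), GFP_KERNEL);
    if (!b)
        goto free_sausage;  /* undo only what succeeded so far */

    s->links = 1;
    b->seeds = 100;
    ret = 0;

    kfree(b);
free_sausage:
    kfree(s);
out:
    return ret;
}
```

The labels are ordered so that a failure partway through frees only what was allocated before it, which keeps each pointer from being freed twice or leaked.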
The kmalloc() function returns physically and therefore virtually contiguous memory. This is a contrast to user space's malloc() function, which returns virtually but not necessarily physically contiguous memory. Physically contiguous memory has two primary benefits. First, many hardware devices cannot address virtual memory. Therefore, in order for them to be able to access a block of memory, the block must exist as a physically contiguous chunk of memory. Second, a physically contiguous block of memory can use a single large page mapping. This minimizes the translation lookaside buffer (TLB) overhead of addressing the memory, as only a single TLB entry is required.
Allocating physically contiguous memory has one downside: it is often hard to find physically contiguous blocks of memory, especially for large allocations. Allocating memory that is only virtually contiguous has a much larger chance of success. If you do not need physically contiguous memory, use vmalloc():
#include <linux/vmalloc.h>
void * vmalloc(unsigned long size);
You then return memory obtained with vmalloc() to the system by using vfree():
#include <linux/vmalloc.h>
void vfree(void *addr);
Here again, the usage of vmalloc() and vfree() is identical to that of user space's malloc() and free() functions:
struct black_bear *p;
p = vmalloc(sizeof (struct black_bear));
if (!p)
    /* error */
/* ... */
vfree(p);
In this particular case, vmalloc() might sleep.
Many allocations in the kernel can use vmalloc(), because few allocations need to appear contiguous to hardware devices. If you are allocating memory that only software accesses, such as data associated with a user process, there is no need for the memory to be physically contiguous. Nonetheless, few allocations in the kernel use vmalloc(). Most choose to use kmalloc(), even if it's not needed, partly for historical and partly for performance reasons. Because the TLB overhead for physically contiguous pages is reduced greatly, the performance gains often are well appreciated. Despite this, if you need to allocate tens of megabytes of memory in the kernel, vmalloc() is your best option.
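A little arithmetic shows why large physically contiguous requests are harder to satisfy. The sketch below is user-space C and rests on two assumptions that the article does not state: a 4KB page size, and an underlying page allocator that hands out contiguous blocks in power-of-two page counts. The names PAGE_SIZE_ASSUMED, pages_needed() and order_needed() are hypothetical and used only for illustration.

```c
/* Assumption: 4KB pages. The real page size is architecture-dependent. */
#define PAGE_SIZE_ASSUMED 4096UL

/* Number of pages needed to hold size bytes, rounded up. */
unsigned long pages_needed(unsigned long size)
{
    return (size + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED;
}

/*
 * Smallest order such that a block of 2^order pages holds size bytes.
 * A physically contiguous request needs one free block of this order;
 * a virtually contiguous request needs only pages_needed() pages that
 * may be scattered anywhere in physical memory.
 */
unsigned int order_needed(unsigned long size)
{
    unsigned long pages = pages_needed(size);
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}
```

Under these assumptions a 10MB request covers 2,560 pages. Satisfied contiguously, it would need a single free run of 4,096 adjacent pages (order 12), whereas a virtually contiguous allocation can be assembled from any 2,560 free pages.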
Unlike user-space processes, code executing in the kernel has neither a large nor a dynamically growing stack. Instead, each process in the kernel has a small fixed-size stack. The exact size of the stack is architecture-dependent. Most architectures allocate two pages for the stack, so the stack is 8KB on 32-bit machines.
Because of the small stack, allocations that are large, automatic and on-the-stack are discouraged. Indeed, you never should see anything such as this in kernel code:
#define BUF_LEN 2048
void rabbit_function(void)
{
    char buf[BUF_LEN];
    /* ... */
}
Instead, the following is preferred:
#define BUF_LEN 2048
void rabbit_function(void)
{
    char *buf;

    buf = kmalloc(BUF_LEN, GFP_KERNEL);
    if (!buf)
        return; /* error! */
    /* ... */
    kfree(buf); /* return the buffer once we are done with it */
}
You also seldom see the equivalent of this second, dynamically allocated version in user space, because there is rarely a reason to perform a dynamic memory allocation when you know the allocation size at the time you write the code. In the kernel, however, you should use dynamic memory any time the allocation size is larger than a handful of bytes or so. This helps prevent stack overflow, which ruins everyone's day.
Conclusion
With a little understanding, getting a hold of memory in the kernel is demystified and not too much more difficult to do than it is in user space. A few simple rules of thumb can go a long way:
Decide whether you can sleep (that is, whether the call to kmalloc() can block). If you are in an interrupt handler, in a bottom half, or if you hold a lock, you cannot. If you are in process context and do not hold a lock, you probably can.
If you can sleep, specify GFP_KERNEL.
If you cannot sleep, specify GFP_ATOMIC.
If you need DMA-capable memory (for example, for an ISA or broken PCI device), specify GFP_DMA.
Always check for and handle a NULL return value from kmalloc().
Do not leak memory; make sure you call kfree() somewhere.
Ensure that you do not race and call kfree() multiple times and that you never access a block of memory after you free it.
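The first few rules above can be condensed into a small decision helper. This is only a user-space sketch: choose_gfp() is a hypothetical function, and the flag values defined here are placeholders chosen for illustration, not the real definitions from <linux/gfp.h>.

```c
#include <stdbool.h>

/* Placeholder values for illustration only; not the kernel's values. */
#define GFP_ATOMIC 0x01u
#define GFP_KERNEL 0x02u
#define GFP_DMA    0x04u

/*
 * Hypothetical helper: pick the type flag from whether the caller can
 * sleep, then OR in GFP_DMA if DMA-capable memory is required.
 */
unsigned int choose_gfp(bool can_sleep, bool need_dma)
{
    unsigned int flags = can_sleep ? GFP_KERNEL : GFP_ATOMIC;

    if (need_dma)
        flags |= GFP_DMA;
    return flags;
}
```

For example, an interrupt handler that fills a buffer for an ISA device cannot sleep and needs DMA-capable memory, so it would end up with GFP_ATOMIC | GFP_DMA.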