Copy-on-write

Source: Internet  Editor: 程序博客网  Time: 2024/05/01 16:29

http://en.wikipedia.org/wiki/Copy-on-write

From Wikipedia, the free encyclopedia

Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, they can all be given pointers to the same resource. This sharing can be maintained until a caller tries to modify its "copy" of the resource, at which point a true private copy is created to prevent the changes from becoming visible to everyone else. All of this happens transparently to the callers. The primary advantage is that if a caller never makes any modifications, no private copy need ever be created.
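The idea can be sketched in a few lines of Python. The CowList class and its method names below are hypothetical, chosen only for illustration; the sketch is single-threaded and keeps a plain handle count next to the shared storage.

```python
class CowList:
    """A list handle that shares its storage until someone writes (illustrative)."""

    def __init__(self, items):
        # _box holds [shared_list, number_of_handles_using_it].
        self._box = [list(items), 1]

    def clone(self):
        """Hand out another handle to the same storage; nothing is copied."""
        other = CowList.__new__(CowList)
        other._box = self._box
        self._box[1] += 1
        return other

    def __getitem__(self, index):
        return self._box[0][index]

    def __setitem__(self, index, value):
        if self._box[1] > 1:
            # Storage is shared: detach with a private copy before writing.
            self._box[1] -= 1
            self._box = [list(self._box[0]), 1]
        self._box[0][index] = value

    def shares_storage_with(self, other):
        return self._box is other._box


a = CowList([1, 2, 3])
b = a.clone()          # cheap: both handles point at the same list
b[0] = 99              # first write triggers the private copy
```

After the write, a still reads 1 at index 0 while b reads 99, and a handle that is never written to never pays for a copy.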

Copy-on-write in virtual memory

Copy-on-write finds its main use in virtual memory operating systems. When a process creates a copy of itself (for example with fork on Unix-like systems), the pages in memory that might be modified by either the process or its copy are marked copy-on-write. When one process modifies the memory, the operating system's kernel intercepts the operation and copies the affected page so that changes in one process's memory are not visible to the other.
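The visible effect can be observed from user space. The following is a minimal sketch for Unix-like systems only, using os.fork; the function name fork_demo is an assumption made for this example. The page copying itself is invisible to the program, so the sketch only shows the resulting isolation between parent and child.

```python
import os


def fork_demo():
    """Return (value seen by parent, value seen by child) after the child writes."""
    data = bytearray(b"original")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's own copy of the memory.
        os.close(read_fd)
        data[:] = b"modified"
        os.write(write_fd, bytes(data))
        os._exit(0)
    # Parent: read what the child reports, then reap the child.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

The parent's buffer is unchanged after the child has overwritten its own, even though no explicit copy was made when the process was duplicated.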

Another use is in the calloc function. This can be implemented by having a page of physical memory filled with zeroes. When the memory is allocated, the pages returned all refer to the page of zeroes and are all marked as copy-on-write. This way, the amount of physical memory allocated for the process does not increase until data is written. This is typically only done for larger allocations.

Copy-on-write can be implemented by telling the MMU that certain pages in the process's address space are read-only. When data is written to these pages, the MMU raises an exception which is handled by the kernel, which allocates new space in physical memory and makes the page being written to correspond to that new location in physical memory.
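The mechanism described in the last two paragraphs can be modelled in user space. The sketch below is a toy simulation, not kernel code: ToyAddressSpace, its page table layout and the fault counter are all assumptions made for illustration. Every virtual page starts out mapped read-only to one shared page of zeroes, and a write to a read-only page goes through a handler that copies the frame and remaps the page.

```python
PAGE_SIZE = 4096


class ToyAddressSpace:
    """Toy model: virtual pages map to frames; read-only mappings fault on write."""

    ZERO_FRAME = bytes(PAGE_SIZE)  # one shared, immutable page of zeroes

    def __init__(self, num_pages):
        # Each entry is (frame, writable). All pages start on the zero frame.
        self.table = [(self.ZERO_FRAME, False) for _ in range(num_pages)]
        self.faults = 0

    def read(self, addr):
        frame, _ = self.table[addr // PAGE_SIZE]
        return frame[addr % PAGE_SIZE]

    def write(self, addr, value):
        page = addr // PAGE_SIZE
        frame, writable = self.table[page]
        if not writable:
            # Stands in for the MMU exception on a write to a read-only page.
            frame = self._handle_write_fault(page)
        frame[addr % PAGE_SIZE] = value

    def _handle_write_fault(self, page):
        self.faults += 1
        old_frame, _ = self.table[page]
        private = bytearray(old_frame)       # allocate a new frame and copy
        self.table[page] = (private, True)   # remap the page as writable
        return private

    def private_pages(self):
        return sum(1 for _, writable in self.table if writable)
```

In this model an address space of 1024 pages holds no private frames until the first write, and repeated writes to the same page fault only once.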

One major advantage of COW is the ability to use memory sparsely. Because the usage of physical memory only increases as data is stored in it, very efficient hash tables can be implemented which use little more physical memory than is necessary to store the objects they contain. However, such programs run the risk of running out of virtual address space: virtual pages reserved but unused by the hash table cannot be used by other parts of the program. The main problem with COW at the kernel level is the complexity it adds, but the concerns are similar to those raised by more basic virtual memory mechanisms such as swapping pages to disk; when the kernel writes to pages, it must copy them if they are marked copy-on-write.
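The sparse hash table idea can be sketched with an anonymous private mapping on a Unix-like system. SparseTable, the slot layout and the table size below are assumptions made for this example, and keys are assumed to be small non-negative integers. Whether untouched pages really consume no physical memory depends on the operating system; the sketch only relies on the mapping being zero-filled.

```python
import mmap
import struct

SLOT = struct.Struct("<QQ")   # (key + 1, value); an all-zero slot means "empty"
NUM_SLOTS = 1 << 22           # 4 Mi slots * 16 bytes = 64 MiB of virtual space


class SparseTable:
    """Open-addressing table laid directly over a large zero-filled mapping."""

    def __init__(self):
        # Private anonymous mapping: zero-filled, typically populated lazily.
        self.mem = mmap.mmap(-1, NUM_SLOTS * SLOT.size,
                             flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

    def _probe(self, key):
        index = hash(key) % NUM_SLOTS
        while True:
            stored, value = SLOT.unpack_from(self.mem, index * SLOT.size)
            if stored == 0 or stored == key + 1:
                return index, stored, value
            index = (index + 1) % NUM_SLOTS   # linear probing

    def put(self, key, value):
        index, _, _ = self._probe(key)
        SLOT.pack_into(self.mem, index * SLOT.size, key + 1, value)

    def get(self, key, default=None):
        _, stored, value = self._probe(key)
        return value if stored else default
```

The table reserves 64 MiB of address space up front, which is the trade-off the paragraph above mentions: that range is unavailable to the rest of the program even while almost all of it is empty.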

Other applications of copy-on-write

COW is also used outside the kernel, in library, application and system code. The string class provided by the C++ standard library, for example, was specifically designed to allow copy-on-write implementations, although the requirements introduced in C++11 effectively rule such implementations out.

In multithreaded systems, COW can be implemented without traditional locking by using compare-and-swap to increment or decrement the internal reference counter. Since the original resource will never be altered, it can safely be copied by multiple threads (after the reference count has been increased) without the need for expensive locking such as mutexes. If the reference counter drops to 0, then by definition the thread that performed the final decrement held the last reference, so the resource can safely be de-allocated from memory, again without expensive locking mechanisms. The benefit of not having to copy the resource (and the resulting performance gain over traditional deep copying) therefore holds in both single-threaded and multithreaded systems.
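The retry loop behind such a counter looks roughly like the sketch below. Python does not expose a hardware compare-and-swap, so the AtomicInt class here emulates one with an internal lock purely to make the example runnable; in a real implementation that class would be a single atomic instruction. All names are hypothetical, and acquire assumes the caller already holds a reference, so the count is never raised from 0.

```python
import threading


class AtomicInt:
    """Stand-in for a hardware atomic; the lock only emulates CAS atomicity."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True


class SharedResource:
    def __init__(self, payload):
        self.payload = payload          # never modified while shared
        self.refs = AtomicInt(1)
        self.freed = False

    def acquire(self):
        while True:                     # CAS retry loop: count -> count + 1
            seen = self.refs.load()
            if self.refs.compare_and_swap(seen, seen + 1):
                return self

    def release(self):
        while True:                     # CAS retry loop: count -> count - 1
            seen = self.refs.load()
            if self.refs.compare_and_swap(seen, seen - 1):
                break
        if seen == 1:
            # This thread moved the count from 1 to 0: it held the last reference.
            self.freed = True
            self.payload = None
```

Only the thread whose decrement takes the count from 1 to 0 frees the resource, so no other thread can still be reading it at that point.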

The COW concept is also used in virtualization/emulation software such as Bochs, QEMU, Linux-VServer and UML for virtual disk storage. This allows a great reduction in required disk space when multiple VMs can be based on the same hard disk image, as well as increased performance, as disk reads can be cached in RAM and subsequent reads served to other VMs out of the cache.
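A shared base image with per-VM changes can be sketched as a read-only image plus a private overlay of written blocks. This is a simplified model, not the on-disk format of any of the tools named above; CowDisk, the 512-byte block size and the single-block write restriction are assumptions made for illustration.

```python
BLOCK_SIZE = 512


class CowDisk:
    """Virtual disk = shared read-only base image + private overlay of blocks."""

    def __init__(self, base_image):
        self.base = base_image   # shared bytes object, never modified
        self.overlay = {}        # block number -> private bytearray

    def read_block(self, n):
        if n in self.overlay:
            return bytes(self.overlay[n])
        return self.base[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE]

    def write(self, offset, data):
        """Write data at a byte offset; the write must stay inside one block."""
        n, start = divmod(offset, BLOCK_SIZE)
        assert start + len(data) <= BLOCK_SIZE, "write crosses a block boundary"
        if n not in self.overlay:
            # First write to this block: copy it out of the base image.
            self.overlay[n] = bytearray(self.read_block(n))
        self.overlay[n][start:start + len(data)] = data


base = bytes(4 * BLOCK_SIZE)          # one image shared by both "VMs"
vm1, vm2 = CowDisk(base), CowDisk(base)
vm1.write(BLOCK_SIZE + 3, b"hi")      # only vm1 gets a private block 1
```

Each disk stores only the blocks it has changed, so the space cost of an additional VM is proportional to how far it has diverged from the base image.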

The COW concept is also used to maintain instant snapshots on database servers such as Microsoft SQL Server 2005. Instant snapshots preserve a static view of a database by storing a pre-modification copy of data when the underlying data are updated. Instant snapshots are used for testing or for moment-dependent reports and should not be used to replace backups.
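The pre-modification copy can be sketched as follows. This toy model is not how any particular database server stores its snapshots; ToyDatabase, Snapshot and the page-level granularity are assumptions made for illustration. A page's old contents are saved the first time it changes after the snapshot was taken, and the snapshot reads unchanged pages straight from the live data.

```python
class ToyDatabase:
    def __init__(self, pages):
        self.pages = list(pages)     # live data, one entry per page
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write_page(self, n, data):
        for snap in self.snapshots:
            # Save the old contents once, the first time the page changes.
            snap.saved.setdefault(n, self.pages[n])
        self.pages[n] = data


class Snapshot:
    def __init__(self, db):
        self.db = db
        self.saved = {}   # page number -> contents at snapshot time

    def read_page(self, n):
        # Unchanged pages are read straight from the live database.
        return self.saved.get(n, self.db.pages[n])
```

Because the snapshot depends on the live pages for everything that has not changed, losing the database also loses the snapshot, which is why such snapshots are no substitute for backups.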

COW may also be used as the underlying mechanism for snapshots provided by logical volume management and Microsoft Volume Shadow Copy Service.

The copy-on-write technique can be used to emulate read-write storage on media that require wear levelling or are physically write once, read many (WORM).
