MongoDB Capped Collections

Capped collections are fixed-size collections with a very high-performance automatic FIFO age-out feature (age-out is based on insertion order). They are a bit like the "RRD" concept, if you are familiar with that.

In addition, capped collections automatically and efficiently maintain insertion order for the objects in the collection; this is very powerful for certain use cases such as logging.

Creating

Unlike a standard collection, you must explicitly create a capped collection, specifying a collection size in bytes. The collection's data space is then preallocated. Note that the size specified includes database headers.
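As a minimal sketch of explicit creation in the mongo shell: the collection name "mycoll" and the 100000-byte size are illustrative assumptions, not values from this document. The call is guarded so the snippet does nothing outside the shell, where db is not defined.

```javascript
// Options document for a capped collection; "size" is in bytes and is required.
const cappedOptions = { capped: true, size: 100000 };

// "db" is defined only inside the mongo shell, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  db.createCollection("mycoll", cappedOptions);
}
```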

 

Behavior

  • Once the space is fully utilized, newly added objects will replace the oldest objects in the collection.
  • If you perform a find() on the collection with no ordering specified, the objects will always be returned in insertion order.  Reverse order is always retrievable with find().sort({$natural:-1}).
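The two behaviors above can be illustrated with a toy in-memory model. This is not MongoDB code: the CappedSim class is hypothetical, and it assumes a document's size can be approximated by the length of its JSON text, which is not how the server measures documents.

```javascript
// Toy in-memory model of a capped collection (an illustration, not MongoDB code).
class CappedSim {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.docs = [];      // oldest first, i.e. insertion ("natural") order
    this.usedBytes = 0;
  }

  // Assumption: approximate a document's size by its JSON text length.
  static sizeOf(doc) {
    return JSON.stringify(doc).length;
  }

  insert(doc) {
    const size = CappedSim.sizeOf(doc);
    if (size > this.maxBytes) {
      throw new Error("document larger than the collection");
    }
    // Age out the oldest documents until the new one fits.
    while (this.usedBytes + size > this.maxBytes) {
      this.usedBytes -= CappedSim.sizeOf(this.docs.shift());
    }
    this.docs.push(doc);
    this.usedBytes += size;
  }

  // Insertion order by default; reverse order when natural is -1.
  find(natural = 1) {
    return natural === -1 ? [...this.docs].reverse() : [...this.docs];
  }
}

// Each {n: ...} document is 7 characters as JSON, so 30 "bytes" hold four of them.
const sim = new CappedSim(30);
for (let n = 1; n <= 5; n++) {
  sim.insert({ n });
}
// sim.find()   now yields n = 2, 3, 4, 5 (the first document aged out)
// sim.find(-1) yields n = 5, 4, 3, 2
```

In the shell itself, the equivalent queries would be find() and find().sort({$natural:-1}) on the real collection.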

Usage and Restrictions

  • You may insert new objects in the capped collection.
  • You may update the existing objects in the collection. However, the objects must not grow in size. If they do, the update will fail. (There are some possible workarounds which involve pre-padding objects; contact us in the support forums for more information, if help is needed.)
  • The database does not allow deleting objects from a capped collection. Use the drop() method to remove all documents from the collection. 
    Note: After the drop you must explicitly recreate the collection.
  • Maximum size for a capped collection is currently 1e9 bytes on a 32-bit machine. The maximum size of a capped collection on a 64-bit machine is constrained only by system resources.
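Since a drop must be followed by an explicit recreate, the two steps can be wrapped together. The following is a sketch only: resetCapped is a hypothetical helper name, and "mycoll" with a 100000-byte size are assumed values. It relies on the shell's getCollection(), drop(), and createCollection() methods.

```javascript
// Drop a capped collection and recreate it with the given options.
// "database" is expected to behave like the shell's db object.
function resetCapped(database, name, options) {
  database.getCollection(name).drop();              // removes the documents and the collection itself
  return database.createCollection(name, options);  // must be recreated explicitly as capped
}

// "db" is defined only inside the mongo shell, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  resetCapped(db, "mycoll", { capped: true, size: 100000 });
}
```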

Applications

  • Logging. Capped collections provide a high-performance means for storing logging documents in the database. Inserting objects in an unindexed capped collection will be close to the speed of logging to a filesystem. Additionally, with the built-in FIFO mechanism, you are not at risk of using excessive disk space for the logging.
  • Caching. If you wish to cache a small number of objects in the database, perhaps cached computations of information, capped collections provide a convenient mechanism for this. Note that for this application you will typically want to use an index on the capped collection, as there will be more reads than writes.
  • Auto Archiving. If you know you want data to automatically "roll out" over time as it ages, a capped collection can be an easier way to support this than writing manual archival cron scripts.

Recommendations

  • For maximum performance, do not create indexes on a capped collection. If the collection will be written to much more than it is read from, it is better to have no indexes. Note that you may create indexes on a capped collection; however, you are then moving from "log speed" inserts to "database speed" inserts -- that is, it will still be quite fast by database standards.
  • Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is (somewhat) analogous to tail on a log file.
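A tail-like query over natural ordering might look like the sketch below. The helper name tailDocuments and the collection name "log" are assumptions made for illustration; find(), sort(), and limit() are the standard shell cursor methods.

```javascript
// Return a cursor over the n most recently inserted documents, newest first.
function tailDocuments(collection, n) {
  return collection.find().sort({ $natural: -1 }).limit(n);
}

// "db" is defined only inside the mongo shell, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  tailDocuments(db.log, 10).forEach(printjson);
}
```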

Options

size

The size of the capped collection, in bytes. This must be specified.

max

You may also optionally cap the number of objects in the collection. Once the limit is reached, items roll out on a least recently inserted basis.

To cap on number of objects, specify a max: parameter on the createCollection() call.

Note: When specifying a cap on the number of objects, you must also cap on size. Be sure to leave enough room for your chosen number of objects or items will roll out faster than expected. You can use the validate() utility method to see how much space an existing collection uses, and from that estimate your size needs.
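As a rough sketch of capping on both size and count: estimateCappedSize is a hypothetical helper, and its 1.5 headroom factor is an assumption rather than a figure from MongoDB. The collection name "recent100" and the 512-byte average document size are likewise illustrative.

```javascript
// Hypothetical sizing helper: bytes for maxDocs documents of roughly
// avgDocBytes each, with a headroom factor (an assumed value) for overhead.
function estimateCappedSize(maxDocs, avgDocBytes, headroom = 1.5) {
  return Math.ceil(maxDocs * avgDocBytes * headroom);
}

const boundedOptions = {
  capped: true,
  size: estimateCappedSize(100, 512), // 76800 bytes
  max: 100,                           // at most 100 documents
};

// "db" is defined only inside the mongo shell, so the calls are skipped elsewhere.
if (typeof db !== "undefined") {
  db.createCollection("recent100", boundedOptions);
  printjson(db.recent100.validate()); // reports how much space the collection uses
}
```

The validate() output for an existing collection is a better basis for the average document size than a guess.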

Note: Capped collections are always capped by size, so additionally limiting the number of documents adds overhead. Limiting by size alone is faster.

 

Tip: When programming, a handy way to store the most recently generated version of an object can be a collection capped with max=1.

autoIndexId

The autoIndexId field may be set to true or false to explicitly enable or disable automatic creation of a unique key index on the _id object field. By default, such an index is not created for capped collections.

If you will be using the _id field, you should create an index on _id.

Preallocating space for a normal collection

The createCollection command may be used for non-capped collections as well. Explicitly creating a non-capped collection via createCollection allows parameters of the new collection to be specified. For example, specification of a collection size causes the corresponding amount of disk space to be preallocated for use by the collection.
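A sketch of such a call is shown below. The collection name "plaincoll" and the 10000000-byte size are assumed values. This reflects the server behavior described in this document; whether a size is honored, ignored, or rejected for a non-capped collection may depend on the server version and storage engine.

```javascript
// Options for a normal (non-capped) collection with a requested size in bytes.
// Note the absence of "capped: true".
const plainOptions = { size: 10000000 };

// "db" is defined only inside the mongo shell, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  db.createCollection("plaincoll", plainOptions);
}
```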

Sharding

Capped collections are not shardable.

Check if a collection is capped

You can check whether a collection is capped by using the isCapped() shell function, for example db.foo.isCapped().

Convert a collection to capped

You can convert a (non-capped) collection to a capped collection with the convertToCapped command.
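A minimal sketch of issuing the command from the shell follows. The collection name "mycoll" and the 100000-byte size are assumptions. The command name has to be the first key of the command document.

```javascript
// Command document: the command name comes first, then the target size in bytes.
const convertCommand = { convertToCapped: "mycoll", size: 100000 };

// "db" is defined only inside the mongo shell, so the call is skipped elsewhere.
if (typeof db !== "undefined") {
  printjson(db.runCommand(convertCommand));
}
```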

 
Official documentation: http://www.mongodb.org/display/DOCS/Capped+Collections