Designing Data Storage Architecture - Understanding Block Blobs and Page Blobs

来源:互联网 发布:ib课程网络教学 编辑:程序博客网 时间:2024/04/30 03:05

Understanding Block Blobs and Page Blobs

http://msdn.microsoft.com/en-us/library/ee691964.aspx

Updated: September 30, 2011

The storage service offers two types of blobs, block blobs and page blobs. You specify the blob type when you create the blob. Once the blob has been created, its type cannot be changed, and it can be updated only by using operations appropriate for that blob type, i.e., writing a block or list of blocks to a block blob, and writing pages to a page blob.

All blobs reflect committed changes immediately. Each version of the blob has a unique tag, called an ETag, that you can use with access conditions to assure you only change a specific instance of the blob.

Any blob can be leased for exclusive write access. When a blob is leased, only calls that include the current lease ID can modify the blob or (for block blobs) its blocks.

Any blob can be duplicated in a snapshot. For information about snapshots, see Working with Snapshots.

noteNoteBlobs in the Windows Azure storage emulator are limited to 2 GB.

About Block Blobs

Block blobs let you upload large blobs efficiently. Block blobs are comprised of blocks, each of which is identified by ablock ID. You create or modify a block blob by writing a set of blocks and committing them by their block IDs. Each block can be a different size, up to a maximum of 4 MB. Themaximum size for a block blob is 200 GB, and a block blob can include no more than 50,000 blocks. If you are writing a block blob that is no more than64 MB in size, you can upload it in its entirety with asingle write operation. (Storage clients default to 32 MB, settable using the SingleBlobUploadThresholdInBytes property.)

When you upload a block to a blob in your storage account, it is associated with the specified block blob, but it does not become part of the blob until you commit a list of blocks that includes the new block's ID. New blocks remain in an uncommitted state until they are specifically committed or discarded. Writing a block does not update the last modified timeof an existing blob.

Block blobs include features that help youmanage large files over networks. With a block blob, you canupload multiple blocks in parallel to decrease upload time. Each block can include anMD5 hash to verify the transfer, so you can track upload progress and re-send blocks as needed. You can upload blocks in any order, and determine their sequence in the final block list commitment step. You can also upload a new block to replace an existing uncommitted block of the same block ID. You haveone week to commit blocks to a blob before they are discarded. All uncommitted blocks are also discarded when a block list commitment operation occurs but does not include them.

You can modify an existing block blob by inserting, replacing, or deleting existing blocks. After uploading the block or blocks that have changed, you can commit a new version of the blob by committing the new blocks with the existing blocks you want to keep using a single commit operation. To insert the same range of bytes in two different locations of the committed blob, you can commit the same block in two places within the same commit operation. For any commit operation, if any block is not found, the entire commitment operation fails with an error, and the blob is not modified. Any block commitment overwrites the blob’s existing properties and metadata, and discards all uncommitted blocks.

Block IDs are strings of equal length within a blob. Block client code usually uses base-64 encoding to normalize strings into equal lengths. When using base-64 encoding, the pre-encoded string must be 64 bytes or less. Block ID values can be duplicated in different blobs. A blob can have up to 100,000 uncommitted blocks, but their total size cannot exceed 400 GB.

If you write a block for a blob that does not exist, a new block blob is created, with a length of zero bytes. This blob will appear in blob lists that include uncommitted blobs. If you don’t commit any block to this blob, it and its uncommitted blocks will be discarded one week after the last successful block upload. All uncommitted blocks are also discarded when a new blob of the same name is created using a single step (rather than the two-step block upload-then-commit process).

About Page Blobs

Page blobs are a collection of 512-byte pages optimized for random read and write operations. To create a page blob, you initialize the page blob and specify the maximum size the page blob will grow. To add or update the contents of a page blob, you write a page or pages by specifying an offset and a range that align to 512-byte page boundaries. A write to a page blob can overwrite just one page, some pages, or up to 4 MB of the page blob. Writes to page blobs happen in-place and are immediately committed to the blob. The maximum size for a page blob is 1 TB.