A brief introduction to XenStore: Format and Interface

About

 

This document describes the format of the entries in XenStore, what they are used for and how, and how third-party applications should use XenStore as a management interface.

 

Overview

 

XenStore is a hierarchical namespace (similar to sysfs or Open Firmware) which is shared between domains. The interdomain communication primitives exposed by Xen are very low-level (virtual IRQ and shared memory). XenStore is implemented on top of these primitives and provides some higher-level operations (read a key, write a key, enumerate a directory, notify when a key changes value).

XenStore is a database, hosted by domain 0, that supports transactions and atomic operations. It is accessible through a Unix domain socket in Domain-0, a kernel-level API, or an ioctl interface via /proc/xen/xenbus. XenStore should always be accessed through the functions defined in <xs.h>. XenStore is used to store information about the domains during their execution and as a mechanism for creating and controlling Domain-U devices.

XenBus is the in-kernel API used by virtual IO drivers to interact with XenStore.
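
The higher-level operations named above (read a key, write a key, enumerate a directory) can be pictured with a small in-memory model. This is only a sketch in plain Python: `ToyStore` and its methods are hypothetical names, nothing here talks to a real XenStore daemon, and the paths and values are made up.

```python
# Toy in-memory model of a hierarchical key/value namespace.
# NOT the XenStore implementation; ToyStore is a hypothetical name.

class ToyStore:
    def __init__(self):
        self._values = {}  # full path -> string value

    def write(self, path, value):
        """Write a key, implicitly creating every parent directory."""
        parts = path.strip("/").split("/")
        for i in range(1, len(parts)):
            # Parents become empty-valued nodes unless they already exist.
            self._values.setdefault("/" + "/".join(parts[:i]), "")
        self._values["/" + "/".join(parts)] = value

    def read(self, path):
        """Read a key; raises KeyError if the path does not exist."""
        return self._values["/" + path.strip("/")]

    def ls(self, path):
        """Enumerate the immediate children of a directory."""
        prefix = ("/" + path.strip("/")).rstrip("/") + "/"
        children = set()
        for key in self._values:
            if key.startswith(prefix):
                children.add(key[len(prefix):].split("/")[0])
        return sorted(children)


store = ToyStore()
store.write("/local/domain/1/name", "guest1")
store.write("/local/domain/1/memory/target", "262144")
store.write("/local/domain/2/name", "guest2")

print(store.ls("/local/domain"))            # ['1', '2']
print(store.ls("/local/domain/1"))          # ['memory', 'name']
print(store.read("/local/domain/1/name"))   # guest1
```

Every node in this model can hold a value and also have children, which is why `memory` shows up as a child of `/local/domain/1` even though only `memory/target` was written.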

 

General Format

 

There are three main paths in XenStore:

  • /vm - stores configuration information about a domain

  • /local/domain - stores information about the domain on the local node (domid, etc.)

  • /tool - stores information for the various tools

 

/vm

 

The /vm path stores configuration information for a domain. This information doesn't change and is indexed by the domain's UUID. A /vm entry contains the following information:

  • ssidref - ssid reference for domain

  • uuid - uuid of the domain (somewhat redundant)

  • on_reboot - the action to take on a domain reboot request (destroy or restart)

  • on_poweroff - the action to take on a domain halt request (destroy or restart)

  • on_crash - the action to take on a domain crash (destroy or restart)

  • vcpus - the number of allocated vcpus for the domain

  • memory - the amount of memory (in megabytes) for the domain. Note: appears to sometimes be empty for domain-0.

  • vcpu_avail - the number of active vcpus for the domain (vcpus - number of disabled vcpus)

  • name - the name of the domain

 

/vm/<uuid>/image/

 

The image path is only available for Domain-Us and contains:

  • ostype - identifies the builder type (linux or vmx)

  • kernel - path to kernel on domain-0

  • cmdline - command line to pass to domain-U kernel

  • ramdisk - path to ramdisk on domain-0

 

/local

 

The /local path currently only contains one directory, /local/domain, that is indexed by domain id. It contains the running domain information. The reason to have two storage areas is that during migration, the uuid doesn't change but the domain id does. The /local/domain directory can be created and populated before finalizing the migration, enabling localhost=>localhost migration.
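
The split between a stable uuid-indexed area and a domid-indexed area can be illustrated with a plain dictionary standing in for the store. This is a sketch only: the helper names, the UUID string, the domain name and the domain ids are all invented for the example, and the real tools populate many more keys than shown here.

```python
# Toy illustration of the /vm vs /local/domain split.
# The dict stands in for the store; all names and ids are hypothetical.

def create_domain(store, uuid, domid, name):
    """Populate the stable /vm/<uuid> entry and the /local/domain/<domid> entry."""
    store.setdefault("/vm/%s/name" % uuid, name)
    store["/vm/%s/uuid" % uuid] = uuid
    store["/local/domain/%d/vm" % domid] = "/vm/%s" % uuid
    store["/local/domain/%d/domid" % domid] = str(domid)

def vm_name_for_domid(store, domid):
    """Follow the 'vm' key from the running-domain entry to the config entry."""
    vm_path = store["/local/domain/%d/vm" % domid]
    return store[vm_path + "/name"]

def migrate_localhost(store, uuid, old_domid, new_domid):
    """Create the new /local/domain entry first, then drop the old one."""
    name = store["/vm/%s/name" % uuid]
    create_domain(store, uuid, new_domid, name)
    old_prefix = "/local/domain/%d/" % old_domid
    for key in [k for k in store if k.startswith(old_prefix)]:
        del store[key]


store = {}
create_domain(store, "example-uuid", 3, "guest1")
migrate_localhost(store, "example-uuid", old_domid=3, new_domid=7)
print(vm_name_for_domid(store, 7))   # guest1
```

Because both domain ids point at the same /vm entry, the new entry can exist alongside the old one until the migration is finalized.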

 

/local/domain/<domid>

 

This path contains:

  • cpu_time - xend start time (this is only around for domain-0)

  • handle - private handle for xend

  • name - see /vm

  • on_reboot - see /vm

  • on_poweroff - see /vm

  • on_crash - see /vm

  • vm - the path to the VM directory for the domain

  • domid - the domain id (somewhat redundant)

  • running - indicates that the domain is currently running

  • memory/ - a directory for memory information

    • target - target memory size for the domain (in kilobytes)

  • cpu - the current CPU the domain is pinned to (empty for domain-0?)

  • cpu_weight - the weight assigned to the domain

  • vcpu_avail - a bitmap telling the domain whether it may use a given VCPU

  • online_vcpus - how many vcpus are currently online

  • vcpus - the total number of vcpus allocated to the domain

  • console/ - a directory for console information

    • ring-ref - the grant table reference of the console ring queue

    • port - the event channel being used for the console ring queue (local port)

    • tty - the current tty the console data is being exposed on

    • limit - the limit (in bytes) of console data to buffer

  • backend/ - a directory containing all backends the domain hosts

    • vbd/ - a directory containing vbd backends

      • <domid>/ - a directory containing vbd's for domid

        • <virtual-device>/ - a directory for a particular virtual-device on domid

          • frontend-id - domain id of frontend

          • frontend - the path to the frontend domain

          • physical-device - backend device number

          • sector-size - backend sector size

          • sectors - backend number of sectors

          • info - device information flags. 1=cdrom, 2=removable, 4=read-only

          • domain - name of frontend domain

          • params - parameters for device

          • type - the type of the device

          • dev - frontend virtual device (as given by the user)

          • node - backend device node (output from block creation script)

          • hotplug-status - connected or error (output from block creation script)

          • state - communication state across XenBus to the frontend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed

    • vif/ - a directory containing vif backends

      • <domid>/ - a directory containing vif's for domid

        • <vif number>/ - a directory for each vif

          • frontend-id - the domain id of the frontend

          • frontend - the path to the frontend

          • mac - the mac address of the vif

          • bridge - the bridge the vif is connected to

          • handle - the handle of the vif

          • script - the script used to create/stop the vif

          • domain - the name of the frontend

          • hotplug-status - connected or error (output from block creation script)

          • state - communication state across XenBus to the frontend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed

  • device/ - a directory containing the frontend devices for the domain

    • vbd/ - a directory containing vbd frontend devices for the domain

      • <virtual-device>/ - a directory containing the vbd frontend for virtual-device

        • virtual-device - the device number of the frontend device

        • device-type - the device type ("disk", "cdrom", "floppy")

        • backend-id - the domain id of the backend

        • backend - the path of the backend in the store (/local/domain path)

        • ring-ref - the grant table reference for the block request ring queue

        • event-channel - the event channel used for the block request ring queue

        • state - communication state across XenBus to the backend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed

    • vif/ - a directory containing vif frontend devices for the domain

      • <id>/ - a directory for vif id frontend device for the domain

        • backend-id - the backend domain id

        • mac - the mac address of the vif

        • handle - the internal vif handle

        • backend - a path to the backend's store entry

        • tx-ring-ref - the grant table reference for the transmission ring queue

        • rx-ring-ref - the grant table reference for the receiving ring queue

        • event-channel - the event channel used for the two ring queues

        • state - communication state across XenBus to the backend. 0=unknown, 1=initialising, 2=init. wait, 3=initialised, 4=connected, 5=closing, 6=closed

  • device-misc/ - miscellaneous information for devices

    • vif/ - miscellaneous information for vif devices

      • nextDeviceID - the next device id to use

  • store/ - per-domain information for the store

    • port - the event channel used for the store ring queue

    • ring-ref - the grant table reference used for the store's communication channel

  • image - private xend information
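
Several of the keys listed above hold encoded values: `state` is a small integer, `info` is a bitmask, and `vcpu_avail` is a bitmap. A minimal sketch of decoding them follows. The state names and flag values come from the listing; the helper names are hypothetical, and the sketch assumes the values are stored as decimal strings, which should be checked against a real store.

```python
# Decoders for encoded XenStore values. Helper names are hypothetical;
# the sketch assumes values are stored as decimal strings.

XENBUS_STATES = {
    0: "unknown", 1: "initialising", 2: "init. wait", 3: "initialised",
    4: "connected", 5: "closing", 6: "closed",
}
VBD_INFO_FLAGS = {1: "cdrom", 2: "removable", 4: "read-only"}

def decode_state(raw):
    """Map the string stored in a 'state' key to its name."""
    return XENBUS_STATES.get(int(raw), "invalid (%s)" % raw)

def decode_vbd_info(raw):
    """Split the 'info' bitmask of a vbd backend into flag names."""
    value = int(raw)
    return [name for bit, name in sorted(VBD_INFO_FLAGS.items()) if value & bit]

def usable_vcpus(raw_bitmap, vcpus):
    """List the VCPU indices whose bit is set in a vcpu_avail bitmap."""
    bitmap = int(raw_bitmap)
    return [i for i in range(vcpus) if bitmap & (1 << i)]

def vbd_backend_path(backend_domid, frontend_domid, virtual_device):
    """Build the backend directory path following the layout listed above."""
    return "/local/domain/%d/backend/vbd/%d/%d" % (
        backend_domid, frontend_domid, virtual_device)


print(decode_state("4"))             # connected
print(decode_vbd_info("5"))          # ['cdrom', 'read-only']
print(usable_vcpus("5", 4))          # [0, 2]
print(vbd_backend_path(0, 3, 768))   # /local/domain/0/backend/vbd/3/768
```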

 

Interacting with the XenStore

 

The XenStore interface provides transaction-based reads and writes to points in the XenStore hierarchy. Watches can be set at points in the hierarchy, and an individual watch will be triggered when anything at or below that point in the hierarchy changes. A watch is registered with a callback function and a "token". The "token" can be a pointer to any piece of data. The callback function is invoked with the path of the changed node and the "token".

The interface is centered around the idea of a central polling loop that reads watches, providing the path, callback, and token, and invokes the callback.
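
The watch model described above (a watch fires for changes at or below its path, and a central loop hands the changed path and the token to the callback) can be sketched without any Xen libraries. `ToyWatcher` is a hypothetical class written for this illustration; it mimics the described behaviour and is not how xenstored or xswatch are implemented.

```python
from collections import deque

class ToyWatcher:
    """Toy model of watch registration plus a central dispatch loop."""

    def __init__(self):
        self._watches = []        # (watched path, callback, token)
        self._pending = deque()   # (changed path, callback, token)

    def watch(self, path, callback, token):
        """Register a callback and token for a point in the hierarchy."""
        self._watches.append((path.rstrip("/"), callback, token))

    def notify_change(self, changed_path):
        """Queue an event for every watch at or above changed_path."""
        for watched, callback, token in self._watches:
            at_node = changed_path == watched
            below_node = changed_path.startswith(watched + "/")
            if at_node or below_node:
                self._pending.append((changed_path, callback, token))

    def run_pending(self):
        """One pass of the polling loop: drain the queue, invoke callbacks."""
        handled = 0
        while self._pending:
            changed_path, callback, token = self._pending.popleft()
            callback(changed_path, token)
            handled += 1
        return handled


seen = []
watcher = ToyWatcher()
watcher.watch("/local/domain/1", lambda path, token: seen.append((path, token)), "dom1")
watcher.notify_change("/local/domain/1/memory/target")  # below the watch: fires
watcher.notify_change("/local/domain/10/name")          # different domain: ignored
print(watcher.run_pending())   # 1
print(seen)                    # [('/local/domain/1/memory/target', 'dom1')]
```

Comparing against `watched + "/"` rather than the bare prefix keeps a watch on `/local/domain/1` from matching `/local/domain/10`.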

 

API Usage Examples

 

These code snippets should provide a helpful starting point.

 

C

 

 

/* Headers this fragment relies on. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <sys/select.h>
#include <xs.h>

/* The fragment below belongs inside a function. `domid` (the domain id)
 * and `error()` (an error handler that does not return) are assumed to
 * be supplied by the surrounding program. */
struct xs_handle *xs;
xs_transaction_t th;
char *path;
int fd;
fd_set set;
int er;
struct timeval tv = { .tv_sec = 0, .tv_usec = 0 };
char **vec;
unsigned int num_strings;
char *buf;
unsigned int len;

/* Get a connection to the daemon */
xs = xs_daemon_open();
if ( xs == NULL ) error();

/* Get the local domain path */
path = xs_get_domain_path(xs, domid);
if ( path == NULL ) error();

/* Make space for our node on the path */
path = realloc(path, strlen(path) + strlen("/mynode") + 1);
if ( path == NULL ) error();
strcat(path, "/mynode");

/* Create a watch on /local/domain/%d/mynode. */
er = xs_watch(xs, path, "mytoken");
if ( er == 0 ) error();

/* We are notified of read availability on the watch via the
 * file descriptor.
 */
fd = xs_fileno(xs);

while (1)
{
    /* TODO (TimPost), show a simpler example with poll()
     * in a modular style, using a simple callback. Most
     * people think 'inotify' when they see 'watches'. */
    FD_ZERO(&set);
    FD_SET(fd, &set);

    /* Poll for data. With a zero timeout select() returns immediately,
     * so this loop busy-polls; pass NULL instead of &tv to block. */
    if ( select(fd + 1, &set, NULL, NULL, &tv) > 0
         && FD_ISSET(fd, &set) )
    {
        /* num_strings will be set to the number of elements in vec
         * (typically, 2 - the watched path and the token) */
        vec = xs_read_watch(xs, &num_strings);
        if ( !vec ) error();
        printf("vec contents: %s|%s\n", vec[XS_WATCH_PATH],
               vec[XS_WATCH_TOKEN]);

        /* Prepare a transaction and do a read. */
        th = xs_transaction_start(xs);
        buf = xs_read(xs, th, vec[XS_WATCH_PATH], &len);
        xs_transaction_end(xs, th, false);
        if ( buf )
        {
            printf("buflen: %u\nbuf: %s\n", len, buf);
            free(buf);  /* xs_read() returns malloc'ed memory */
        }
        free(vec);      /* the watch vector is a single malloc'ed block */

        /* Prepare a transaction and do a write. */
        th = xs_transaction_start(xs);
        er = xs_write(xs, th, path, "somestuff", strlen("somestuff"));
        xs_transaction_end(xs, th, false);
        if ( er == 0 ) error();
    }
}

/* Cleanup. xs_daemon_close() also closes the descriptor returned by
 * xs_fileno(), so it is not closed separately here. */
xs_daemon_close(xs);
free(path);

 

 

Python

 

 

# xsutil provides access to xshandle() which allows you to use something closer to the C-style API,
# however it does not support polling in the same manner.
from xen.xend.xenstore.xsutil import *
# xswatch provides a callback interface for the watches. A similar interface exists for C within xenbus.
from xen.xend.xenstore.xswatch import *
# xend's shared logger, used by watch_func below.
from xen.xend.XendLogging import log

xs = xshandle() # From xsutil
path = xs.get_domain_path() + "/mynode"

# Watch functions take the path as the first argument;
# all other arguments that are passed via the xswatch are also included.
def watch_func(path, xs):
    # Read the data
    th = xs.transaction_start()
    buf = xs.read(th, path)
    xs.transaction_end(th)
    log.info("Got %s" % buf)
    # Write back
    th = xs.transaction_start()
    xs.write(th, path, "somestuff")
    xs.transaction_end(th)

# Register the callback; the trailing xs is passed through to watch_func.
mywatch = xswatch(path, watch_func, xs)

 

You can use direct Read/Write or gather calls via xstransact.

By default the Python xsutil.xshandle() is a shared global handle. xswatch uses this handle with a blocking read_watch call. Because the read_watch function is protected by a per-handle mutex, multiple calls will be interleaved and you probably do not want this behavior. If you would like a blocking mechanism, you might consider introducing a semaphore in the callback function that can be used to block code execution. You need to be sure to handle failure cases and not block indefinitely. For instance, the "@releaseDomain" watch will be triggered on domain destruction for watches within the /local/domain/* trees.
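
The semaphore approach suggested above can be sketched with the standard library alone. This is an assumption-laden illustration, not xend code: it targets Python 3 (where `Semaphore.acquire` accepts a `timeout`), `wait_for_callback` and `register` are hypothetical names, and a `threading.Timer` stands in for the thread that would deliver a real watch event.

```python
import threading

def wait_for_callback(register, timeout):
    """Block until the registered callback fires or `timeout` seconds pass.

    `register` is a hypothetical function that takes a callback and arranges
    for it to be called later (for example by creating a watch).
    Returns the value passed to the callback, or None on timeout.
    """
    fired = threading.Semaphore(0)
    result = []

    def callback(value):
        result.append(value)
        fired.release()            # wake the waiting thread

    register(callback)
    if not fired.acquire(timeout=timeout):
        return None                # failure case: never block indefinitely
    return result[0]


def fire_soon(callback):
    # Stand-in for a watch: call back from another thread after 50 ms.
    threading.Timer(0.05, callback, args=("connected",)).start()

def never_fires(callback):
    # Stand-in for a watch that is never triggered.
    pass

print(wait_for_callback(fire_soon, timeout=2.0))    # connected
print(wait_for_callback(never_fires, timeout=0.1))  # None
```

The timeout is what covers the failure cases mentioned above: if the awaited node never changes, for instance because the domain was destroyed, the caller gets `None` back instead of hanging.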

It is also possible -- currently indirectly -- to get a fresh XenStore handle within Python and block on read_watch in the main execution path. This may be necessary if you want to block waiting for a XenStore node value in a code path initiated by an xswatch callback.

 

N.B.: Changes subject to http://wiki.xensource.com/xenwiki/XenStoreReference