Xen block device architecture (2)


xenbus



http://wiki.xen.org/xenwiki/XenBus
http://apps.hi.baidu.com/share/detail/30800929
http://hi.baidu.com/mars208/blog/item/533ace1b0f244be3e0fe0b98.html
http://wiki.xensource.com/xenwiki/XenIntro

XenBus provides a bus abstraction for paravirtualized drivers to communicate between domains. In practice, the bus is used for configuration negotiation, leaving most data transfer to be done via an interdomain channel composed of a shared page and an event channel. 

Xenstore is a centralized configuration database that is accessible by all domains. Management tools configure and control virtual devices by writing values into the database that trigger events in drivers. 


Any device that wants to use xenbus must first be registered. xenbus_register_driver_common replaces the earlier xenbus_register_driver function; __xenbus_register_frontend and __xenbus_register_backend both call xenbus_register_driver_common to register frontend and backend devices respectively. (I increasingly think of xenbus as a sysfs-like implementation inside Xen.)

xenbus imposes a uniform driver structure, xenbus_driver, on these devices. Note the structures in xenbus.h: xenbus_device, xenbus_driver, and xen_bus_type, which internally wrap the standard Linux driver-model structures device, device_driver, and bus_type. There is also xenbus_watch, which ties into xenstore: it watches a node in xenstore and, when the node's value changes, invokes the callback stored in the xenbus_watch.

xenbus_register_driver_common calls driver_register, which in turn calls the Linux kernel's bus_add_driver to register the device driver on the bus. In a dom0 pvops kernel you can see the bus information under /sys/bus/xen-backend; all backend vif and vbd devices are registered on this backend bus. Correspondingly, inside a guest VM you can see /sys/bus/xen, the frontend bus. The pvdriver calls __xenbus_register_frontend to register frontend drivers.


Connecting Devices with XenBus

The XenBus, in the context of device drivers, is an informal protocol built on top of the XenStore, which provides a way of enumerating the (virtual) devices available to a given domain, and connecting to them. Implementing the XenBus interface is not required when porting a kernel to Xen. It is predominantly used in Linux to isolate the Xen-specific code behind a relatively abstract interface.

The XenBus interface is intended to roughly mirror that of a device bus such as PCI. It is defined in linux-2.6-xen-sparse/include/xen/xenbus.h. Each virtual device has three major components:

  • A shared memory page containing the ring buffers
  • An event channel signaling activity in the ring
  • A XenStore entry containing configuration information

These components are tied together in the bus interface by the structure shown in Listing 6.1. This device structure is passed as an argument to a number of other functions, which (between them) implement the driver for the device.

Listing 6.1. The structure defining a XenBus device

[from: linux-2.6-xen-sparse/include/xen/xenbus.h]

struct xenbus_device {
	const char *devicetype;
	const char *nodename;
	const char *otherend;
	int otherend_id;
	struct xenbus_watch otherend_watch;
	struct device dev;
	enum xenbus_state state;
	struct completion down;
};

The exact definition of this structure is tied quite closely to Linux; the device struct, for example, represents a Linux device. It can, however, be used as a good starting point for building a similar abstraction layer for other systems.

The core component of the XenBus interface, indeed the only part that needs to be implemented by all systems wanting to use the paravirtualized devices available to Xen guests, is the xenbus_state enumerated type. Each device has such a type associated with it.

The XenBus state, unlike the rest of the XenBus interface, is defined by Xen, in the io/xenbus.h public header. This is used while negotiating a connection between the two halves of the device driver. There are seven states defined. In normal operation, the state should be gradually incremented as the device is initialized, connected, and then disconnected. The possible states are:

  • XenbusStateUnknown represents the initial state of the device on the bus, before either end has been connected.
  • XenbusStateInitialising is the state while the back end is in process of initializing itself.
  • XenbusStateInitWait should be entered by the back end while it is waiting for information before completing initialization. The source of the information can be hot-plug notifications within the Domain 0 kernel, or further information from the connecting guest. The meaning of this state is that the driver itself is initialized, but needs more information before it can be connected to.
  • XenbusStateInitialised should be set to indicate that the back end is now ready for connection. After the bus is in this state, the front end may proceed to connect.
  • XenbusStateConnected is the normal state of the bus. For most of the duration of a guest's run, the bus will be in this state indicating that the front and back ends are communicating normally.
  • XenbusStateClosing is set to indicate that the device has become unavailable. The front and back halves of the driver are still connected at this point, but the back end is no longer doing anything sensible with the commands received from the front. When this state is entered, the front end should begin a graceful shutdown.
  • XenbusStateClosed is the final state, once the two halves of the driver have disconnected from each other.

Not all drivers make use of the XenBus mechanism. The two notable exceptions are the console and XenStore. Both of these are mapped directly from the start info page, and have no information in the XenStore. In the case of the console, this is so that a guest kernel can start outputting debugging information to the console as soon as possible. In the case of the XenStore, it is obvious that the device can't use XenBus, because XenBus is built on top of the XenStore, and the XenStore cannot be used to get information required to map itself.

xenstore

There is not much to say about xenstore itself: it is a tree, very similar to Linux's proc filesystem. xenbus reads and writes xenstore through the xenbus_xxxx interfaces, so xenbus is a driver layer sitting on top of xenstore. xenstore's watch facility is very useful: xenbus can register a xenbus_watch_t structure to watch a node in xenstore, and as soon as that node changes, the callback in the registered structure is invoked. This is how the probe of the frontend and backend drivers gets triggered.


xenbus <--> xenstore

The code through which xenbus talks to xenstore lives mainly in xenbus_xs.c and xenbus_comms.c under pvops/drivers/xen/xenbus/.

When the xenbus driver initializes, it calls xs_init, which calls xs_init_comms to set up the two rings over which xenbus and xenstore communicate. It then starts two kernel threads, xenwatch_thread and xenbus_thread, via kthread_run; they show up in ps as [xenwatch] and [xenbus].


The [xenwatch] thread calls wait_event_interruptible to monitor the watch_events list; every xenstore node that needs watching ends up on watch_events, and the corresponding wait queue is watch_events_waitq. It then iterates: for each list_head on watch_events it uses list_entry to recover the containing xs_stored_msg, and calls xs_stored_msg.u.watch.handle->callback.

The [xenbus] thread loops calling process_msg. As noted above, xenbus and xenstore communicate over two rings; when a ring holds a response from xenstore to xenbus, the loop breaks out to handle it. It first kmallocs an xs_stored_msg, then reads the message header followed by the body. If the message is a watch event, the event is appended to watch_events and watch_events_waitq is woken (it behaves rather like a semaphore); if it is an xenstore reply, it is appended to xs_state.reply_list and reply_waitq is woken.

In xenbus_xs.c, xs_talkv is the core function for xenbus-to-xenstore communication: it calls xb_write to send the request to xenstore and read_reply to receive xenstore's response.

The whole family of xenbus_xxxx functions (xenbus_directory, xenbus_read, xenbus_write, xenbus_exists, xenbus_rm, xenbus_mkdir, and so on) is built on this mechanism.

xenbus_comms.c contains mostly the low-level communication implementation.


xenbus <--> frontend, backend

Interfacing with frontend and backend devices is another of xenbus's important roles. Whether frontend or backend, every device looks to xenbus like a xenbus_device structure.


xenbus_client.c

xenbus_grant_ring: the front (back) end grants the back (front) end access to a ring

__xenbus_switch_state: changes the state of the front (back) end device

xenbus_alloc_evtchn: sets up an event channel for front-end/back-end communication. The local domain is DOMID_SELF and the remote domain is dev->otherend_id; it ends by calling HYPERVISOR_event_channel_op.

xenbus_bind_evtchn: binds a remote port for a local event channel. An event channel normally has two ports; the caller passes in bind_interdomain.remote_dom = dev->otherend_id and bind_interdomain.remote_port, and after HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain) the local port comes back in bind_interdomain.local_port.

xenbus_map_ring_valloc: grants the domain dev->otherend_id access to a page, which is created via xen_alloc_vm_area(PAGE_SIZE).

xenbus_map_ring: grants the domain dev->otherend_id access to the memory at vaddr


xenbus_probe.c

xenbus_read_otherend_details: the front (back) end calls xenbus_gather to collect the other end's information. xenbus_gather ultimately calls xenbus_read, which reads it out of xenstore.

xenbus_otherend_changed: calls xenbus_read_driver_state to read the state of the device at the other end

xenbus_dev_probe: the struct device argument __dev is the local device; to_xenbus_device and to_xenbus_driver recover the struct xenbus_device *dev and struct xenbus_driver *drv. It then calls talk_to_otherend to obtain the other end's information, and finally calls drv->probe. The core is really drv->probe, which discovers whether the dev device is on xenbus (I am not entirely sure about this).

xenbus_dev_remove: the core is likewise the call to drv->remove, where drv is the xenbus_driver for the _dev passed in. Afterwards, the xenbus state of dev is set to XenbusStateClosed.

xenbus_dev_shutdown: calls xenbus_switch_state to change dev's xenbus state

get_device, put_device: reference counting for dev

xenbus_probe_node: based on nodename, allocates memory for and fills in a xenbus_device structure, then calls device_register to register the device.

xenbus_probe_device_type: probes the types of the devices under the bus


xenbus_init

xenbus_init is the initialization function of xenbus (obviously, given the name). It first checks xen_initial_domain; if running in dom0, it performs a series of initialization steps, including:

allocating one page for communication with xenstore, and writing its address into the shared start_info page

calling HYPERVISOR_event_channel_op to create an event channel, and writing one of its ports into start_info so that xenstore can bind that port

If running in a domU, it further checks whether the guest is HVM. Either way, the goal is to obtain the pieces needed for xenbus-to-xenstore communication: the page backing the shared ring and the event channel port, with the xenstore_domain_interface structure built on top of that page.

Finally it calls xs_init to set up communication with xenstore.


xenbus <---> frontend


static struct xen_bus_type xenbus_frontend = {
	.root = "device",
	.levels = 2, /* device/type/<id> */
	.get_bus_id = frontend_bus_id,
	.probe = xenbus_probe_frontend,
	.otherend_changed = backend_changed,
	.bus = {
		.name      = "xen",
		.match     = xenbus_match,
		.uevent    = xenbus_uevent_frontend,
		.probe     = xenbus_dev_probe,
		.remove    = xenbus_dev_remove,
		.shutdown  = xenbus_dev_shutdown,
		.dev_attrs = xenbus_frontend_dev_attrs,
		.pm        = &xenbus_pm_ops,
	},
};

In a domU, the frontend xenbus devices are visible under /sys/bus/xen.


wait_for_devices

/*
 * On a 5-minute timeout, wait for all devices currently configured.  We need
 * to do this to guarantee that the filesystems and / or network devices
 * needed for boot are available, before we can allow the boot to proceed.
 */

xenbus <--> backend

static struct xen_bus_type xenbus_backend = {
	.root = "backend",
	.levels = 3, /* backend/type/<frontend>/<id> */
	.get_bus_id = backend_bus_id,
	.probe = xenbus_probe_backend,
	.otherend_changed = frontend_changed,
	.bus = {
		.name      = "xen-backend",
		.match     = xenbus_match,
		.uevent    = xenbus_uevent_backend,
		.probe     = xenbus_dev_probe,
		.remove    = xenbus_dev_remove,
		.shutdown  = xenbus_dev_shutdown,
		.dev_attrs = xenbus_backend_dev_attrs,
	},
};

xenbus_probe_backend.c and xenbus_probe_frontend.c are nearly identical.