vhost device still attached: OVS crash bug fix


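Please credit the source when reposting: http://blog.csdn.net/hliyuxin/article/details/51694533
Some users delete a VM NIC and then create a new one. Because the new vhost-user port has the same name as the old one, OVS crashes.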


OVS log from the cloud environment:
Feb 26 10:54:27 ovs-vswitchd[4100]: ovs|03614|dpdk|ERR|Can not remove port e-d8a3fe9c, vhost device still attached
Feb 26 10:54:27 ovs-vswitchd[4100]: ovs|00306|ofproto_dpif_upcall(pmd86)|WARN|Dropped 1 log messages in last 175 seconds (most recently, 175 seconds ago) due to excessive rate
Feb 26 10:54:27 ovs-vswitchd[4100]: ovs|00307|ofproto_dpif_upcall(pmd86)|WARN|upcall_cb failure: ukey installation fails
Feb 26 10:54:30 ovs-vswitchd[4100]: ovs|00053|dpdk(vhost_thread2)|INFO|vHost Device '/usr/local/var/run/openvswitch/n-db197e3b' (18) not in dpdk_dev


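Cause: a user ran ovs-vsctl del-port to forcibly delete the port before the VM had shut down and exited. The cloud platform's own control logic is partly responsible as well: it does not guarantee that OVS sees the quit signal from qemu before it receives the del-port request.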

In the function netdev_dpdk_vhost_destruct:

    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev_);

    /* Can't remove a port while a guest is attached to it. */
    if (netdev_dpdk_get_virtio(dev) != NULL) {
        VLOG_ERR("Can not remove port %s, vhost device still attached",
            netdev_->name);
        return;    /* Early exit: none of the cleanup below runs. */
    }

    VLOG_INFO("netdev_dpdk_vhost_destruct, vhost device %s", netdev_->name);

    if (rte_vhost_driver_unregister(dev->vhost_id)) {
        VLOG_ERR("Unable to remove vhost-user socket %s", dev->vhost_id);
    }

    ovs_mutex_lock(&dpdk_mutex);
    list_remove(&dev->list_node);
    dpdk_mp_put(dev->dpdk_mp);
    ovs_mutex_unlock(&dpdk_mutex);

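When a guest is still attached, the function returns immediately, so dev->list_node is never removed and the device is not cleaned up. Adding a port with the same name afterwards then triggers the error.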
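The effect of that early return can be illustrated with a small toy model. This is only a sketch: struct toy_port, toy_find, toy_add_port and toy_destruct_buggy are hypothetical names invented for the illustration, not OVS or DPDK APIs, and a fixed array stands in for the real device list. It also reports the name collision as a plain -1, whereas the real failure described in this post is a crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PORTS 8

/* Hypothetical stand-in for a vhost port; NOT the real struct netdev_dpdk. */
struct toy_port {
    char name[32];
    bool guest_attached;   /* models netdev_dpdk_get_virtio(dev) != NULL */
    bool linked;           /* models membership in the device list */
};

static struct toy_port toy_ports[TOY_MAX_PORTS];

/* Look up a port that is still linked in the table, or return NULL. */
static struct toy_port *toy_find(const char *name)
{
    for (size_t i = 0; i < TOY_MAX_PORTS; i++) {
        if (toy_ports[i].linked && strcmp(toy_ports[i].name, name) == 0) {
            return &toy_ports[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 if the name is already linked or the table is full. */
static int toy_add_port(const char *name, bool guest_attached)
{
    assert(strlen(name) < sizeof toy_ports[0].name);
    if (toy_find(name) != NULL) {
        return -1;   /* a stale entry with this name is still around */
    }
    for (size_t i = 0; i < TOY_MAX_PORTS; i++) {
        if (!toy_ports[i].linked) {
            strcpy(toy_ports[i].name, name);
            toy_ports[i].guest_attached = guest_attached;
            toy_ports[i].linked = true;
            return 0;
        }
    }
    return -1;
}

/* Mirrors the pre-patch control flow: give up if a guest is attached. */
static void toy_destruct_buggy(const char *name)
{
    struct toy_port *port = toy_find(name);
    if (port == NULL) {
        return;
    }
    if (port->guest_attached) {
        return;              /* early return: the entry stays linked */
    }
    port->linked = false;    /* cleanup that the early return skips */
}
```

In this model, adding a port with an attached guest, calling toy_destruct_buggy on it, and then calling toy_add_port again with the same name fails, because the first entry was never unlinked.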
The patch is as follows:

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ab0ca75..73fa9d5 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -884,9 +884,13 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev_)
 
     /* Can't remove a port while a guest is attached to it. */
     if (netdev_dpdk_get_virtio(dev) != NULL) {
-        VLOG_ERR("Can not remove port %s, vhost device still attached",
+        VLOG_ERR("Remove port %s forcibly, vhost device still attached",
            netdev_->name);
-        return;
+
+        ovs_mutex_lock(&dev->mutex);
+        dev->flags &= ~VIRTIO_DEV_RUNNING;
+        ovsrcu_set(&dev->virtio_dev, NULL);
+        ovs_mutex_unlock(&dev->mutex);
     }
 
     VLOG_INFO("netdev_dpdk_vhost_destruct, vhost device %s", netdev_->name);
-- 
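The control flow after the patch can be sketched with a toy model as well. Again, this is an assumption-laden illustration rather than the OVS implementation: struct demo_port, demo_find, demo_add_port and demo_destruct_patched are hypothetical names, a fixed array replaces the real device list, and the locking done in the patch (dev->mutex, dpdk_mutex) is left out for brevity. The point it shows is that the destructor now detaches the guest forcibly and then falls through to the normal cleanup, so the name can be reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PORTS 8

/* Hypothetical stand-in for a vhost port; NOT the real struct netdev_dpdk. */
struct demo_port {
    char name[32];
    void *guest;     /* non-NULL while a guest is attached (models dev->virtio_dev) */
    bool running;    /* models the VIRTIO_DEV_RUNNING flag */
    bool linked;     /* models membership in the device list */
};

static struct demo_port demo_ports[DEMO_MAX_PORTS];

/* Look up a port that is still linked in the table, or return NULL. */
static struct demo_port *demo_find(const char *name)
{
    for (size_t i = 0; i < DEMO_MAX_PORTS; i++) {
        if (demo_ports[i].linked && strcmp(demo_ports[i].name, name) == 0) {
            return &demo_ports[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 if the name is already linked or the table is full. */
static int demo_add_port(const char *name, void *guest)
{
    assert(strlen(name) < sizeof demo_ports[0].name);
    if (demo_find(name) != NULL) {
        return -1;
    }
    for (size_t i = 0; i < DEMO_MAX_PORTS; i++) {
        if (!demo_ports[i].linked) {
            strcpy(demo_ports[i].name, name);
            demo_ports[i].guest = guest;
            demo_ports[i].running = (guest != NULL);
            demo_ports[i].linked = true;
            return 0;
        }
    }
    return -1;
}

/* Mirrors the patched control flow: detach forcibly, then always clean up. */
static void demo_destruct_patched(const char *name)
{
    struct demo_port *port = demo_find(name);
    if (port == NULL) {
        return;
    }
    if (port->guest != NULL) {
        /* Instead of returning, drop the guest reference and running state. */
        port->running = false;
        port->guest = NULL;
    }
    port->linked = false;    /* cleanup now runs on every path */
}
```

With this version, destroying a port whose guest is still attached unlinks the entry, and a later demo_add_port call with the same name succeeds.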