Why does a nova compute node report negative free disk space?
Source: Internet  Edited by: 程序博客网  Time: 2024/06/06 04:07
Note: this article applies to the Kilo release.
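While using OpenStack, I ran into a compute node that reported a negative amount of available disk space. This article walks through the code to find out why.
In the nova-compute service running on a compute node, a periodic task, update_available_resource, is responsible for collecting and reporting resource statistics: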
    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
This function obtains the available resources through the ResourceTracker interface:
    rt = self._get_resource_tracker(nodename)
    rt.update_available_resource(context)
ResourceTracker in turn calls the libvirt driver to actually gather the resource statistics:
    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.info(_LI("Auditing locally available compute resources for "
                     "node %(node)s"),
                 {'node': self.nodename})
        resources = self.driver.get_available_resource(self.nodename)
The function that gathers these resource statistics is defined in virt\libvirt\driver.py:
    def get_available_resource(self, nodename):
        """Retrieve resource information.

        This method is called when nova-compute launches, and
        as part of a periodic task that records the results in the DB.

        :param nodename: will be put in PCI device
        :returns: dictionary containing resource info
        """

        disk_info_dict = self._get_local_gb_info()
        data = {}

        # NOTE(dprince): calling capabilities before getVersion works around
        # an initialization issue with some versions of Libvirt (1.0.5.5).
        # See: https://bugzilla.redhat.com/show_bug.cgi?id=1000116
        # See: https://bugs.launchpad.net/nova/+bug/1215593

        # Temporary convert supported_instances into a string, while keeping
        # the RPC version as JSON. Can be changed when RPC broadcast is removed
        data["supported_instances"] = jsonutils.dumps(
            self._get_instance_capabilities())

        data["vcpus"] = self._get_vcpu_total()
        data["memory_mb"] = self._get_memory_mb_total()
        data["local_gb"] = disk_info_dict['total']
        data["vcpus_used"] = self._get_vcpu_used()
        data["memory_mb_used"] = self._get_memory_mb_used()
        data["local_gb_used"] = disk_info_dict['used']
        data["hypervisor_type"] = self._host.get_driver_type()
        data["hypervisor_version"] = self._host.get_version()
        data["hypervisor_hostname"] = self._host.get_hostname()
        # TODO(berrange): why do we bother converting the
        # libvirt capabilities XML into a special JSON format ?
        # The data format is different across all the drivers
        # so we could just return the raw capabilities XML
        # which 'compare_cpu' could use directly
        #
        # That said, arch_filter.py now seems to rely on
        # the libvirt drivers format which suggests this
        # data format needs to be standardized across drivers
        data["cpu_info"] = jsonutils.dumps(self._get_cpu_info())

        disk_free_gb = disk_info_dict['free']
        disk_over_committed = self._get_disk_over_committed_size_total()
        available_least = disk_free_gb * units.Gi - disk_over_committed
        data['disk_available_least'] = available_least / units.Gi

        data['pci_passthrough_devices'] = \
            self._get_pci_passthrough_devices()

        numa_topology = self._get_host_numa_topology()
        if numa_topology:
            data['numa_topology'] = numa_topology._to_json()
        else:
            data['numa_topology'] = None

        return data
Let's look at the disk-related part. It first calls this static method of the libvirt driver, which returns three values, total/free/used, in gigabytes:
    @staticmethod
    def get_local_gb_info():
        """Get local storage info of the compute node in GB.

        :returns: A dict containing:
             :total: How big the overall usable filesystem is (in gigabytes)
             :free: How much space is free (in gigabytes)
             :used: How much space is used (in gigabytes)
        """

        if CONF.libvirt.images_type == 'lvm':
            info = libvirt_utils.get_volume_group_info(
                CONF.libvirt.images_volume_group)
        else:
            info = libvirt_utils.get_fs_info(CONF.instances_path)

        for (k, v) in info.iteritems():
            # Note: the results are converted to GB here!
            info[k] = v / units.Gi

        return info
As get_local_gb_info shows, if instances are stored on a filesystem rather than on LVM, the following function is called to obtain the resource data:
    def get_fs_info(path):
        """Get free/used/total space info for a filesystem

        :param path: Any dirent on the filesystem
        :returns: A dict containing:

                 :free: How much space is free (in bytes)
                 :used: How much space is used (in bytes)
                 :total: How big the filesystem is (in bytes)
        """
        hddinfo = os.statvfs(path)
        total = hddinfo.f_frsize * hddinfo.f_blocks
        free = hddinfo.f_frsize * hddinfo.f_bavail
        used = hddinfo.f_frsize * (hddinfo.f_blocks - hddinfo.f_bfree)
        return {'total': total,
                'free': free,
                'used': used}
The information returned by get_fs_info is essentially the same as what the df command shows:
[root@host123 ~]# python
Python 2.7.5 (default, Feb 11 2014, 07:46:25)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> hddinfo = os.statvfs("/var/lib/nova")
>>> total = hddinfo.f_frsize * hddinfo.f_blocks
>>> free = hddinfo.f_frsize * hddinfo.f_bavail
>>> used = hddinfo.f_frsize * (hddinfo.f_blocks - hddinfo.f_bfree)
>>>
>>> print total/1024/1024/1024
254
>>> print free/1024/1024/1024
194
>>> print used/1024/1024/1024
46
[root@host123 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_sys-lv_root 20G 3.6G 16G 20% /
devtmpfs 11G 0 11G 0% /dev
tmpfs 12G 0 12G 0% /dev/shm
tmpfs 12G 83M 12G 1% /run
tmpfs 12G 0 12G 0% /sys/fs/cgroup
/dev/sda1 380M 96M 260M 27% /boot
/dev/mapper/vg_nova-lv_nova 255G 47G 195G 20% /var/lib/nova
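In the session above, free and used do not add up to total (194 + 46 < 254). The most likely reason is that free is computed from f_bavail (blocks available to unprivileged users) while used is computed from f_bfree (which also counts blocks reserved for root), with integer truncation on top. A minimal Python 3 sketch of the same calculation follows; the helper name fs_info_gib is made up for illustration and is not part of nova:

```python
import os

GIB = 1024 ** 3  # bytes per GiB, the same value as nova's units.Gi


def fs_info_gib(path):
    """Return total/free/used of the filesystem holding `path`, in whole GiB.

    Hypothetical helper mirroring get_fs_info plus the GiB conversion.
    """
    st = os.statvfs(path)
    total = st.f_frsize * st.f_blocks
    # f_bavail: blocks an unprivileged process may still allocate.
    free = st.f_frsize * st.f_bavail
    # f_bfree: all free blocks, including those reserved for root.
    used = st.f_frsize * (st.f_blocks - st.f_bfree)
    return {"total": total // GIB, "free": free // GIB, "used": used // GIB}


info = fs_info_gib(".")
print(info)
```

Running it against the instances path (for example /var/lib/nova) should give figures close to the Size, Avail and Used columns of df.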
get_available_resource uses the total and used values directly, but note that free is not used as-is; it is turned into disk_available_least:
    disk_free_gb = disk_info_dict['free']
    disk_over_committed = self._get_disk_over_committed_size_total()
    available_least = disk_free_gb * units.Gi - disk_over_committed
    data['disk_available_least'] = available_least / units.Gi
As you can see, it takes the disk_free_gb reported by the operating system and subtracts disk_over_committed from it.
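To see how this subtraction can go below zero, here is a small Python 3 sketch of the same arithmetic. The function name and the instance counts are hypothetical; only the formula comes from the code above (Python 2 integer `/` is written as `//` here):

```python
GIB = 1024 ** 3  # same value as nova's units.Gi


def disk_available_least_gib(disk_free_gib, over_committed_bytes):
    """Repeat the disk_available_least arithmetic with floor division."""
    available_least = disk_free_gib * GIB - over_committed_bytes
    return available_least // GIB


# Hypothetical node with 194 GiB free on the filesystem.
# 10 instances, each over-committed by 15 GiB:
print(disk_available_least_gib(194, 10 * 15 * GIB))  # 44
# 14 such instances: the promised growth exceeds the free space.
print(disk_available_least_gib(194, 14 * 15 * GIB))  # -16
```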
Let's see how _get_disk_over_committed_size_total obtains that value; this function is also a member of the libvirt driver:
    def _get_disk_over_committed_size_total(self):
        """Return total over committed disk size for all instances."""
        # Disk size that all instance uses : virtual_size - disk_size
        disk_over_committed_size = 0
        for dom in self._host.list_instance_domains():
            try:
                xml = dom.XMLDesc(0)
                disk_infos = jsonutils.loads(
                    self._get_instance_disk_info(dom.name(), xml))
                for info in disk_infos:
                    disk_over_committed_size += int(
                        info['over_committed_disk_size'])
            except ……(omitted here)
            # NOTE(gtt116): give other tasks a chance.
            greenthread.sleep(0)
        return disk_over_committed_size
It fetches the over_committed_disk_size of each instance one by one and adds them all up.
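The accumulation can be sketched in isolation. In this Python 3 example the JSON strings stand in for what _get_instance_disk_info returns per instance; the function name and the sample data are made up for illustration:

```python
import json

GIB = 1024 ** 3


def total_over_committed(per_instance_disk_info):
    """Sum over_committed_disk_size over every disk of every instance."""
    total = 0
    for disk_info_json in per_instance_disk_info:
        for info in json.loads(disk_info_json):
            total += int(info["over_committed_disk_size"])
    return total


# Hypothetical data: two instances, the second with an extra raw disk.
instance_a = json.dumps([
    {"type": "qcow2", "over_committed_disk_size": 15 * GIB},
])
instance_b = json.dumps([
    {"type": "qcow2", "over_committed_disk_size": 18 * GIB},
    {"type": "raw", "over_committed_disk_size": 0},
])

print(total_over_committed([instance_a, instance_b]) // GIB)  # 33
```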
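In other words, some instances are already over-committing disk. So where does the over-commit come from?
For each instance, over_committed_disk_size is obtained by the following function: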
    def _get_instance_disk_info(self, instance_name, xml,
                                block_device_info=None):
        block_device_mapping = driver.block_device_info_get_mapping(
            block_device_info)

        volume_devices = set()
        for vol in block_device_mapping:
            disk_dev = vol['mount_device'].rpartition("/")[2]
            volume_devices.add(disk_dev)

        disk_info = []
        doc = etree.fromstring(xml)
        disk_nodes = doc.findall('.//devices/disk')
        path_nodes = doc.findall('.//devices/disk/source')
        driver_nodes = doc.findall('.//devices/disk/driver')
        target_nodes = doc.findall('.//devices/disk/target')

        for cnt, path_node in enumerate(path_nodes):
            disk_type = disk_nodes[cnt].get('type')
            path = path_node.get('file') or path_node.get('dev')
            target = target_nodes[cnt].attrib['dev']

            if not path:
                LOG.debug('skipping disk for %s as it does not have a path',
                          instance_name)
                continue

            if disk_type not in ['file', 'block']:
                LOG.debug('skipping disk because it looks like a volume', path)
                continue

            if target in volume_devices:
                LOG.debug('skipping disk %(path)s (%(target)s) as it is a '
                          'volume', {'path': path, 'target': target})
                continue

            # get the real disk size or
            # raise a localized error if image is unavailable
            if disk_type == 'file':
                dk_size = int(os.path.getsize(path))
            elif disk_type == 'block':
                dk_size = lvm.get_volume_size(path)

            disk_type = driver_nodes[cnt].get('type')
            if disk_type == "qcow2":
                backing_file = libvirt_utils.get_disk_backing_file(path)
                virt_size = disk.get_disk_size(path)
                over_commit_size = int(virt_size) - dk_size
            else:
                backing_file = ""
                virt_size = dk_size
                over_commit_size = 0

            disk_info.append({'type': disk_type,
                              'path': path,
                              'virt_disk_size': virt_size,
                              'backing_file': backing_file,
                              'disk_size': dk_size,
                              'over_committed_disk_size': over_commit_size})
        return jsonutils.dumps(disk_info)
For example, for a qcow2 image the overcommit size equals virt_size minus dk_size:
[root@host123 ~]# ll -h /var/lib/nova/instances/109291c0-0bf0-412c-9e87-6ab01e16bc06/disk
-rw-r--r-- 1 root root 5.0G Feb 25 11:41 /var/lib/nova/instances/109291c0-0bf0-412c-9e87-6ab01e16bc06/disk
The actual size of the image file, dk_size, is 5.0G. Now let's use the qemu-img command to look at the details of the qcow2 image:
[root@host123 ~]# qemu-img info /var/lib/nova/instances/109291c0-0bf0-412c-9e87-6ab01e16bc06/disk
image: /var/lib/nova/instances/109291c0-0bf0-412c-9e87-6ab01e16bc06/disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 4.9G
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/afd631de55a9b7026775a4a1ada098a9ae6888c7
Format specific information:
compat: 0.10
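The virtual size (20G) minus the file size dk_size (5.0G) is the over_commit_size, roughly 15G in this case. Note that the code takes dk_size from os.path.getsize, that is, the 5.0G shown by ll, not the 4.9G "disk size" printed by qemu-img.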
As you can see, only qcow2 images get overcommit handling here; for all other files over_commit_size is 0.
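The per-disk rule can be condensed into a few lines. This is a Python 3 sketch of the branch in _get_instance_disk_info, not nova's code; the function name is hypothetical, and the 5 GiB file size is a rounded assumption because ll only prints 5.0G:

```python
GIB = 1024 ** 3


def over_commit_size(driver_type, file_size, virtual_size=None):
    """Return how many bytes a disk may still grow by.

    Only qcow2 disks count as over-committed; any other format is taken
    to have a virtual size equal to its file size, so the result is 0.
    """
    if driver_type == "qcow2":
        return int(virtual_size) - file_size
    return 0


# Virtual size taken from the qemu-img output above (20G = 21474836480 bytes).
print(over_commit_size("qcow2", 5 * GIB, 21474836480) // GIB)  # 15
# A raw disk never contributes to the over-commit total.
print(over_commit_size("raw", 20 * GIB))  # 0
```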
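As we know, the DiskFilter in the nova scheduler uses disk_allocation_ratio to overcommit disk resources. That is a different concept from the overcommit discussed here: it is overcommit as seen from the controller node, and the compute node is not aware of it. The overcommit here is what the compute node itself sees once it accounts for qcow2 images being thin-provisioned: the free space it finally reports already has deducted the space those qcow2 images would occupy if they grew to their full virtual size. That is why the reported free space can be smaller than what you see on the filesystem.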
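For contrast, the scheduler-side overcommit can be sketched as follows. This is a simplified Python 3 illustration of the kind of check DiskFilter performs, written from memory rather than copied from nova; the function and parameter names are my own, and the host figures are hypothetical:

```python
def disk_filter_passes(total_gb, free_mb, requested_mb,
                       disk_allocation_ratio=1.0):
    """Simplified scheduler-side check, not nova's actual DiskFilter code."""
    total_mb = total_gb * 1024
    # The ratio inflates the host's nominal capacity.
    limit_mb = total_mb * disk_allocation_ratio
    used_mb = total_mb - free_mb
    usable_mb = limit_mb - used_mb
    return usable_mb >= requested_mb


# Hypothetical host: 254 GB total, 10 GB reported free, 20 GB requested.
print(disk_filter_passes(254, 10 * 1024, 20 * 1024))       # False
print(disk_filter_passes(254, 10 * 1024, 20 * 1024, 1.5))  # True
```

With a ratio above 1.0 the scheduler accepts a host that has less free space than the request, which is a separate mechanism from the qcow2 deduction done on the compute node.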
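If the administrator specifies the compute node at deployment time, the scheduling process is bypassed and the VM is forced onto that node, taking up space that has already been set aside for over-committed images. The compute node may then end up reporting negative disk resources. And as the VMs' actual disk usage keeps growing, the node's disk may eventually run out of space.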