OpenStack Nova: Resize / Cold Migration / Live Migration Flow Analysis
Author: tianst
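This blog is just idle rambling to help me remember things; I take no responsibility for misleading readers. Thanks!
Corrections are welcome, and I will fix mistakes promptly.
Although I have been studying Live Migration for a while, I never paid attention to Resize. In my mind it was a useless feature: when the hypervisor supports memory ballooning and hot-plugging of disks, NICs, CPUs and so on, what is the point of a Resize that still has to shut the VM down and recreate it? I was wrong: not every deployer runs high-end products like IPSON. So is that why OpenStack has Resize and Cold Migration? I am not sure. In any case, Resize hurt: someone looked down on me over it.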
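The cloud I have in mind looks like this:
All storage for the VMs in a cluster lives in one shared storage pool.
The configuration files (libvirt.xml) of all VMs in a cluster are stored on one server, and the management node can modify them.
The compute nodes of a cluster provide the VMs with CPU, memory, network, and so on.
VMs can be live-migrated freely within a cluster; such migration is clearly very efficient, since only memory has to be copied iteratively.
A cluster has one pool of IP addresses, so migrating a VM does not require changing its IP.
Within a cluster, restarting a VM works like creating it: a compute node is selected again, and unless something special changed (a configuration entry was modified), the node it last ran on is preferred.
And so on...
Obviously, reality...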
Resize and Cold Migration
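In OpenStack Nova the two share the same flow. The only difference is the flavor: Resize passes a new flavor (I understand it as new flavor > old flavor, although the code below only rejects resizing to a zero-disk flavor).
When the flavor stays the same (no flavor_id is passed), it is a Cold Migration.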
Flow (using novaclient):
1. novaclient: nova resize test 2 // resize the VM "test" from flavor 1 to flavor 2
2. nova-api: servers.py calls _resize:
nova/api/openstack/compute/servers.py
    def _resize(self, req, instance_id, flavor_id, **kwargs):
        """Begin the resize process with given instance/flavor."""
        context = req.environ["nova.context"]
        instance = self._get_server(context, req, instance_id)
        try:
            self.compute_api.resize(context, instance, flavor_id, **kwargs)
        # (except clauses omitted in this excerpt)

This calls resize in nova/compute/api.py. I removed some of the checks and kept only the key parts:
    def resize(self, context, instance, flavor_id=None,
               **extra_instance_updates):
        """Resize (ie, migrate) a running instance.

        If flavor_id is None, the process is considered a migration, keeping
        the original flavor_id. If flavor_id is not None, the instance should
        be migrated to a new host and resized to the new flavor_id.
        """
        self._check_auto_disk_config(instance, **extra_instance_updates)

        current_instance_type = flavors.extract_flavor(instance)

        # If flavor_id is not provided, only migrate the instance.
        if not flavor_id:  # no flavor id: treat it as a migration
            LOG.debug("flavor_id is None. Assuming migration.",
                      instance=instance)
            new_instance_type = current_instance_type
        else:
            new_instance_type = flavors.get_flavor_by_flavor_id(
                    flavor_id, read_deleted="no")
            if (new_instance_type.get('root_gb') == 0 and
                    current_instance_type.get('root_gb') != 0):
                reason = _('Resize to zero disk flavor is not allowed.')
                raise exception.CannotResizeDisk(reason=reason)

        instance.task_state = task_states.RESIZE_PREP

        if not CONF.allow_resize_to_same_host:
            # without allow_resize_to_same_host, the original host is excluded
            filter_properties['ignore_hosts'].append(instance['host'])

        # Here when flavor_id is None, the process is considered as migrate.
        if (not flavor_id and not CONF.allow_migrate_to_same_host):
            filter_properties['ignore_hosts'].append(instance['host'])

        if not flavor_id:
            # record the start of the action
            self._record_action_start(context, instance,
                                      instance_actions.MIGRATE)
        else:
            self._record_action_start(context, instance,
                                      instance_actions.RESIZE)

        # Note: this is not an RPC call yet, just the imported conductor API;
        # nova/conductor/api.py is where the RPC call to the conductor is made.
        self.compute_task_api.resize_instance(context, instance,
                extra_instance_updates, scheduler_hint=scheduler_hint,
                flavor=new_instance_type,
                reservations=quotas.reservations or [])
    # nova/conductor/api.py
    def resize_instance(self, context, instance, extra_instance_updates,
                        scheduler_hint, flavor, reservations):
        # NOTE(comstud): 'extra_instance_updates' is not used here but is
        # needed for compatibility with the cells_rpcapi version of this
        # method.
        # RPC call to migrate_server in nova/conductor/manager.py
        self.conductor_compute_rpcapi.migrate_server(
            context, instance, scheduler_hint, False, False, flavor,
            None, None, reservations)
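To make the decision logic in the resize excerpt easier to see in isolation, here is a minimal standalone sketch. It is not Nova code: `ResizePlan` and `plan_resize` are hypothetical names of mine, and the two boolean parameters merely stand in for the `allow_resize_to_same_host` and `allow_migrate_to_same_host` options that appear in the excerpt. Unlike the excerpt, the sketch avoids adding the same host twice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResizePlan:
    """Hypothetical summary of what the API layer decides before scheduling."""
    action: str                       # "resize" or "migrate"
    ignore_hosts: List[str] = field(default_factory=list)


def plan_resize(current_host: str,
                flavor_id: Optional[str],
                allow_resize_to_same_host: bool = False,
                allow_migrate_to_same_host: bool = False) -> ResizePlan:
    # No flavor id means "keep the flavor", i.e. a cold migration.
    action = "resize" if flavor_id else "migrate"

    ignore_hosts: List[str] = []
    # Mirrors the first check in the excerpt: exclude the current host
    # unless resizing to the same host is explicitly allowed.
    if not allow_resize_to_same_host:
        ignore_hosts.append(current_host)
    # Mirrors the second check: a pure migration also excludes the current
    # host unless migrating to the same host is allowed.
    if (not flavor_id and not allow_migrate_to_same_host
            and current_host not in ignore_hosts):
        ignore_hosts.append(current_host)

    return ResizePlan(action=action, ignore_hosts=ignore_hosts)
```

For example, `plan_resize("node1", "2")` yields a resize that excludes node1, while `plan_resize("node1", "2", allow_resize_to_same_host=True)` leaves the scheduler free to pick node1 again, which is the single-host devstack case I use later.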
3. nova-conductor: calls migrate_server
nova/conductor/manager.py # I removed the error-handling code
    def migrate_server(self, context, instance, scheduler_hint, live, rebuild,
                       flavor, block_migration, disk_over_commit,
                       reservations=None):
        if live and not rebuild and not flavor:
            self._live_migrate(context, instance, scheduler_hint,
                               block_migration, disk_over_commit)
        elif not live and not rebuild and flavor:
            # Resize/Cold Migration passes live=False
            instance_uuid = instance['uuid']
            with compute_utils.EventReporter(context, 'cold_migrate',
                                             instance_uuid):
                self._cold_migrate(context, instance, flavor,
                                   scheduler_hint['filter_properties'],
                                   reservations)

    def _cold_migrate(self, context, instance, flavor, filter_properties,
                      reservations):
        try:
            scheduler_utils.populate_retry(filter_properties,
                                           instance['uuid'])
            # RPC call to the scheduler to select a host
            hosts = self.scheduler_rpcapi.select_destinations(
                    context, request_spec, filter_properties)
            host_state = hosts[0]
        except exception.NoValidHost as ex:
            return

        try:
            # RPC cast to prep_resize on the destination host's compute service
            self.compute_rpcapi.prep_resize(
                context, image, instance, flavor, host, reservations,
                request_spec=request_spec,
                filter_properties=filter_properties, node=node)
        except Exception as ex:
            with excutils.save_and_reraise_exception():
                # the rest of this dict was trimmed in the excerpt
                updates = {'vm_state': instance['vm_state']}
                quotas.rollback()

4. compute: calls prep_resize on the compute service
nova/compute/manager.py:
    def _prep_resize(self, context, image, instance, instance_type,
                     quotas, request_spec, filter_properties, node):
        rt = self._get_resource_tracker(node)
        with rt.resize_claim(context, instance, instance_type,
                             limits=limits) as claim:
            LOG.audit(_('Migrating'), context=context, instance=instance)
            # Preparation happens on the destination host; then RPC cast to
            # the source host to run resize_instance
            self.compute_rpcapi.resize_instance(
                context, instance, claim.migration, image,
                instance_type, quotas.reservations)
    def resize_instance(self, context, instance, image,
                        reservations, migration, instance_type):
        """Starts the migration of a running instance to another host."""
        quotas = quotas_obj.Quotas.from_reservations(context,
                                                     reservations,
                                                     instance=instance)
        with self._error_out_instance_on_exception(context, instance,
                                                   quotas=quotas):
            if not instance_type:
                instance_type = objects.Flavor.get_by_id(
                    context, migration['new_instance_type_id'])

            network_info = self._get_instance_nw_info(context, instance)

            migration.status = 'migrating'
            migration.save(context.elevated())

            instance.task_state = task_states.RESIZE_MIGRATING
            instance.save(expected_task_state=task_states.RESIZE_PREP)

            # then back to the destination host to finish and boot the VM
            self.compute_rpcapi.finish_resize(context, instance,
                    migration, image, disk_info,
                    migration.dest_compute, reservations=quotas.reservations)

            self._notify_about_instance_usage(context, instance, "resize.end",
                                              network_info=network_info)
            self.instance_events.clear_events_for_instance(instance)
    def finish_resize(self, context, disk_info, image, instance,
                      reservations, migration):
        try:
            self._finish_resize(context, instance, migration,
                                disk_info, image)
        # (except clauses omitted in this excerpt)
    def _finish_resize(self, context, instance, migration, disk_info, image):
        # NOTE(mriedem): If the original vm_state was STOPPED, we don't
        # automatically power on the instance after it's migrated
        power_on = old_vm_state != vm_states.STOPPED

        # on the destination host, call libvirt to finish copying the image
        # and resources
        self.driver.finish_migration(context, migration, instance,
                                     disk_info, network_info, image,
                                     resize_instance, block_device_info,
                                     power_on)

        migration.status = 'finished'
        migration.save(context.elevated())

        instance.vm_state = vm_states.RESIZED
        instance.task_state = None

nova/virt/libvirt/driver.py
    def finish_migration(self, context, migration, instance, disk_info,
                         network_info, image_meta, resize_instance,
                         block_device_info=None, power_on=True):
        LOG.debug("Starting finish_migration", instance=instance)

        # resize disks. only "disk" and "disk.local" are necessary.
        disk_info = jsonutils.loads(disk_info)
        for info in disk_info:
            size = self._disk_size_from_instance(instance, info)
            self._disk_resize(info, size)

        disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
                                            instance,
                                            block_device_info,
                                            image_meta)
        # assume _create_image do nothing if a target file exists.
        self._create_image(context, instance,
                           disk_mapping=disk_info['mapping'],
                           network_info=network_info,
                           block_device_info=None, inject_files=False)
        xml = self._get_guest_xml(context, instance, network_info, disk_info,
                                  block_device_info=block_device_info,
                                  write_to_disk=True)
        # the VM is created here
        self._create_domain_and_network(context, xml, instance, network_info,
                                        block_device_info, power_on)
        if power_on:  # wait for power on
            timer = loopingcall.FixedIntervalLoopingCall(
                                                    self._wait_for_running,
                                                    instance)
            timer.start(interval=0.5).wait()

OK, at this point the VM has been migrated. In short: the VM is first marked as being resized or migrated and becomes unavailable, then its data is copied to the destination host, and the VM is created on that host. The process is not finished yet, though: the instance is left in the RESIZED state, and neither copy should be relied on. You still need to confirm with nova resize-confirm test. The confirm deletes the resources on the source side and updates the state of the new VM. After that, the new VM can be used.
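The state changes scattered through the excerpts above can be collected into one small sketch. This is a toy model, not Nova code: `ToyInstance`, `walk_resize` and `UnexpectedTaskState` are names I made up, and the string values are only illustrative stand-ins for `task_states.RESIZE_PREP`, `task_states.RESIZE_MIGRATING` and `vm_states.RESIZED`. It only covers the transitions visible in the excerpts, and it imitates the `expected_task_state` guard used in `instance.save(...)`.

```python
from typing import List, Optional, Tuple

# Illustrative stand-ins for the constants referenced in the excerpts.
RESIZE_PREP = "resize_prep"
RESIZE_MIGRATING = "resize_migrating"
RESIZED = "resized"


class UnexpectedTaskState(Exception):
    """Raised when a transition starts from a state other than the expected one."""


class ToyInstance:
    def __init__(self) -> None:
        self.vm_state = "active"
        self.task_state: Optional[str] = None
        self.history: List[Tuple[str, Optional[str]]] = []

    def save(self, task_state: Optional[str],
             expected_task_state: Optional[str],
             vm_state: Optional[str] = None) -> None:
        # Refuse the update if another step has not run yet (or ran twice).
        if self.task_state != expected_task_state:
            raise UnexpectedTaskState(
                "expected %r, found %r" % (expected_task_state, self.task_state))
        self.task_state = task_state
        if vm_state is not None:
            self.vm_state = vm_state
        self.history.append((self.vm_state, self.task_state))


def walk_resize(instance: ToyInstance) -> List[Tuple[str, Optional[str]]]:
    # nova-api: compute API marks the instance as being prepared for resize.
    instance.save(RESIZE_PREP, expected_task_state=None)
    # source compute: resize_instance moves on only from RESIZE_PREP.
    instance.save(RESIZE_MIGRATING, expected_task_state=RESIZE_PREP)
    # destination compute: _finish_resize clears the task state and sets
    # vm_state to RESIZED, which is where resize-confirm picks up.
    instance.save(None, expected_task_state=RESIZE_MIGRATING, vm_state=RESIZED)
    return instance.history
```

Calling `walk_resize(ToyInstance())` returns the three (vm_state, task_state) pairs in order, and calling the steps out of order raises `UnexpectedTaskState`, which is roughly the protection the expected-state check gives against two operations racing on the same instance.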
VM state changes:
I did a resize on the same host, with allow_resize_to_same_host = True in /etc/nova/nova.conf.
The data changes are described below (I use devstack with only one host):
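Source host:
1. Shut down the instance being migrated.
2. Under /opt/stack/data/nova/instance, the directory 5b7dab63-314d-41dd-be22-8a2807d6743c is renamed (mv) to 5b7dab63-314d-41dd-be22-8a2807d6743c_resize. The image migration is then based on the files in this _resize directory, which is also kept as a backup after the image has been migrated, for example to roll back on failure. Next, the new directory 9dd5d463-f9a9-4173-a3fb-91ce95ccac7b is created on the destination host: with shared storage a plain mkdir is enough; with non-shared storage the directory has to be created on the destination host over ssh.
Note: how shared storage is detected:
a. Check whether the destination host is the local host itself, and whether a temporary file can be created on the destination host over ssh. If neither holds, the storage is non-shared.
3. Now the real work: convert the format (to raw) and copy the image (raw disk file) into the newly created 5b7dab63-314d-41dd-be22-8a2807d6743c directory.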
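The shared-storage note above can be illustrated with a rough sketch. It rests on an assumption of mine about how such a probe works: a marker file is created through the destination host, and the storage counts as shared if that file then shows up in the local instance directory. `storage_looks_shared` and `directory_stand_in` are hypothetical helpers; the stand-in replaces the ssh step with plain local file operations so the sketch runs on one machine.

```python
import os
import uuid
from typing import Callable, Tuple


def storage_looks_shared(local_dir: str,
                         create_on_dest: Callable[[str], None],
                         remove_on_dest: Callable[[str], None]) -> bool:
    """Create a marker file via the destination and look for it locally."""
    marker = "shared-check-" + uuid.uuid4().hex
    create_on_dest(marker)
    try:
        # If both hosts see the same directory, the marker is visible here.
        return os.path.exists(os.path.join(local_dir, marker))
    finally:
        # Always clean up the marker, whatever the outcome.
        remove_on_dest(marker)


def directory_stand_in(dest_dir: str) -> Tuple[Callable[[str], None],
                                               Callable[[str], None]]:
    """Stand-in for 'ssh to the destination and touch/rm a file'.

    It acts on a local directory that plays the role of the destination
    host's instance directory.
    """
    def create(name: str) -> None:
        with open(os.path.join(dest_dir, name), "w"):
            pass

    def remove(name: str) -> None:
        os.remove(os.path.join(dest_dir, name))

    return create, remove
```

With two different temporary directories the probe reports non-shared storage; pointing both sides at the same directory reports shared storage. In a real deployment the two callables would run commands on the destination host instead.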
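Destination host:
If the flavor changed, the instance properties are updated accordingly, which shows up in the final libvirt.xml. The disk is resized with qemu-img resize, the network and related devices are configured, and the instance is created with the same UUID.
In the libvirt utils.py, the image migration itself is nothing more than cp or scp (rsync is tried first for remote copies):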
    def copy_image(src, dest, host=None):
        """Copy a disk image to an existing directory

        :param src: Source image
        :param dest: Destination path
        :param host: Remote host
        """
        if not host:
            execute('cp', src, dest)
        else:
            dest = "%s:%s" % (host, dest)
            try:
                execute('rsync', '--sparse', '--compress', '--dry-run',
                        src, dest)
            except processutils.ProcessExecutionError:
                execute('scp', src, dest)
            else:
                execute('rsync', '--sparse', '--compress', src, dest)
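The choice between cp, rsync and scp in copy_image can be tried out without touching any files. The sketch below only builds the command that would be run; `plan_copy` and its `rsync_works` parameter are hypothetical names of mine, with `rsync_works` standing in for the `--dry-run` probe in the excerpt.

```python
from typing import Callable, List, Optional


def plan_copy(src: str, dest: str, host: Optional[str] = None,
              rsync_works: Callable[[List[str]], bool] = lambda cmd: True
              ) -> List[str]:
    """Return the command line that would copy the image, without running it."""
    if not host:
        # Same host (or shared storage): a plain local copy.
        return ["cp", src, dest]

    remote = "%s:%s" % (host, dest)
    # Probe first, as the excerpt does with a dry run.
    probe = ["rsync", "--sparse", "--compress", "--dry-run", src, remote]
    if rsync_works(probe):
        return ["rsync", "--sparse", "--compress", src, remote]
    # Fall back to scp when the rsync probe fails.
    return ["scp", src, remote]
```

For instance, `plan_copy("/a/disk", "/b/disk")` gives a cp command, and `plan_copy("/a/disk", "/b/disk", host="node2", rsync_works=lambda cmd: False)` falls back to scp.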
Live Migration
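Up to the conductor service, the flow is the same as for Resize. Live Migration has to keep the VM running. If the image is on shared storage, only memory needs to be copied iteratively, but some preconditions must hold, for example the source host's CPU feature set must be a subset of the destination's, and so on.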
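The CPU precondition boils down to a subset test. The sketch below is a deliberate simplification under my own assumptions: it treats each host's CPU as a flat list of feature flags, and `missing_cpu_features` and `cpu_compatible` are hypothetical helper names, not functions from Nova or libvirt.

```python
from typing import Iterable, List


def missing_cpu_features(source_flags: Iterable[str],
                         dest_flags: Iterable[str]) -> List[str]:
    """Features the guest may be using on the source that the destination lacks."""
    return sorted(set(source_flags) - set(dest_flags))


def cpu_compatible(source_flags: Iterable[str],
                   dest_flags: Iterable[str]) -> bool:
    # Compatible when the source feature set is a subset of the destination's.
    return not missing_cpu_features(source_flags, dest_flags)
```

Migrating from a host with `["sse4_2", "avx"]` to one with `["sse4_2", "avx", "avx2"]` passes, while the reverse direction fails with `avx2` reported as missing.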
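OpenStack requires the destination host to be known for Live Migration, unlike Cold Migration, which uses the scheduler to select a host.
Because live migration is meant to keep the VM's workload running, it is generally done with shared storage. Two factors then dominate: the rate at which the VM dirties memory pages (the iterative copy works page by page), and the network bandwidth. Live migration does not mean zero interruption: during the brief period in which the VM is paused, the last memory copy has to complete quickly. Pausing a VM in the hypervisor essentially means changing vCPU scheduling so that the VM temporarily gets no physical CPU. To the user, the VM appears unresponsive for an instant.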
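The interplay between dirty-page rate and bandwidth can be made concrete with a back-of-the-envelope model. This is a simplified sketch under my own assumptions, not how any hypervisor actually implements it: the dirty rate and bandwidth are constant, each round re-sends everything dirtied during the previous round, and the VM is paused as soon as the remaining data can be sent within an allowed downtime. `simulate_precopy` and `PrecopyResult` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PrecopyResult:
    converged: bool          # True if the final stop-and-copy fits the budget
    rounds: int              # live copy rounds performed before pausing
    final_downtime_s: float  # time to send what is left while paused


def simulate_precopy(memory_mb: float, dirty_rate_mb_s: float,
                     bandwidth_mb_s: float, max_downtime_s: float,
                     max_rounds: int = 30) -> PrecopyResult:
    if bandwidth_mb_s <= 0:
        raise ValueError("bandwidth must be positive")

    remaining_mb = float(memory_mb)   # first round sends all memory
    rounds = 0
    while True:
        send_time_s = remaining_mb / bandwidth_mb_s
        if send_time_s <= max_downtime_s:
            # Small enough: pause the VM and send the rest in one go.
            return PrecopyResult(True, rounds, send_time_s)
        if rounds >= max_rounds:
            # Dirtying keeps up with the copy; give up in this model.
            return PrecopyResult(False, rounds, send_time_s)
        # Copy while the VM runs; pages dirtied meanwhile must be re-sent,
        # but never more than the whole memory.
        remaining_mb = min(float(memory_mb), dirty_rate_mb_s * send_time_s)
        rounds += 1
```

With 4096 MB of memory, 100 MB/s of dirtying, 1000 MB/s of bandwidth and a 0.3 s downtime budget, the model pauses the VM after two live rounds. If the dirty rate exceeds the bandwidth, the remaining data never shrinks and the model reports no convergence, which matches the intuition that both factors decide whether a live migration can finish.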
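I have not written anything for a long time, so this is poorly organized and I was not sure what to cover. It is just a record of my learning, to reinforce my memory so that I am not caught out and looked down on when asked about it later.
I am heading back to Hangzhou this afternoon and time is limited. There is too much material, so I will continue in a follow-up post.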