OpenStack Liberty Instance Migration Source Code Analysis: Cold Migration, Part 1
Source: Internet · Editor: 程序博客网 · Time: 2024/05/18 10:54
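Virtual machine migration makes resource allocation more flexible, and live migration in particular improves the availability and reliability of virtual machines. OpenStack Liberty provides two kinds of migration: cold migration and live migration. Over the next few articles I will analyze in detail how both are implemented, starting with cold migration.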
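Because of its length, the source analysis of cold migration is split into two articles:
- Part 1: the work done by `nova-api` and `nova-conductor` during a migration
- Part 2: the processing done by `nova-compute`
This is part 1.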
Initiating a migration
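Users can trigger an instance migration manually with the `nova` CLI:

    # nova --debug migrate 52e4d485-6ccf-47f3-a754-b62649e7b256

This command migrates the instance with id=52e4d485-6ccf-47f3-a754-b62649e7b256 to another, optimal `nova-compute` node. The `--debug` option prints the execution log: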

    ......curl -g -i -X POST http://controller:8774/v2/eab72784b36040a186a6b88dac9ac0b2/servers/5a7d302f-f388-4ffb-af37-f1e6964b3a51/action -H "User-Agent: python-novaclient" -H "Content-Type: application/json" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}8e294a111a5deaa45f6cb0f3c58a600d2b1b0493" -d '{"migrate": null}'......

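The excerpted log shows that `novaclient` sends the migration request to `nova-api` over HTTP and invokes the `migrate` action. From the route mappings that `nova-api` builds at startup, it is easy to see that the entry point for this action is `nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate`, which is analyzed below.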
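To make the shape of that request concrete, here is a minimal sketch that builds (but does not send) an equivalent POST with Python's standard library. The helper name `build_migrate_request` and the placeholder token are illustrative assumptions and are not part of novaclient; the URL layout and the body are taken from the log above.

```python
import json
import urllib.request


def build_migrate_request(endpoint, project_id, server_id, token):
    """Build the server-action POST shown in the debug log (hypothetical helper)."""
    url = "{}/v2/{}/servers/{}/action".format(endpoint, project_id, server_id)
    # The action name is the single key of the JSON body; migrate takes no arguments.
    body = json.dumps({"migrate": None}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "X-Auth-Token": token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_migrate_request(
    "http://controller:8774",
    "eab72784b36040a186a6b88dac9ac0b2",
    "52e4d485-6ccf-47f3-a754-b62649e7b256",
    "<token>",  # placeholder, not a real token
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The key of the JSON body (`migrate`) is what `nova-api` maps to the `_migrate` handler; the server id in the URL becomes the `id` argument.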
Source code analysis
The `nova-api` part
As analyzed above, the migration entry point is:

    # nova/api/openstack/compute/migrate_server.py/MigrateServerController._migrate
    # (decorator definitions omitted)
    def _migrate(self, req, id, body):
        """Permit admins to migrate a server to a new host.

        req  is the Request object holding the information for this request
        id   is the id of the instance to migrate, e.g.
             52e4d485-6ccf-47f3-a754-b62649e7b256
        body is the request parameters: {"migrate": null}
        """
        # Extract the request context from the Request object
        context = req.environ['nova.context']

        # Perform the authorization check. By default the rules are read from
        # /etc/nova/policy.json on the host; if no matching rule is defined,
        # authorization fails and an exception is raised.
        # The rule that applies here is:
        #   "os_compute_api:os_migrate_server:migrate": "rule:admin_api"
        authorize(context, action='migrate')

        # Fetch the instance identified by id from the nova database;
        # returns an InstanceV2 object
        instance = common.get_instance(self.compute_api, context, id)

        # Exception handling omitted: an exception is raised if the instance
        # does not exist, no suitable destination host is found, the instance
        # is locked, resources are insufficient, or the instance is in the
        # wrong state (it must be running or stopped).
        #
        # Like the "resize instance" operation, migration is carried out by
        # `/nova/compute/api.py/API.resize`. resize decides between "resize"
        # and "migrate" by checking whether flavor_id was passed; see the
        # analysis below.
        self.compute_api.resize(req.environ['nova.context'], instance)

    # ---------------------------------------------------------------
    # Continued from above: /nova/compute/api.py/API.resize
    # (decorator definitions omitted)
    def resize(self, context, instance, flavor_id=None, clean_shutdown=True,
               **extra_instance_updates):
        """Resize (ie, migrate) a running instance.

        If flavor_id is None, the process is considered a migration, keeping
        the original flavor_id. If flavor_id is not None, the instance should
        be migrated to a new host and resized to the new flavor_id.

        context        the request context
        instance       InstanceV2 object with the instance's detailed
                       configuration
        flavor_id      flavor id; None here because this is a migration
        clean_shutdown True: a cold migration retries a clean shutdown and
                       raises if the instance cannot be shut down cleanly
        """
        # Check the "auto disk config" setting for the system disk and raise
        # if it is not allowed; after the migration the instance must be able
        # to configure its system disk automatically
        self._check_auto_disk_config(instance, **extra_instance_updates)

        # Get the instance's current flavor
        current_instance_type = instance.get_flavor()

        # If flavor_id is not provided, only migrate the instance.
        # flavor_id = None means migration: log it and reuse the current
        # flavor as the flavor of the migrated instance
        if not flavor_id:
            LOG.debug("flavor_id is None. Assuming migration.",
                      instance=instance)
            new_instance_type = current_instance_type
        else:
            # Load the flavor identified by flavor_id from the
            # nova.instance_types table; read_deleted="no" filters out
            # flavors that have been deleted
            new_instance_type = flavors.get_flavor_by_flavor_id(
                flavor_id, read_deleted="no")
            # If the instance was booted from an image, the current flavor
            # has root_gb != 0 and the target flavor has root_gb == 0, the
            # resize is not supported because the size of the system disk
            # could no longer be determined; raise
            if (new_instance_type.get('root_gb') == 0 and
                    current_instance_type.get('root_gb') != 0 and
                    not self.is_volume_backed_instance(context, instance)):
                reason = _('Resize to zero disk flavor is not allowed.')
                raise exception.CannotResizeDisk(reason=reason)

        # Raise if the requested flavor was not found
        if not new_instance_type:
            raise exception.FlavorNotFound(flavor_id=flavor_id)

        # Log at debug level
        current_instance_type_name = current_instance_type['name']
        new_instance_type_name = new_instance_type['name']
        LOG.debug("Old instance type %(current_instance_type_name)s, "
                  " new instance type %(new_instance_type_name)s",
                  {'current_instance_type_name': current_instance_type_name,
                   'new_instance_type_name': new_instance_type_name},
                  instance=instance)

        # Is it the same flavor? For a migration it always is
        same_instance_type = (current_instance_type['id'] ==
                              new_instance_type['id'])

        # NOTE(sirp): We don't want to force a customer to change their flavor
        # when Ops is migrating off of a failed host.
        # For a resize, raise if the new flavor has been disabled
        if not same_instance_type and new_instance_type.get('disabled'):
            raise exception.FlavorNotFound(flavor_id=flavor_id)

        # Cells are off by default, so cell_type = None.
        # A resize to the same flavor is rejected because it is pointless
        if same_instance_type and flavor_id and self.cell_type != 'compute':
            raise exception.CannotResizeToSameFlavor()

        # ensure there is sufficient headroom for upsizes
        # A resize must reserve quota first
        if flavor_id:
            # vcpu and memory quota deltas (the difference between the new
            # and current flavor, if any)
            deltas = compute_utils.upsize_quota_delta(context,
                                                      new_instance_type,
                                                      current_instance_type)
            try:
                # Reserve the (delta) quota for the current user and project
                # and update the database
                quotas = compute_utils.reserve_quota_delta(context, deltas,
                                                           instance)
            except exception.OverQuota as exc:
                # Collect the over-quota details and log them
                quotas = exc.kwargs['quotas']
                overs = exc.kwargs['overs']
                usages = exc.kwargs['usages']
                headroom = self._get_headroom(quotas, usages, deltas)
                (overs, reqs, total_alloweds,
                 useds) = self._get_over_quota_detail(headroom, overs,
                                                      quotas, deltas)
                LOG.warning(_LW("%(overs)s quota exceeded for %(pid)s,"
                                " tried to resize instance."),
                            {'overs': overs, 'pid': context.project_id})
                raise exception.TooManyInstances(overs=overs,
                                                 req=reqs,
                                                 used=useds,
                                                 allowed=total_alloweds)
        # Migration: no extra resources need to be reserved
        else:
            quotas = objects.Quotas(context=context)

        # Update the instance state: the task state becomes RESIZE_PREP
        # (preparing to resize or migrate)
        instance.task_state = task_states.RESIZE_PREP
        instance.progress = 0
        instance.update(extra_instance_updates)
        instance.save(expected_task_state=[None])

        # Build the filter options for nova-scheduler.
        # CONF.allow_resize_to_same_host = true allows the destination host
        # to be the source host; otherwise the source host is filtered out
        filter_properties = {'ignore_hosts': []}
        if not CONF.allow_resize_to_same_host:
            filter_properties['ignore_hosts'].append(instance.host)

        # cell_type = None by default
        if self.cell_type == 'api':
            # Commit reservations early and create migration record.
            self._resize_cells_support(context, quotas, instance,
                                       current_instance_type,
                                       new_instance_type)

        # flavor_id = None means migrate, otherwise resize.
        # Record the instance action in the nova.instance_actions table; the
        # record is updated when the migration ends to reflect the result
        if not flavor_id:
            self._record_action_start(context, instance,
                                      instance_actions.MIGRATE)
        else:
            self._record_action_start(context, instance,
                                      instance_actions.RESIZE)

        # Forward the migration request to
        # `/nova/conductor/api.py/ComputeTaskAPI.resize_instance`, which
        # directly calls
        # `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server` to handle
        # it; see the analysis below
        scheduler_hint = {'filter_properties': filter_properties}
        self.compute_task_api.resize_instance(
            context, instance, extra_instance_updates,
            scheduler_hint=scheduler_hint,
            flavor=new_instance_type,
            reservations=quotas.reservations or [],
            clean_shutdown=clean_shutdown)

    # ------------------------------------------------------------
    # Continued from above:
    # `nova/conductor/rpcapi.py/ComputeTaskAPI.migrate_server`
    def migrate_server(self, context, instance, scheduler_hint, live, rebuild,
                       flavor, block_migration, disk_over_commit,
                       reservations=None, clean_shutdown=True):
        # The arguments passed in are:
        #   live = False              cold migration
        #   rebuild = False           not a rebuild operation
        #   block_migration = None    not a block migration
        #   disk_over_commit = None
        #   reservations = []         a migration reserves no delta quota

        # Build the request argument dict
        kw = {'instance': instance, 'scheduler_hint': scheduler_hint,
              'live': live, 'rebuild': rebuild, 'flavor': flavor,
              'block_migration': block_migration,
              'disk_over_commit': disk_over_commit,
              'reservations': reservations,
              'clean_shutdown': clean_shutdown}

        # Pick the message version according to what the RPCClient is able
        # to send; the compatibility settings are configured when rpc is
        # initialized
        version = '1.11'
        if not self.client.can_send_version(version):
            del kw['clean_shutdown']
            version = '1.10'
        if not self.client.can_send_version(version):
            kw['flavor'] = objects_base.obj_to_primitive(flavor)
            version = '1.6'
        if not self.client.can_send_version(version):
            kw['instance'] = jsonutils.to_primitive(
                objects_base.obj_to_primitive(instance))
            version = '1.4'

        # Send the `migrate_server` message to rabbitmq with a synchronous
        # rpc call; the consumer, `nova-conductor`, receives the message
        cctxt = self.client.prepare(version=version)
        return cctxt.call(context, 'migrate_server', **kw)

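The version fallback at the end of `migrate_server` can be tried out in isolation. The following is a minimal sketch, not nova code: `make_can_send` is a hypothetical stand-in for `RPCClient.can_send_version` that simply compares against a version cap, and `vars` stands in for the object-to-primitive conversion.

```python
from types import SimpleNamespace


def parse(version):
    """Turn '1.11' into (1, 11) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def make_can_send(cap):
    """Hypothetical stand-in for can_send_version: allow versions up to cap."""
    return lambda version: parse(version) <= parse(cap)


def downgrade(kw, can_send_version, to_primitive=vars):
    """Walk down the version ladder, adapting kwargs for older peers."""
    kw = dict(kw)
    version = "1.11"
    if not can_send_version(version):
        del kw["clean_shutdown"]                       # unknown before 1.11
        version = "1.10"
    if not can_send_version(version):
        kw["flavor"] = to_primitive(kw["flavor"])      # plain dict before 1.10
        version = "1.6"
    if not can_send_version(version):
        kw["instance"] = to_primitive(kw["instance"])  # plain dict before 1.6
        version = "1.4"
    return version, kw


request = {
    "instance": SimpleNamespace(uuid="52e4d485"),
    "flavor": SimpleNamespace(id=1),
    "clean_shutdown": True,
}
for cap in ("1.11", "1.10", "1.8", "1.4"):
    version, kw = downgrade(request, make_can_send(cap))
    print(cap, "->", version, sorted(kw))
```

Each failed check strips or simplifies one argument before trying the next older version, so the message that is finally sent only contains what the receiving side can understand.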
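Summary: `nova-api` mainly checks the instance state and the related preconditions, then updates the instance state and adds a record to the `nova.instance_actions` table, and finally forwards the request to `nova-conductor` through a synchronous RPC call.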
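The core decision in `API.resize` (migrate when no flavor is given, resize otherwise, and exclude the source host unless configured otherwise) can be condensed into a few lines. This is a simplified sketch under stated assumptions: `plan_resize` and its arguments are hypothetical names, flavors are plain dicts, and quota handling, state updates and the cells-specific exception are left out.

```python
def plan_resize(instance_host, current_flavor, new_flavor=None,
                allow_resize_to_same_host=False):
    """Return (action, target flavor, filter_properties) for a request."""
    if new_flavor is None:
        # No flavor given: a migration keeps the current flavor.
        action, target = "migrate", current_flavor
    else:
        if new_flavor["id"] == current_flavor["id"]:
            # Mirrors CannotResizeToSameFlavor in the real code.
            raise ValueError("cannot resize to the same flavor")
        action, target = "resize", new_flavor

    # The scheduler skips every host listed in ignore_hosts.
    filter_properties = {"ignore_hosts": []}
    if not allow_resize_to_same_host:
        filter_properties["ignore_hosts"].append(instance_host)
    return action, target, filter_properties


small = {"id": 1, "name": "m1.small"}
large = {"id": 2, "name": "m1.large"}
print(plan_resize("compute1", small))
print(plan_resize("compute1", small, large, allow_resize_to_same_host=True))
```

With the default setting a migration can never land on the host the instance already runs on, which is what the `ignore_hosts` entry guarantees.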
The `nova-conductor` part
From the analysis above, it is easy to see where `nova-conductor` starts handling the migration request:

    # /nova/conductor/manager.py/ComputeTaskManager.migrate_server
    def migrate_server(self, context, instance, scheduler_hint, live, rebuild,
                       flavor, block_migration, disk_over_commit,
                       reservations=None, clean_shutdown=True):
        # The arguments come from `nova-api`:
        #   scheduler_hint            scheduling options,
        #                             {u'filter_properties':
        #                                 {u'ignore_hosts': []}}
        #   live = False              cold migration
        #   rebuild = False           not a rebuild operation
        #   block_migration = None    not a block migration
        #   disk_over_commit = None
        #   reservations = []         a migration reserves no delta quota

        # If the instance argument is not a valid NovaObject, first build an
        # InstanceV2 object from the database-style record that was passed in
        if instance and not isinstance(instance, nova_object.NovaObject):
            # NOTE(danms): Until v2 of the RPC API, we need to tolerate
            # old-world instance objects here
            attrs = ['metadata', 'system_metadata', 'info_cache',
                     'security_groups']
            instance = objects.Instance._from_db_object(
                context, objects.Instance(), instance,
                expected_attrs=attrs)
        # NOTE: Remove this when we drop support for v1 of the RPC API
        # If the flavor argument is not a valid Flavor object, load the
        # flavor with that id from the database to build a Flavor object
        if flavor and not isinstance(flavor, objects.Flavor):
            # Code downstream may expect extra_specs to be populated since it
            # is receiving an object, so lookup the flavor to ensure this.
            flavor = objects.Flavor.get_by_id(context, flavor['id'])

        # Live migration, covered in a separate article
        if live and not rebuild and not flavor:
            self._live_migrate(context, instance, scheduler_hint,
                               block_migration, disk_over_commit)
        # Cold migration: call _cold_migrate, analyzed below
        elif not live and not rebuild and flavor:
            instance_uuid = instance.uuid
            # The with statement records a migration event in the database
            # (nova.instance_actions_events) before the migration and updates
            # that record after the migration
            with compute_utils.EventReporter(context, 'cold_migrate',
                                             instance_uuid):
                self._cold_migrate(context, instance, flavor,
                                   scheduler_hint['filter_properties'],
                                   reservations, clean_shutdown)
        # Unknown request type
        else:
            raise NotImplementedError()

    # -------------------------------------------------------------
    # Continued from above:
    def _cold_migrate(self, context, instance, flavor, filter_properties,
                      reservations, clean_shutdown):
        # Get the information about the image in use from the instance
        # object, for example:
        #   {u'min_disk': u'20', u'container_format': u'bare',
        #    u'min_ram': u'0', u'disk_format': u'raw',
        #    'properties': {u'base_image_ref':
        #                   u'e0cc468f-6501-4a85-9b19-70e782861387'}}
        image = utils.get_image_from_system_metadata(
            instance.system_metadata)

        # Build the request spec dict from the image properties, the instance
        # properties and the flavor. Its format is:
        #   request_spec = {
        #       'image': image,
        #       'instance_properties': instance,
        #       'instance_type': flavor,
        #       'num_instances': 1}
        request_spec = scheduler_utils.build_request_spec(
            context, image, [instance], instance_type=flavor)

        # Build the migration task object,
        # `/nova/conductor/tasks/migrate.py/MigrationTask`
        task = self._build_cold_migrate_task(context, instance, flavor,
                                             filter_properties, request_spec,
                                             reservations, clean_shutdown)

        # Exception handling omitted: on errors such as no suitable
        # destination host or an invalid policy, the method exits. Before
        # exiting it updates the database, sets the instance state, logs the
        # error and sends a `compute_task.migrate_server` notification.

        # Run the migration, analyzed below
        task.execute()

    # ---------------------------------------------------------------
    # Continued from above:
    # `nova/conductor/tasks/migrate.py/MigrationTask._execute`
    def _execute(self):
        # Get the information about the image in use from the request spec
        image = self.request_spec.get('image')

        # Build the quota object from the reserved quota in
        # self.reservations; a migration reserves no quota, so
        # self.reservations = []
        self.quotas = objects.Quotas.from_reservations(self.context,
                                                       self.reservations,
                                                       instance=self.instance)

        # Add the group (group_hosts) and group policy (group_policies)
        # information to the filter properties, if there is any
        scheduler_utils.setup_instance_group(self.context, self.request_spec,
                                             self.filter_properties)

        # Add the retry parameters to the filter properties (if the
        # configured CONF.scheduler_max_attempts > 1). The filter properties
        # then look like this:
        #   {'retry': {'num_attempts': 1, 'hosts': []},
        #    u'ignore_hosts': []}
        # If this is a retry request sent back by `nova-compute`, the retry
        # dict in the incoming filter_properties holds the exception from the
        # previous attempt, and the hosts listed in `hosts` are excluded when
        # a destination is chosen again. populate_retry logs that exception,
        # and it raises if the maximum number of attempts is exceeded.
        scheduler_utils.populate_retry(
            self.filter_properties, self.instance.uuid)

        # Ask `nova-scheduler` to select a suitable destination host
        # according to the filter rules; on a timeout the request is retried
        # using the retry parameters above. On success a list of suitable
        # destination hosts is returned; if none is found, an exception is
        # raised.
        hosts = self.scheduler_client.select_destinations(
            self.context, self.request_spec, self.filter_properties)
        # Take the first one
        host_state = hosts[0]

        # Add the destination host to the retry list in the filter properties
        # (hosts in 'hosts' are ignored on a retry), for example:
        #   {'retry': {'num_attempts': 1,
        #              'hosts': [[u'devstack', u'devstack']]},
        #    'limits': {u'memory_mb': 11733.0, u'disk_gb': 1182.0},
        #    u'ignore_hosts': []}
        scheduler_utils.populate_filter_properties(
            self.filter_properties, host_state)

        # context is not serializable
        self.filter_properties.pop('context', None)

        # Send the `prep_resize` message to the message queue with an
        # asynchronous rpc call; `nova-compute` handles the request
        # (`nova/compute/rpcapi.py/ComputeAPI`)
        (host, node) = (host_state['host'], host_state['nodename'])
        self.compute_rpcapi.prep_resize(
            self.context, image, self.instance, self.flavor, host,
            self.reservations, request_spec=self.request_spec,
            filter_properties=self.filter_properties, node=node,
            clean_shutdown=self.clean_shutdown)

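The retry bookkeeping described in `_execute` is easier to follow with a small stand-alone model. The sketch below is a simplified approximation, not the nova implementation: `MaxRetriesExceeded` is a local exception class, `max_attempts` is passed in instead of being read from `CONF.scheduler_max_attempts`, and `record_chosen_host` is a hypothetical name for the part of `populate_filter_properties` that tracks hosts and limits.

```python
class MaxRetriesExceeded(Exception):
    """Local stand-in for the exception raised when attempts run out."""


def populate_retry(filter_properties, max_attempts):
    """Count one more scheduling attempt; raise once the limit is exceeded."""
    if max_attempts == 1:
        return  # rescheduling disabled, no retry bookkeeping at all
    retry = filter_properties.setdefault(
        "retry", {"num_attempts": 0, "hosts": []})
    retry["num_attempts"] += 1
    if retry["num_attempts"] > max_attempts:
        raise MaxRetriesExceeded(
            "exceeded max scheduling attempts %d" % max_attempts)


def record_chosen_host(filter_properties, host_state):
    """Remember the chosen host so that a retry will skip it."""
    retry = filter_properties.get("retry")
    if retry is not None:
        retry["hosts"].append([host_state["host"], host_state["nodename"]])
    filter_properties["limits"] = host_state.get("limits", {})


props = {"ignore_hosts": []}
populate_retry(props, max_attempts=3)
record_chosen_host(props, {"host": "devstack", "nodename": "devstack",
                           "limits": {"memory_mb": 11733.0}})
print(props)
```

Because the same `filter_properties` dict travels to `nova-compute` and back on a reschedule, the attempt counter and the list of already tried hosts survive across attempts.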
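Summary: `nova-conductor` mainly relies on `nova-scheduler` to select a suitable destination host. It also updates the `nova.instance_actions_events` table, and finally issues an asynchronous RPC call that hands the migration request over to `nova-compute`.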
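Two patterns in the conductor entry point are worth isolating: the dispatch on `(live, rebuild, flavor)` and the context manager that brackets the work with event records. The following is a minimal sketch with hypothetical names (`dispatch`, `event_reporter`, an in-memory `events` list instead of the `nova.instance_actions_events` table); it only models the control flow.

```python
from contextlib import contextmanager


def dispatch(live, rebuild, flavor):
    """Pick the handler the same way migrate_server branches."""
    if live and not rebuild and not flavor:
        return "live_migrate"
    elif not live and not rebuild and flavor:
        return "cold_migrate"
    raise NotImplementedError()


@contextmanager
def event_reporter(events, name, instance_uuid):
    """Record a start event, then a success or error event on exit."""
    events.append((name, instance_uuid, "start"))
    try:
        yield
    except Exception:
        events.append((name, instance_uuid, "error"))
        raise
    else:
        events.append((name, instance_uuid, "success"))


events = []
handler = dispatch(live=False, rebuild=False, flavor={"id": 1})
with event_reporter(events, handler, "52e4d485"):
    pass  # the cold migration itself would run here
print(events)
```

Wrapping the call in a context manager means the finish record is written on every exit path, including when the migration raises.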
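That completes the first half of the cold migration analysis. The process is fairly simple: check some preconditions, update the database records, have `nova-scheduler` select a host, and finally hand the request over to `nova-compute`. Coming next:
OpenStack Liberty Instance Migration Source Code Analysis: Cold Migration, Part 2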