Debugging resize errors on OpenStack Icehouse (Host key verification failed)

Source: Internet | Published by: 24u网络机柜 | Editor: 程序博客网 | Time: 2024/05/29 16:07

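Thank you for supporting this blog. Discussion is welcome. My ability and time are limited, so mistakes are inevitable; corrections are appreciated!

If you repost this article, please keep the original author's blog information.

Better Me's blog: blog.csdn.net/tantexian

To get in touch, leave a comment on the blog.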

On node 117, which acts as both the controller node and a compute node, run:

nova --debug resize fefe2ba2-69dc-46dc-b337-da2788d94d494

The compute log on 117 reports an error:

vim /var/log/nova/compute.log

[instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Setting instance vm_state to ERROR

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Traceback (most recent call last):

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 5531, in _error_out_instance_on_exception

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] yield

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3428, in resize_instance

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] block_device_info)

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 5059, in migrate_disk_and_power_off

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] utils.execute('ssh', dest, 'mkdir', '-p', inst_base)  # author's note: this command needs passwordless (key-based) SSH login

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] File "/usr/lib/python2.6/site-packages/nova/utils.py", line 164, in execute

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] return processutils.execute(*cmd, **kwargs)

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] File "/usr/lib/python2.6/site-packages/nova/openstack/common/processutils.py", line 193, in execute

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] cmd=' '.join(cmd))

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] ProcessExecutionError: Unexpected error while running command.

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Command: ssh 192.168.10.114 mkdir -p /var/lib/nova/instances/b4d33c9d-c8b1-49e4-9f50-91f845d4115f

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Exit code: 255

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Stdout: ''

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f] Stderr: 'Host key verification failed.\r\n'

2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager [instance: b4d33c9d-c8b1-49e4-9f50-91f845d4115f]

2015-04-22 16:28:22.818 3802 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: Unexpected error while running command.

Command: ssh 192.168.10.114 mkdir -p /var/lib/nova/instances/b4d33c9d-c8b1-49e4-9f50-91f845d4115f

Exit code: 255

Stdout: ''

Stderr: 'Host key verification failed.\r\n'

2015-04-22 16:28:22.818 3802 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):

The error above shows that the following command fails when it is run as the nova user on 117:

ssh 192.168.10.114 mkdir -p /var/lib/nova/instances/b4d33c9d-c8b1-49e4-9f50-91f845d4115f
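With a trace this long, it can help to pull out only the failed command, its exit code, and its stderr. The following is a minimal Python sketch, assuming the log lines use the layout shown above. The name extract_failure is made up for this illustration and is not part of nova.

```python
import re

# Field labels that appear in the ProcessExecutionError output shown above.
FIELD_RE = re.compile(r"(Command|Exit code|Stdout|Stderr): (.*)$")


def extract_failure(log_lines):
    """Return the first value seen for each field, keyed by field label."""
    found = {}
    for line in log_lines:
        match = FIELD_RE.search(line)
        if match and match.group(1) not in found:
            found[match.group(1)] = match.group(2).strip()
    return found


sample = [
    "2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager "
    "Command: ssh 192.168.10.114 mkdir -p /var/lib/nova/instances/"
    "b4d33c9d-c8b1-49e4-9f50-91f845d4115f",
    "2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager Exit code: 255",
    r"2015-04-22 16:28:22.435 3802 TRACE nova.compute.manager "
    r"Stderr: 'Host key verification failed.\r\n'",
]
for field, value in extract_failure(sample).items():
    print(field, "=>", value)
```

Running the extracted command by hand as the nova user is then the quickest way to reproduce the failure outside of nova.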

Look at the user account file on 117:

vim /etc/passwd

It contains this entry:

nova:x:162:162:OpenStack Nova Daemons:/var/lib/nova:/sbin/nologin

For the meaning of each field, look up the Linux passwd documentation.
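As a quick illustration of those fields, here is a small Python sketch that splits the entry above into its seven colon-separated parts and checks whether the login shell permits logging in. The helper names and the list of "no login" shells are assumptions chosen for this example.

```python
from collections import namedtuple

# The seven colon-separated fields of an /etc/passwd entry.
PasswdEntry = namedtuple(
    "PasswdEntry", "name password uid gid gecos home shell")

# Shells commonly used to block interactive login (assumed list).
NOLOGIN_SHELLS = ("/sbin/nologin", "/usr/sbin/nologin", "/bin/false")


def parse_passwd_line(line):
    """Split one passwd line into a PasswdEntry with numeric uid and gid."""
    name, password, uid, gid, gecos, home, shell = line.strip().split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


def can_login(entry):
    """Return True when the entry's shell is not a known no-login shell."""
    return entry.shell not in NOLOGIN_SHELLS


entry = parse_passwd_line(
    "nova:x:162:162:OpenStack Nova Daemons:/var/lib/nova:/sbin/nologin")
print(entry.home, entry.shell, can_login(entry))
# prints: /var/lib/nova /sbin/nologin False
```

The last field is the one that matters here: with /sbin/nologin, any attempt to open a session as nova is refused.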

Change nova into a user that is allowed to log in:

nova:x:162:162:OpenStack Nova Daemons:/var/lib/nova:/bin/bash

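Switch to the nova user and generate an RSA key pair:

su - nova

ssh-keygen -t rsa

Then copy the generated public key to node 114 with scp: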

scp /var/lib/nova/.ssh/id_rsa.pub root@192.168.10.114:/var/lib/nova/.ssh/authorized_keys

If scp reports that /var/lib/nova/.ssh/ does not exist on the remote side, create that directory on the remote machine as the nova user.
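Note that the scp command above replaces any existing authorized_keys file, and because it connects as root the resulting file belongs to root. A gentler alternative is to append the key on 114 while logged in as nova. The following Python 3 sketch shows the idea; install_public_key is a hypothetical helper written for this post, not an existing tool.

```python
import os


def install_public_key(pubkey_text, ssh_dir):
    """Append pubkey_text to ssh_dir/authorized_keys unless already present."""
    key = pubkey_text.strip()
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    # With StrictModes (the sshd default) loose permissions make sshd
    # ignore the keys, so tighten the directory explicitly.
    os.chmod(ssh_dir, 0o700)
    auth_path = os.path.join(ssh_dir, "authorized_keys")

    content = ""
    if os.path.exists(auth_path):
        with open(auth_path) as handle:
            content = handle.read()

    if key not in [line.strip() for line in content.splitlines()]:
        with open(auth_path, "a") as handle:
            # Keep keys on separate lines even if the file lacks a final newline.
            if content and not content.endswith("\n"):
                handle.write("\n")
            handle.write(key + "\n")

    os.chmod(auth_path, 0o600)
    return auth_path
```

Run as nova on 114, a call such as install_public_key(open("id_rsa.pub").read(), "/var/lib/nova/.ssh") leaves the directory and file owned by nova, which avoids the ownership problem described further down.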

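Then, as the nova user, run: ssh 192.168.10.114

It fails with a message saying the account is currently not available.

Log in to 114 and check:

The nova user on 114 is barred from logging in as well.

Enable login for it (give it a login shell in /etc/passwd, as was done on 117):

Running the command on 117 again now logs in successfully: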

Note that the .ssh directory must be owned by user and group nova:nova; otherwise passwordless login fails.
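A quick way to verify that ownership is to list everything in .ssh that does not belong to nova:nova. Below is a minimal Python sketch; owner_of and wrongly_owned are names invented for this example.

```python
import grp
import os
import pwd


def owner_of(path):
    """Return (user, group) owning path, falling back to numeric ids."""
    info = os.stat(path)
    try:
        user = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        user = str(info.st_uid)
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)
    return user, group


def wrongly_owned(ssh_dir, user="nova", group="nova"):
    """List ssh_dir and the entries inside it not owned by user:group."""
    paths = [ssh_dir]
    paths += [os.path.join(ssh_dir, name) for name in sorted(os.listdir(ssh_dir))]
    return [path for path in paths if owner_of(path) != (user, group)]
```

On each node, wrongly_owned("/var/lib/nova/.ssh") should return an empty list. Anything it does return can be handed back to nova with chown -R nova:nova /var/lib/nova/.ssh.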

Verify resize once more:

Success!

To automate the passwordless login setup with a script, the following can serve as a reference:

vim auto_ssh.sh

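#!/usr/bin/expect
# Usage: ./auto_ssh.sh <username> <password> <hostname>
set timeout 10
set username [lindex $argv 0]
set password [lindex $argv 1]
set hostname [lindex $argv 2]

spawn ssh-copy-id -i /root/.ssh/id_rsa.pub $username@$hostname

# Comments are kept outside the expect block: between pattern/action
# pairs, Expect would read them as patterns rather than comments.
# Branch 1: first connection, host not yet in ~/.ssh/known_hosts.
# Branch 2: host already in ~/.ssh/known_hosts, only the password is asked.
# Branch 3: the key is already authorized.
expect {
    "Are you sure you want to continue connecting" {
        send "yes\r"
        expect "password:"
        send "$password\r"
    }
    "password:" {
        send "$password\r"
    }
    "Now try logging into the machine" {
        # already authorized, nothing to do
    }
}
expect eof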

chmod 755 auto_ssh.sh

Then run the following command:

./auto_ssh.sh root 123456 192.168.10.162

Test result:

Check on machine 162:

Success!

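Note that /etc/nova/nova.conf has two options related to resize:

The first, resize_confirm_window, means that a resize left unconfirmed for N seconds is confirmed automatically.

The second, when set to true, allows an instance to be resized onto the same host it already runs on: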

# Allow destination machine to match source for resize. Useful

# when testing in single-host environments. (boolean value)

#allow_resize_to_same_host=false
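To see which values a node actually uses, the two resize options can be read out of nova.conf. The sketch below uses Python 3's configparser and makes some assumptions: both options sit in the [DEFAULT] section, the auto-confirm option is named resize_confirm_window (the name is not shown in the text above), and an unset option falls back to 0 or false. The sample text stands in for a real file.

```python
import configparser

# Stand-in for /etc/nova/nova.conf; the values are illustrative only.
SAMPLE = """\
[DEFAULT]
# Automatically confirm resizes after N seconds. 0 disables it.
resize_confirm_window=0
# Allow destination machine to match source for resize.
allow_resize_to_same_host=true
"""


def resize_settings(conf_text):
    """Return the two resize-related options from nova.conf text."""
    # interpolation=None keeps literal '%' characters in other options
    # from raising errors; strict=False tolerates duplicated keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        "resize_confirm_window": parser.getint(
            "DEFAULT", "resize_confirm_window", fallback=0),
        "allow_resize_to_same_host": parser.getboolean(
            "DEFAULT", "allow_resize_to_same_host", fallback=False),
    }


print(resize_settings(SAMPLE))
# prints: {'resize_confirm_window': 0, 'allow_resize_to_same_host': True}
```

For a real node, pass the file contents instead, for example resize_settings(open("/etc/nova/nova.conf").read()).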

Next, test resize and confirm resize:

The test passes.

After a resize, test reverting it:

Fixing the bug where the rbd disk image cannot be found after revert_resize

</features>

</clock>

<on_poweroff>destroy</on_poweroff>

<on_reboot>restart</on_reboot>

<on_crash>destroy</on_crash>

<emulator>/usr/bin/qemu-system-x86_64</emulator>

</auth>

</source>

</disk>

Tracing through the code shows:

vim /usr/lib/python2.6/site-packages/nova/compute/manager.py

@wrap_exception()
@reverts_task_state
@wrap_instance_event
@wrap_instance_fault
def revert_resize(self, context, instance, migration, reservations):
    """Destroys the new instance on the destination machine.

    Reverts the model changes, and powers on the old instance on the
    source machine.

    """

    quotas = quotas_obj.Quotas.from_reservations(context,
                                                 reservations,
                                                 instance=instance)

    # NOTE(comstud): A revert_resize is essentially a resize back to
    # the old size, so we need to send a usage event here.
    self.conductor_api.notify_usage_exists(
            context, instance, current_period=True)

    with self._error_out_instance_on_exception(context, instance,
                                               quotas=quotas):
        # NOTE(tr3buchet): tear down networks on destination host
        self.network_api.setup_networks_on_host(context, instance,
                                                teardown=True)

        instance_p = obj_base.obj_to_primitive(instance)
        migration_p = obj_base.obj_to_primitive(migration)
        self.network_api.migrate_instance_start(context,
                                                instance_p,
                                                migration_p)

        network_info = self._get_instance_nw_info(context, instance)
        bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
                context, instance.uuid)
        block_device_info = self._get_instance_block_device_info(
                context, instance, bdms=bdms)

        # Author's note: this call deletes the VM that was resized from
        # A to B outright, and with it the shared rbd disk image. The
        # revert then fails because the rbd image cannot be found.
        self.driver.destroy(context, instance, network_info,
                            block_device_info)

        self._terminate_volume_connections(context, instance, bdms)

        migration.status = 'reverted'
        migration.save(context.elevated())

        rt = self._get_resource_tracker(instance.node)
        rt.drop_resize_claim(context, instance)

        self.compute_rpcapi.finish_revert_resize(context, instance,
                migration, migration.source_compute,
                quotas.reservations)

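Below is the code path traced from the revert_resize WSGI router entry point down to the lower layers:

The fix: when the backend is rbd shared storage, do not delete the rbd image.

vim +1070 /usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py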

if destroy_disks:
    self._delete_instance_files(instance)

    self._cleanup_lvm(instance)
    #NOTE(haomai): destroy volumes if needed
    if CONF.libvirt.images_type == 'rbd':
        # edit by ttx: don't delete the rbd image when revert_resize runs.
        # Note that this skips rbd cleanup on every destroy that reaches
        # this code, not only on reverts.
        #self._cleanup_rbd(instance)
        pass

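Continue testing resize and revert resize:

VM state at the moment shown in the screenshot above:

Check that the VM works after the revert (the revert process does in fact restart the VM):

The bug fixes for resize, confirm resize, and revert resize are done, and testing is complete.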

0 0