Notes on QEMU (6): QEMU source analysis, the vcpu



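As mentioned in an earlier post, I created a new directory under the root of the QEMU source tree just for reading the source and debugging.

OK, let's open that directory: qemu/bin/debug/native

If you read the earlier post, Notes on QEMU (4): downloading and building the QEMU source, plus fdt, you know that I built QEMU in this directory. The folder started out empty and is now stuffed with files: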

[root@localhost qemu]# ls bin/debug/native/
  chardev                  hmp.d           pc-bios              qemu-bridge-helper    qmp-commands.h    tests
accel.d            config-all-devices.mak   hmp.o           po                   qemu-bridge-helper.d  qmp.d             tpm.d
accel.o            config-all-disas.mak     hw              qapi                 qemu-bridge-helper.o  qmp-introspect.c  tpm.o
audio              config-host.h            io              qapi-event.c         qemu-ga               qmp-introspect.d  trace
backends           config-host.h-timestamp  iothread.d      qapi-event.d         qemu-img              qmp-introspect.h  trace-events-all
block              config-host.mak          iothread.o      qapi-event.h         qemu-img-cmds.h       qmp-introspect.o  trace-root.c
block.d            config.log               ivshmem-client  qapi-event.o         qemu-img.d            qmp-marshal.c     trace-root.c-timestamp
blockdev.d         config.status            ivshmem-server  qapi-generated       qemu-img.o            qmp-marshal.d     trace-root.d
blockdev-nbd.d     contrib                  libqemustub.a   qapi-types.c         qemu-io               qmp-marshal.o     trace-root.h
blockdev-nbd.o     cpus-common.d            libqemuutil.a   qapi-types.d         qemu-io-cmds.d        qmp.o             trace-root.h-timestamp
blockdev.o         cpus-common.o            linux-headers   qapi-types.h         qemu-io-cmds.o        qobject           trace-root.o
blockjob.d         crypto                   linux-user      qapi-types.o         qemu-io.d             qom               ui
blockjob.o         device-hotplug.d         Makefile        qapi-visit.c         qemu-io.o             replay            util
block.o            device-hotplug.o         migration       qapi-visit.d         qemu-nbd              replication.d     vl.d
     disas                    module_block.h  qapi-visit.h         qemu-nbd.d            replication.o     vl.o
bt-host.d          dma-helpers.d            nbd             qapi-visit.o         qemu-nbd.o            roms              x86_64-softmmu
bt-host.o          dma-helpers.o            net             qdev-monitor.d       qemu-options.def      slirp             x86_64-softmmu-config-devices.mak.d
bt-vhci.d          docs                     os-posix.d      qdev-monitor.o       qemu-version.h        stubs
bt-vhci.o          fsdev                    os-posix.o      qdict-test-data.txt  qga                   target
[root@localhost qemu]#

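The executable we built is qemu-system-x86_64, under x86_64-softmmu in this directory. Right, time to fire up gdb!

The first thing to look at is the main function in vl.c, which starts at line 2971:

2968     return 0;
2969 }
2970
2971 int main(int argc, char **argv, char **envp)
2972 {
2973     int i;
2974     int snapshot, linux_boot;
2975     const char *initrd_filename;
2976     const char *kernel_filename, *kernel_cmdline;
2977     const char *boot_order = NULL;
2978     const char *boot_once = NULL;
2979     DisplayState *ds;
2980     int cyls, heads, secs, translation;
2981     QemuOpts *opts, *machine_opts;
2982     QemuOpts *hda_opts = NULL, *icount_opts = NULL, *accel_opts = NULL;
2983     QemuOptsList *olist;
... ...

What is the first thing to do when debugging with gdb?

Set a breakpoint, of course!

Where to put it takes some thought: well-placed breakpoints make debugging much faster. Enough chatter; where should our first breakpoint go?

Skimming through main in vl.c, a large chunk at the front just initializes variables, arrays and structs, registers some functions, and parses arguments. Only at line 4082 is the first pass of argument parsing done. It is only a first pass because options with sub-options have not been fully parsed yet, or rather have not been processed any further. A simple example is the parsing and handling of the smp options later on, at line 4201: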

4201     smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
4202
4203     machine_class->max_cpus = machine_class->max_cpus ?: 1; /* Default to UP */
4204     if (max_cpus > machine_class->max_cpus) {
4205         error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
4206                      "supported by machine '%s' (%d)", max_cpus,
4207                      machine_class->name, machine_class->max_cpus);
4208         exit(1);
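
To make the check at lines 4203 to 4208 easier to see in isolation, here is a minimal sketch of the same idea. The struct and function names (machine_class_sketch, check_smp_request) are made up for illustration and are not QEMU code; the `?:` with an omitted middle operand in the original is a GNU C extension, spelled out portably here.

```c
#include <stdio.h>

/* Hypothetical stand-in for the relevant MachineClass fields. */
struct machine_class_sketch {
    const char *name;
    int max_cpus;   /* 0 means "not set by the board" */
};

/* Returns 0 if the requested CPU count fits, -1 otherwise. */
int check_smp_request(struct machine_class_sketch *mc, int requested)
{
    /* Portable spelling of GNU C's `mc->max_cpus ?: 1`. */
    mc->max_cpus = mc->max_cpus ? mc->max_cpus : 1; /* default to uniprocessor */

    if (requested > mc->max_cpus) {
        fprintf(stderr, "Number of SMP CPUs requested (%d) exceeds max CPUs "
                "supported by machine '%s' (%d)\n",
                requested, mc->name, mc->max_cpus);
        return -1;
    }
    return 0;
}
```

The real code calls exit(1) instead of returning an error; the sketch returns so that it can be called repeatedly.
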

Let's ignore that for now, go back to line 4082 and read on. What follows is initialization of VM parameters plus registration and checking of devices and files, with nothing that looks like vcpu creation. No hurry; keep reading past the smp check at line 4201. Nothing promising there either, just more checks and settings: display, daemonize, serial ports, cdrom and other odds and ends. None of that is what we care about; our focus is the vcpu.

Further down, around line 4400, there is code related to booting the guest OS:

4399     machine_opts = qemu_get_machine_opts();
4400     kernel_filename = qemu_opt_get(machine_opts, "kernel");
4401     initrd_filename = qemu_opt_get(machine_opts, "initrd");
4402     kernel_cmdline = qemu_opt_get(machine_opts, "append");
4403     bios_name = qemu_opt_get(machine_opts, "firmware");
4404
4405     opts = qemu_opts_find(qemu_find_opts("boot-opts"), NULL);
4406     if (opts) {
4407         boot_order = qemu_opt_get(opts, "order");
4408         if (boot_order) {
4409             validate_bootdevices(boot_order, &error_fatal);
4410         }
... ....

That means we are getting close. Keep going. Around line 4587 the setup for the guest is basically complete and the machine is about to be created; the lines after that initialize hardware devices:

4587     current_machine->ram_size = ram_size;
4588     current_machine->maxram_size = maxram_size;
4589     current_machine->ram_slots = ram_slots;
4590     current_machine->boot_order = boot_order;
4591     current_machine->cpu_model = cpu_model;
4592
4593     machine_run_board_init(current_machine);
4594
4595     realtime_init();
4596
... ...

And then at line 4701:

4700
4701     qdev_machine_creation_done();

Since machine creation is declared done at line 4701, we can be fairly sure that vcpu creation and initialization happen inside the call at line 4593. Set the breakpoint there, and gdb is good to go.

Now run a very simple command line inside gdb:

r  --enable-kvm -smp 2 -m 2048M -cpu host -hda /root/test/rhel7.qcow -monitor stdio
Then continue with c until the breakpoint is hit:

4593     machine_run_board_init(current_machine);

Stepping in, this is what we see:

736 void machine_run_board_init(MachineState *machine)
737 {
738     MachineClass *machine_class = MACHINE_GET_CLASS(machine);
739
740     if (nb_numa_nodes) {
741         machine_numa_validate(machine);
742     }
743     machine_class->init(machine);
744 }

Then we step further into
machine_class->init(machine);
and see:

(gdb) s
pc_init_v2_10 (machine=0x555556720000) at /root/qemu-2017-0531/qemu/hw/i386/pc_piix.c:449
449     DEFINE_I440FX_MACHINE(v2_10, "pc-i440fx-2.10", NULL,
(gdb) l
444         m->alias = "pc";
445         m->is_default = 1;
446         m->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
447     }
448
449     DEFINE_I440FX_MACHINE(v2_10, "pc-i440fx-2.10", NULL,
450                           pc_i440fx_2_10_machine_options);
451

Let's look at what the macro at line 449 actually is:

419
420 #define DEFINE_I440FX_MACHINE(suffix, name, compatfn, optionfn) \
421     static void pc_init_##suffix(MachineState *machine) \
422     { \
423         void (*compat)(MachineState *m) = (compatfn); \
424         if (compat) { \
425             compat(machine); \
426         } \
427         pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
428                  TYPE_I440FX_PCI_DEVICE); \
429     } \
430     DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
431
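
The trick in this macro is the `##` token-pasting operator: every use of the macro stamps out a new function whose name ends in the given suffix, which is why gdb reports pc_init_v2_10 even though no such function is written out in the source. Here is a small standalone sketch of the same pattern. All names in it (machine_sketch, board_init_common, DEFINE_BOARD, compat_v2_9) are invented for illustration and are not QEMU's.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MachineState; only what the sketch needs. */
struct machine_sketch {
    char log[64];
};

/* Shared init routine, playing the role of pc_init1(). */
static void board_init_common(struct machine_sketch *m, const char *host_bridge)
{
    strncat(m->log, host_bridge, sizeof(m->log) - strlen(m->log) - 1);
}

/*
 * `##` pastes the suffix onto the function name, so each use of the macro
 * defines a new function such as board_init_v2_10().
 */
#define DEFINE_BOARD(suffix, compatfn)                          \
    void board_init_##suffix(struct machine_sketch *m)          \
    {                                                           \
        void (*compat)(struct machine_sketch *) = (compatfn);   \
        if (compat) {                                           \
            compat(m);                                          \
        }                                                       \
        board_init_common(m, "i440FX");                         \
    }

/* Optional compatibility hook that runs before the common init. */
static void compat_v2_9(struct machine_sketch *m)
{
    strncat(m->log, "compat-2.9+", sizeof(m->log) - strlen(m->log) - 1);
}

DEFINE_BOARD(v2_10, NULL)          /* expands to board_init_v2_10() */
DEFINE_BOARD(v2_9, compat_v2_9)    /* expands to board_init_v2_9()  */
```

Calling board_init_v2_10() on a zeroed machine_sketch leaves "i440FX" in its log, while board_init_v2_9() runs the compat hook first.
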

Clearly what ends up running is pc_init1. Step into it, and at line 150 there is a call to pc_cpus_init. Stepping into that shows:

1137
1138 void pc_cpus_init(PCMachineState *pcms)
1139 {
1140     int i;
1141     CPUClass *cc;
1142     ObjectClass *oc;
1143     const char *typename;
1144     gchar **model_pieces;
1145     const CPUArchIdList *possible_cpus;
1146     MachineState *machine = MACHINE(pcms);
1147     MachineClass *mc = MACHINE_GET_CLASS(pcms);
1148
1149     /* init CPUs */
1150     if (machine->cpu_model == NULL) {
1151 #ifdef TARGET_X86_64
1152         machine->cpu_model = "qemu64";
... ... ......
1182     possible_cpus = mc->possible_cpu_arch_ids(machine);
1183     for (i = 0; i < smp_cpus; i++) {
1184         pc_new_cpu(typename, possible_cpus->cpus[i].arch_id, &error_fatal);
1185     }

What we are looking for is inside pc_new_cpu. Everything before it deals with vcpu configuration, such as the model and the type. Stepping in with gdb at line 1184, we see:

pc_new_cpu (typename=0x555556699690 "qemu64-x86_64-cpu", apic_id=0, errp=0x555556683790 <error_fatal>) at /root/qemu-2017-0531/qemu/hw/i386/pc.c:1097

The body of pc_new_cpu looks like this:

1096    static void pc_new_cpu(const char *typename, int64_t apic_id, Error **errp)
1097    {
1098        Object *cpu = NULL;
1099        Error *local_err = NULL;
1100
1101        cpu = object_new(typename);
(gdb)
1102
1103        object_property_set_int(cpu, apic_id, "apic-id", &local_err);
1104        object_property_set_bool(cpu, true, "realized", &local_err);
1105
1106        object_unref(cpu);
1107        error_propagate(errp, local_err);
1108    }
1109
Single-stepping on, nothing visible changes after line 1103 runs, but after line 1104 a new thread appears. QEMU itself is just a userspace program, and it talks to KVM by issuing ioctl calls (through wrappers such as kvm_ioctl) on /dev/kvm and the file descriptors derived from it. So a VM started by QEMU is really just a process, and each vcpu is a thread inside that process.
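
For readers who have never touched that interface, here is a rough sketch of what "talking to /dev/kvm with ioctl" looks like from any userspace program. It is not QEMU code; the two function names are made up, and it assumes a Linux host with the kernel headers installed. KVM_GET_API_VERSION, KVM_CREATE_VM and KVM_CREATE_VCPU are the ioctl requests defined in <linux/kvm.h>.

```c
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Returns the KVM API version, or -1 if /dev/kvm cannot be opened
 * (no KVM support, or insufficient permissions) or the ioctl fails.
 */
int kvm_probe_api_version(void)
{
    int kvm_fd = open("/dev/kvm", O_RDWR);
    if (kvm_fd < 0) {
        return -1;
    }
    int version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0UL);
    close(kvm_fd);
    return version;
}

/*
 * Creates a throwaway VM with a single vcpu and tears it down again.
 * Returns 0 on success, -1 on any failure.
 */
int kvm_create_one_vcpu(void)
{
    int rc = -1;
    int kvm_fd = open("/dev/kvm", O_RDWR);
    if (kvm_fd < 0) {
        return -1;
    }
    /* System ioctl on /dev/kvm: yields a file descriptor for a new VM. */
    int vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0UL);
    if (vm_fd >= 0) {
        /* VM ioctl: yields a file descriptor for vcpu number 0. */
        int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0UL);
        if (vcpu_fd >= 0) {
            rc = 0;
            close(vcpu_fd);
        }
        close(vm_fd);
    }
    close(kvm_fd);
    return rc;
}
```

Note the layering: one descriptor for the KVM subsystem, one per VM, one per vcpu. The thread that later runs a vcpu does so by issuing ioctls on that vcpu descriptor.
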

So it is reasonable to conclude that the vcpu is created and initialized at line 1104. Step in with gdb:

object_property_set_bool (obj=0x555556779580, value=true, name=0x555555c453d0 "realized", errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/qom/object.c:1162
1162        QBool *qbool = qbool_from_bool(value);
1158
1159    void object_property_set_bool(Object *obj, bool value,
1160                                  const char *name, Error **errp)
1161    {
1162        QBool *qbool = qbool_from_bool(value);
1163        object_property_set_qobject(obj, QOBJECT(qbool), name, errp);
1164
1165        QDECREF(qbool);
1166    }
Step into line 1163:

object_property_set_qobject (obj=0x555556779580, value=0x555556794a10, name=0x555555c453d0 "realized", errp=0x7fffffffdd68)
    at /root/qemu-2017-0531/qemu/qom/qom-qobject.c:26
26          v = qobject_input_visitor_new(value);
(gdb) l
21      void object_property_set_qobject(Object *obj, QObject *value,
22                                       const char *name, Error **errp)
23      {
24          Visitor *v;
25
26          v = qobject_input_visitor_new(value);
27          object_property_set(obj, v, name, errp);
28          visit_free(v);
29      }
On to line 27:

object_property_set (obj=0x555556779580, v=0x555556796150, name=0x555555c453d0 "realized", errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/qom/object.c:1086
1086        ObjectProperty *prop = object_property_find(obj, name, errp);
1083    void object_property_set(Object *obj, Visitor *v, const char *name,
1084                             Error **errp)
1085    {
1086        ObjectProperty *prop = object_property_find(obj, name, errp);
1087        if (prop == NULL) {
1088            return;
1089        }
1090
(gdb)
1091        if (!prop->set) {
1092            error_setg(errp, QERR_PERMISSION_DENIED);
1093        } else {
1094            prop->set(obj, v, name, prop->opaque, errp);
1095        }
1096    }
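
The pattern in object_property_set is: look the property up by name, bail out if it does not exist, refuse if it has no setter, otherwise call the setter through a function pointer. Below is a stripped-down sketch of that pattern. The types and names (object_sketch, property_sketch, object_set_bool_sketch) are my own simplifications; the real code also goes through a Visitor and an Error object, which are left out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct object_sketch;

/* One named property with an optional setter, loosely like ObjectProperty. */
struct property_sketch {
    const char *name;
    int (*set)(struct object_sketch *obj, bool value);
};

struct object_sketch {
    bool realized;
    const struct property_sketch *props;
    size_t nprops;
};

static int set_realized(struct object_sketch *obj, bool value)
{
    obj->realized = value;   /* a real setter would run the realize hook here */
    return 0;
}

/* Returns 0 on success, -1 if the property is unknown, -2 if it is read-only. */
int object_set_bool_sketch(struct object_sketch *obj, const char *name,
                           bool value)
{
    for (size_t i = 0; i < obj->nprops; i++) {
        if (strcmp(obj->props[i].name, name) == 0) {
            if (!obj->props[i].set) {
                return -2;        /* mirrors the "permission denied" branch */
            }
            return obj->props[i].set(obj, value);
        }
    }
    return -1;                    /* mirrors the prop == NULL branch */
}

/* Property table shared by all objects of this hypothetical type. */
const struct property_sketch cpu_props_sketch[] = {
    { "realized", set_realized },
    { "type",     NULL },         /* read-only: no setter installed */
};
```

Setting "realized" on an object that uses cpu_props_sketch flips its flag, while setting "type" is refused because no setter is installed.
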
Then on to line 1094:

property_set_bool (obj=0x555556779580, v=0x555556796150, name=0x555555c453d0 "realized", opaque=0x55555673e240, errp=0x7fffffffdd68)
    at /root/qemu-2017-0531/qemu/qom/object.c:1849
1849    {
(gdb) l
1844        visit_type_bool(v, name, &value, errp);
1845    }
1846
1847    static void property_set_bool(Object *obj, Visitor *v, const char *name,
1848                                  void *opaque, Error **errp)
1849    {
1850        BoolProperty *prop = opaque;
1851        bool value;
1852        Error *local_err = NULL;
1853
(gdb)
1854        visit_type_bool(v, name, &value, &local_err);
1855        if (local_err) {
1856            error_propagate(errp, local_err);
1857            return;
1858        }
1859
1860        prop->set(obj, value, errp);
1861    }
On to line 1860:

device_set_realized (obj=0x555556779580, value=true, errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/hw/core/qdev.c:879
879     {
(gdb) l
874
875         return true;
876     }
877
878     static void device_set_realized(Object *obj, bool value, Error **errp)
879     {
880         DeviceState *dev = DEVICE(obj);
881         DeviceClass *dc = DEVICE_GET_CLASS(dev);
882         HotplugHandler *hotplug_ctrl;
883         BusState *bus;
(gdb)
Keep stepping down to line 917:

915
916             if (dc->realize) {
917                 dc->realize(dev, &local_err);
918             }
919
920             if (local_err != NULL) {
Step into line 917:

x86_cpu_realizefn (dev=0x555556779580, errp=0x7fffffffdbb0) at /root/qemu-2017-0531/qemu/target/i386/cpu.c:3487
3487    {
(gdb) l
3482                               (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3)
3483    #define IS_AMD_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \
3484                             (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \
3485                             (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3)
3486    static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
3487    {
3488        CPUState *cs = CPU(dev);
3489        X86CPU *cpu = X86_CPU(dev);
3490        X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
3491        CPUX86State *env = &cpu->env;
Now we see a function, x86_cpu_realizefn. Inside it, at line 3648, how QEMU creates a vcpu finally shows its true face:

3648        qemu_init_vcpu(cs);
(gdb) s
qemu_init_vcpu (cpu=0x555556779580) at /root/qemu-2017-0531/qemu/cpus.c:1750
1750        cpu->nr_cores = smp_cores;
1748    void qemu_init_vcpu(CPUState *cpu)
1749    {
1750        cpu->nr_cores = smp_cores;
1751        cpu->nr_threads = smp_threads;
1752        cpu->stopped = true;
1753
1754        if (!cpu->as) {
(gdb)
1755            /* If the target cpu hasn't set up any address spaces itself,
1756             * give it the default one.
1757             */
1758            AddressSpace *as = address_space_init_shareable(cpu->memory,
1759                                                            "cpu-memory");
1760            cpu->num_ases = 1;
1761            cpu_address_space_init(cpu, as, 0);
1762        }
1763
1764        if (kvm_enabled()) {
(gdb)
1765            qemu_kvm_start_vcpu(cpu);
1766        } else if (hax_enabled()) {
1767            qemu_hax_start_vcpu(cpu);
1768        } else if (tcg_enabled()) {
1769            qemu_tcg_init_vcpu(cpu);
1770        } else {
1771            qemu_dummy_start_vcpu(cpu);
1772        }
1773    }
vcpu creation starts at line 1764. With KVM enabled, qemu_kvm_start_vcpu is called at line 1765, so let's look at that function:

qemu_kvm_start_vcpu (cpu=0x555556779580) at /root/qemu-2017-0531/qemu/cpus.c:1717
1715
1716    static void qemu_kvm_start_vcpu(CPUState *cpu)
1717    {
1718        char thread_name[VCPU_THREAD_NAME_SIZE];
1719
1720        cpu->thread = g_malloc0(sizeof(QemuThread));
1721        cpu->halt_cond = g_malloc0(sizeof(QemuCond));
(gdb)
1722        qemu_cond_init(cpu->halt_cond);
1723        snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
1724                 cpu->cpu_index);
1725        qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
1726                           cpu, QEMU_THREAD_JOINABLE);
1727        while (!cpu->created) {
1728            qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
1729        }
1730    }
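
The loop at lines 1727 to 1729 is a classic condition-variable handshake: the creator sleeps until the new thread reports that it is up. Here is a minimal sketch of that handshake with plain pthreads. It is only an illustration under my own naming (vcpu_sketch, big_lock, cpu_cond and the functions are all invented), with big_lock and cpu_cond standing in for qemu_global_mutex and qemu_cpu_cond.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few CPUState fields involved. */
struct vcpu_sketch {
    pthread_t thread;
    int index;
    bool created;
};

/* These play the roles of qemu_global_mutex and qemu_cpu_cond. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cpu_cond = PTHREAD_COND_INITIALIZER;

static void *vcpu_thread_fn(void *arg)
{
    struct vcpu_sketch *cpu = arg;

    pthread_mutex_lock(&big_lock);
    /* ... per-vcpu setup would happen here ... */
    cpu->created = true;
    pthread_cond_broadcast(&cpu_cond);   /* wake up the creator */
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

/* Caller must already hold big_lock. Returns 0 on success. */
int start_vcpu_sketch(struct vcpu_sketch *cpu)
{
    int err = pthread_create(&cpu->thread, NULL, vcpu_thread_fn, cpu);
    if (err) {
        return err;
    }
    while (!cpu->created) {
        /* Atomically drops big_lock while sleeping, retakes it on wakeup. */
        pthread_cond_wait(&cpu_cond, &big_lock);
    }
    return 0;
}

/* Driver: start up to n vcpus one by one, then join them. Returns the count. */
int start_all_vcpus_sketch(struct vcpu_sketch *cpus, int n)
{
    int created = 0;

    pthread_mutex_lock(&big_lock);
    for (int i = 0; i < n; i++) {
        cpus[i].index = i;
        cpus[i].created = false;
        if (start_vcpu_sketch(&cpus[i]) != 0) {
            break;
        }
        created++;
    }
    pthread_mutex_unlock(&big_lock);

    for (int i = 0; i < created; i++) {
        pthread_join(cpus[i].thread, NULL);
    }
    return created;
}
```

Because the creator waits for each thread before starting the next, the vcpus come up strictly one after another, which matches what single-stepping in gdb shows.
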
There, now it is clear: a vcpu is just a thread. Let's step into qemu_thread_create at line 1725:

qemu_thread_create (thread=0x5555567a2210, name=0x7fffffffdaa0 "CPU 0/KVM", start_routine=0x555555791756 <qemu_kvm_cpu_thread_fn>, arg=0x555556779580, mode=0)
    at /root/qemu-2017-0531/qemu/util/qemu-thread-posix.c:468
465     void qemu_thread_create(QemuThread *thread, const char *name,
466                            void *(*start_routine)(void*),
467                            void *arg, int mode)
468     {
469         sigset_t set, oldset;
470         int err;
471         pthread_attr_t attr;
472
473         err = pthread_attr_init(&attr);
474         if (err) {
475             error_exit(err, __func__);
476         }
477
478         /* Leave signal handling to the iothread.  */
479         sigfillset(&set);
480         pthread_sigmask(SIG_SETMASK, &set, &oldset);
481         err = pthread_create(&thread->thread, &attr, start_routine, arg);
482         if (err)
(gdb)
483             error_exit(err, __func__);
484
485         if (name_threads) {
486             qemu_thread_set_name(thread, name);
487         }
488
489         if (mode == QEMU_THREAD_DETACHED) {
490             err = pthread_detach(thread->thread);
491             if (err) {
492                 error_exit(err, __func__);
(gdb)
493             }
494         }
495         pthread_sigmask(SIG_SETMASK, &oldset, NULL);
496
497         pthread_attr_destroy(&attr);
498     }
Next, look at qemu_kvm_cpu_thread_fn. The kvm_init_vcpu call inside it is the part that, with KVM enabled, is ultimately carried out by KVM:

1092 static void *qemu_kvm_cpu_thread_fn(void *arg)
1093 {
1094     CPUState *cpu = arg;
1095     int r;
1096
1097     rcu_register_thread();
1098
1099     qemu_mutex_lock_iothread();
1100     qemu_thread_get_self(cpu->thread);
1101     cpu->thread_id = qemu_get_thread_id();
1102     cpu->can_do_io = 1;
1103     current_cpu = cpu;
1104
1105     r = kvm_init_vcpu(cpu);
1106     if (r < 0) {
1107         fprintf(stderr, "kvm_init_vcpu failed: %s\n", strerror(-r));
1108         exit(1);
1109     }
1110
1111     kvm_init_cpu_signals(cpu);
I won't expand kvm_init_vcpu here.

Then we let the program run to the end and see:

[New Thread 0x7fffeffff700 (LWP 16755)]
Continuing.
[New Thread 0x7fffeffff700 (LWP 16756)]
[New Thread 0x7fffee1ff700 (LWP 16758)]
[New Thread 0x7fffed9fe700 (LWP 16759)]
VNC server running on ::1:5900
(qemu) info cpus
* CPU #0: pc=0x00000000000082ea thread_id=16555
  CPU #1: pc=0x00000000000fd406 (halted) thread_id=16756
(qemu) [Thread 0x7fffee1ff700 (LWP 16758) exited]

This shows that the two vcpu threads we created have thread IDs 16555 and 16756. Let's check with ps and pstree:

[root@localhost ~]# ps -ef | grep qemu
root     13695 13557  0 Jun07 pts/1    00:00:10 gdb x86_64-softmmu/qemu-system-x86_64
root     15616 13695  0 00:31 pts/1    00:00:16 /root/qemu/bin/debug/native/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 2 -m 2048M -hda /root/test/rhel7_cpu2006.qcow -monitor stdio
root     16779 14422  0 01:25 pts/5    00:00:00 grep --color=auto qemu
[root@localhost ~]# pstree -p 15616
qemu-system-x86(15616)─┬─{qemu-system-x86}(15617)
                       ├─{qemu-system-x86}(16555)
                       ├─{qemu-system-x86}(16756)
                       └─{qemu-system-x86}(16759)
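
The numbers in braces in the pstree output are kernel thread IDs (the LWP numbers gdb prints), all living under one process ID. The sketch below shows the same relationship in a tiny program: a worker thread sees the same getpid() as its creator but has its own thread ID. The names are invented, and using the raw gettid system call is my assumption about how a thread ID like the one in "info cpus" is typically obtained on Linux.

```c
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread (the LWP number shown by gdb). */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

struct tid_report {
    pid_t pid_seen;   /* getpid() as seen from the worker */
    pid_t tid_seen;   /* the worker's own kernel thread ID */
};

static void *report_fn(void *arg)
{
    struct tid_report *r = arg;

    r->pid_seen = getpid();
    r->tid_seen = current_tid();
    return NULL;
}

/* Spawns one worker thread and records its IDs. Returns 0 on success. */
int spawn_and_report(struct tid_report *out)
{
    pthread_t t;
    int err = pthread_create(&t, NULL, report_fn, out);

    if (err) {
        return err;
    }
    return pthread_join(t, NULL);
}
```

After spawn_and_report() returns, pid_seen equals the caller's getpid() while tid_seen is a different number, just as the vcpu threads above share PID 15616 but carry their own IDs.
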


That pretty much settles what a vcpu is: from the host's point of view it is a thread, a thread, and nothing but a thread, and a userspace thread at that.

To sum up, when QEMU starts a VM, the vcpu creation flow is:
main(...) ==> machine_run_board_init(current_machine) ==> pc_init_v2_10(...) ==> pc_init1(...) ==> pc_cpus_init(...) ==> pc_new_cpu(...) ==> object_property_set_bool(...) ==> object_property_set_qobject(...) ==> object_property_set(...) ==> property_set_bool ==> device_set_realized ==> x86_cpu_realizefn ==> qemu_init_vcpu ==> qemu_kvm_start_vcpu ==> qemu_thread_create ==> qemu_kvm_cpu_thread_fn ==> kvm_init_vcpu

In fact, the step in the chain above that lands in x86_cpu_realizefn is wired up in the code like this, much like a C++ constructor:
==> type_init(x86_cpu_register_types)
==> x86_cpu_register_types(void)
==> type_register_static(&x86_cpu_type_info);
==> static const TypeInfo x86_cpu_type_info = {}
==> .class_init = x86_cpu_common_class_init,
==> x86_cpu_common_class_init(ObjectClass *oc, void *data)
==> dc->realize = x86_cpu_realizefn;
==> x86_cpu_realizefn(DeviceState *dev, Error **errp)
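
To show the shape of this wiring without the whole QOM machinery, here is a heavily simplified sketch: a registration function that runs before main(), a class-init step that installs a realize hook, and a caller that dispatches through the hook. Everything in it is invented for illustration. In particular, using `__attribute__((constructor))` (a GCC/Clang extension) directly is my simplification; as far as I know QEMU's type_init builds on the same attribute, but the real registration and class initialization are more indirect than this.

```c
#include <stddef.h>
#include <string.h>

struct device_sketch;

/* Per-type "class" holding virtual methods, loosely like DeviceClass. */
struct device_class_sketch {
    const char *type_name;
    void (*realize)(struct device_sketch *dev);
};

struct device_sketch {
    const struct device_class_sketch *klass;
    int realized;
    int vcpu_started;
};

/* Tiny type registry, filled in before main() runs. */
static struct device_class_sketch registry[8];
static size_t registry_len;

static void cpu_realize_sketch(struct device_sketch *dev)
{
    dev->vcpu_started = 1;   /* where qemu_init_vcpu() would be called */
}

/* Plays the role of class_init: install the realize hook. */
static void cpu_class_init_sketch(struct device_class_sketch *dc)
{
    dc->realize = cpu_realize_sketch;
}

/* GCC/Clang extension: this function runs automatically before main(). */
__attribute__((constructor))
static void register_cpu_type_sketch(void)
{
    struct device_class_sketch *dc = &registry[registry_len++];

    dc->type_name = "demo-cpu";
    cpu_class_init_sketch(dc);
}

/* Looks a class up by name; returns NULL if it was never registered. */
const struct device_class_sketch *find_class_sketch(const char *name)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].type_name, name) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}

/* Rough equivalent of setting "realized" to true: dispatch via the class. */
int realize_device_sketch(struct device_sketch *dev)
{
    if (!dev->klass || !dev->klass->realize) {
        return -1;
    }
    dev->klass->realize(dev);
    dev->realized = 1;
    return 0;
}
```

The caller never names cpu_realize_sketch directly; it only knows the class, which is exactly why the call at line 917 of qdev.c reads dc->realize(...) and gdb is needed to see where it lands.
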


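Read the source carefully and you will find that the QEMU folks have painstakingly implemented a whole bunch of classes in plain C, along with their constructors and a pile of template-like machinery. All I want to say is: wouldn't it have been nicer to just use C++?

It is a long, laborious detour, and the code is awkward to read. If I find time later, I'll write about how KVM implements the vcpu.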


