[Knowing What but Not Why, 30] SMP: waking up non-boot CPUs


Essentially, waking up a processor is driven by smp_init -> cpu_up, and the most important work happens in do_boot_cpu. As its comment says, this function wakes up a CPU in one of two ways:

Use the method in the APIC driver if it's defined. Otherwise, use an INIT boot APIC message for APs or an NMI for the BSP.

    if (apic->wakeup_secondary_cpu)
            boot_error = apic->wakeup_secondary_cpu(apicid, start_ip);
    else
            boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
                                                 &cpu0_nmi_registered);

Here we mainly focus on the latter, wakeup_cpu_via_init_nmi.

This core function wakes up an AP with the INIT, INIT, STARTUP sequence. Executing the INIT, INIT, STARTUP sequence makes the target CPU jump into the BIOS bootstrap code, which is the normal behavior when waking up APs but is not desirable when waking up the BSP. To avoid the bootstrap code on CPU0, CPU0 is woken up by an NMI instead.
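The INIT, INIT, STARTUP sequence can be pictured as a series of values written to the local APIC interrupt command register (ICR). The following is a minimal user-space sketch, not the kernel code: the constant values mirror the kernel's APIC_DM_INIT, APIC_DM_STARTUP, APIC_INT_ASSERT and APIC_INT_LEVELTRIG as I understand them, while build_wakeup_sequence and the SKETCH_ names are hypothetical. The delays between the writes and the destination field are left out, and sending exactly two STARTUP IPIs is an assumption about the common case.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's apicdef.h values; the names are local to this sketch. */
#define SKETCH_DM_INIT       0x00500u  /* delivery mode: INIT */
#define SKETCH_DM_STARTUP    0x00600u  /* delivery mode: STARTUP (SIPI) */
#define SKETCH_INT_ASSERT    0x04000u  /* level: assert */
#define SKETCH_INT_LEVELTRIG 0x08000u  /* trigger mode: level */

/*
 * Fill out[] with the low ICR words for INIT assert, INIT de-assert and
 * two STARTUP IPIs. start_ip must be 4 KiB aligned and below 1 MiB,
 * because the SIPI carries only an 8-bit page number.
 * Returns the number of entries written, or 0 if start_ip cannot be encoded.
 */
size_t build_wakeup_sequence(uint32_t start_ip, uint32_t out[4])
{
    if ((start_ip & 0xFFFu) != 0 || start_ip >= 0x100000u)
        return 0;

    out[0] = SKETCH_INT_LEVELTRIG | SKETCH_INT_ASSERT | SKETCH_DM_INIT;
    out[1] = SKETCH_INT_LEVELTRIG | SKETCH_DM_INIT;
    out[2] = SKETCH_DM_STARTUP | (start_ip >> 12);
    out[3] = out[2];  /* second STARTUP IPI repeats the first */
    return 4;
}
```

The alignment check is the reason the trampoline has to live in page-aligned low memory: an address that does not fit the 8-bit page number simply cannot be named by a SIPI.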

The NMI handler is installed by trap_init:

set_intr_gate_ist(X86_TRAP_NMI, &nmi, NMI_STACK);

and nmi is defined in arch/x86/kernel/entry_64.S:

    ENTRY(nmi)
            ...
            call do_nmi
            ...
    END(nmi)

This wake-up NMI is used only for CPU0. When CPU0 is woken by the NMI, it simply returns to the instruction at which it was interrupted, because the handler registered for waking CPU0 does essentially nothing. What really matters is the code that runs after the NMI returns. The handler is registered like this:

    boot_error = register_nmi_handler(NMI_LOCAL,
                                      wakeup_cpu0_nmi, 0, "wake_cpu0");

For example, after it wakes up, mwait_play_dead checks whether it is running on CPU0 and, if so, goes through a special wake-up procedure (simplified):

    monitor()
    mwait()
    if (cpu0)
            wakeup_cpu0();
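The division of labour described above (the NMI handler only claims the interrupt, and the idle loop decides what to do once it is running again) can be sketched as follows. This is a simplified user-space model under stated assumptions, not kernel code: struct cpu_state, its start_requested flag and both functions are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

enum { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

/* Hypothetical per-CPU state for this sketch. */
struct cpu_state {
    int  id;
    bool online;
    bool start_requested;  /* set by the waker before it sends the NMI */
};

/* The handler only claims the NMI; it does not redirect execution. */
int sketch_wakeup_nmi(const struct cpu_state *cpu)
{
    if (cpu->id == 0 && !cpu->online && cpu->start_requested)
        return SKETCH_NMI_HANDLED;
    return SKETCH_NMI_DONE;
}

/*
 * Check made by the idle loop after mwait returns:
 * true means "leave the loop and restart this CPU".
 */
bool sketch_should_restart(const struct cpu_state *cpu)
{
    return cpu->id == 0 && cpu->start_requested;
}
```

Because the handler returns to the interrupted instruction, the restart decision has to live in the loop that was interrupted, which is why the check sits right after mwait.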

OK, I'm not interested in the NMI path for now, so let's look at how non-boot CPUs are woken up.

The INIT path of wakeup_cpu_via_init_nmi mainly sends a STARTUP IPI (SIPI) to the APs (non-boot CPUs). When an AP receives the SIPI, it starts executing in real mode at the address encoded in the SIPI parameter, which is a physical address. In our case that address is set up in do_boot_cpu in arch/x86/kernel/smpboot.c:

unsigned long start_ip = real_mode_header->trampoline_start;

initial_code = (unsigned long)start_secondary;
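To make the link between start_ip and the real-mode entry point concrete, here is a small sketch of how an 8-bit SIPI vector maps to the CS:IP pair and the physical address at which the AP begins executing. It follows the usual description of SIPI delivery (vector VV starts execution at physical address 0xVV000 with CS = 0xVV00 and IP = 0). struct sipi_entry and sipi_entry_for_vector are hypothetical names, not kernel symbols.

```c
#include <stdint.h>

/* Hypothetical helper type describing where an AP starts after a SIPI. */
struct sipi_entry {
    uint16_t cs;    /* real-mode code segment selector */
    uint16_t ip;    /* real-mode instruction pointer */
    uint32_t phys;  /* resulting physical address */
};

struct sipi_entry sipi_entry_for_vector(uint8_t vector)
{
    struct sipi_entry e;

    e.cs = (uint16_t)((uint16_t)vector << 8);
    e.ip = 0;
    /* Real-mode address translation: segment * 16 + offset. */
    e.phys = ((uint32_t)e.cs << 4) + e.ip;
    return e;
}
```

For a trampoline placed at physical address 0x9A000 (an arbitrary example value), the vector would be 0x9A and the AP would begin at 9A00:0000.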

The real_mode_header itself is filled in when the kernel is built. It is defined in arch/x86/realmode/rm/header.S:

            .section ".header", "a"

            .balign 16
    GLOBAL(real_mode_header)
            .long   pa_text_start
            .long   pa_ro_end
            /* SMP trampoline */
            .long   pa_trampoline_start
            .long   pa_trampoline_status
            .long   pa_trampoline_header

and trampoline_start is defined in arch/x86/realmode/rm/trampoline_64.S:

            .text
            .code16

            .balign PAGE_SIZE
    ENTRY(trampoline_start)
            cli                     # We should be safe anyway
            wbinvd

            # Setup stack
            # Enable protected mode
            # Enable paging and in turn activate Long Mode

            .balign 8
    GLOBAL(trampoline_header)
            tr_start:       .space  8

So this code sets up a stack for the CPU, enables protected mode and then long mode (not listed in the code above; please refer to the actual source), and finally jumps to the address stored in tr_start, the first 8 bytes of trampoline_header. So what is the content of those first 8 bytes?


During setup_arch, we have:

    void __init setup_real_mode(void)
    {
            /* ... */
            trampoline_header->start = (u64) secondary_startup_64;
            /* ... */
    }

and according to the structure of trampoline_header:

    struct trampoline_header {
    #ifdef CONFIG_X86_32
            u32 start;
            u16 gdt_pad;
            u16 gdt_limit;
            u32 gdt_base;
    #else
            u64 start;
            u64 efer;
            u32 cr4;
    #endif
    };

start is the first member, so on 64-bit the first 8 bytes hold the address of secondary_startup_64.
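The claim that the first 8 bytes of the header are the start field can be checked with a small user-space sketch. It reproduces only the 64-bit variant of the structure shown above, with fixed-width standard types in place of the kernel's u64/u32. trampoline_header_sketch and first_qword are hypothetical names, and any packing attributes the real structure may carry are ignored here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit variant of the structure, rewritten with standard types. */
struct trampoline_header_sketch {
    uint64_t start;
    uint64_t efer;
    uint32_t cr4;
};

/*
 * Read the first 8 bytes of a raw header image, which is effectively
 * what an indirect jump through the start of the header does.
 */
uint64_t first_qword(const void *header_image)
{
    uint64_t value;

    memcpy(&value, header_image, sizeof(value));
    return value;
}
```

Since offsetof(struct trampoline_header_sketch, start) is 0, whatever is stored in start is exactly what first_qword returns, so writing the address of the 64-bit entry point into start is enough to steer the trampoline's final jump.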

At this point secondary_startup_64 enables PAE and PGE and sets up the early-boot 4-level page tables, just as the boot CPU does, and finally jumps to initial_code, which do_boot_cpu has changed from x86_64_start_kernel to start_secondary.
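The initial_code indirection is what lets the boot CPU and the APs share the same early startup path while ending up in different C functions. A minimal sketch of that pattern, assuming nothing beyond a function pointer that the waker rewrites before kicking the AP (every sketch_ name below is hypothetical):

```c
/* Which C entry point the shared startup path ended up in. */
enum sketch_entry {
    SKETCH_START_KERNEL = 1,     /* stands in for x86_64_start_kernel */
    SKETCH_START_SECONDARY = 2   /* stands in for start_secondary */
};

typedef enum sketch_entry (*sketch_entry_fn)(void);

enum sketch_entry sketch_boot_cpu_entry(void)  { return SKETCH_START_KERNEL; }
enum sketch_entry sketch_secondary_entry(void) { return SKETCH_START_SECONDARY; }

/* Default target, as used by the boot CPU. */
sketch_entry_fn sketch_initial_code = sketch_boot_cpu_entry;

/* Shared startup path: common setup would go here, then jump through the pointer. */
enum sketch_entry sketch_common_startup(void)
{
    return sketch_initial_code();
}

/* What the waker does before it kicks the secondary CPU. */
void sketch_prepare_secondary(void)
{
    sketch_initial_code = sketch_secondary_entry;
}
```

Calling sketch_common_startup before and after sketch_prepare_secondary returns the two different entry points, mirroring how the same assembly path serves both the boot CPU and the APs.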






