Analysis of How ART Executes Class Methods in Android 6.0 (Part 2)

Source: Internet. Published by: 易编程模块. Editor: 程序博客网. Time: 2024/05/29 15:11

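In the article "Analysis of How the Android Runtime ART Loads Classes and Methods", we used the member function start of class AndroidRuntime to analyze how classes and class methods are loaded. This article starts from the same function and analyzes how a class method is executed, as shown below: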

// frameworks/base/core/jni/AndroidRuntime.cpp
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
    ...
    /*
     * Start VM.  This thread becomes the main thread of the VM, and will
     * not return until the VM exits.
     */
    char* slashClassName = toSlashClassName(className);
    jclass startClass = env->FindClass(slashClassName);
    if (startClass == NULL) {
        ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
        /* keep going */
    } else {
        jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
            "([Ljava/lang/String;)V");
        if (startMeth == NULL) {
            ALOGE("JavaVM unable to find main() in '%s'\n", className);
            /* keep going */
        } else {
            env->CallStaticVoidMethod(startClass, startMeth, strArray);

#if 0
            if (env->ExceptionCheck())
                threadExitUncaughtException(env);
#endif
        }
    }
    ...
}

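Once the class method to call has been found, it can be executed through the JNI interface CallStaticVoidMethod.
From the analysis in "Analysis of How the Android Runtime ART Loads Classes and Methods" we know that the JNI interface CallStaticVoidMethod is implemented by the member function CallStaticVoidMethod of class JNI, as shown below: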

// art/runtime/jni_internal.cc
  static void CallStaticVoidMethod(JNIEnv* env, jclass, jmethodID mid, ...) {
    va_list ap;
    va_start(ap, mid);
    CHECK_NON_NULL_ARGUMENT_RETURN_VOID(mid);
    ScopedObjectAccess soa(env);
    InvokeWithVarArgs(soa, nullptr, mid, ap);
    va_end(ap);
  }
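The va_start / forward / va_end pattern in CallStaticVoidMethod can be tried in isolation. The following is a minimal sketch, not ART code: SumVarArgs and SumFromVaList are hypothetical names that stand in for the JNI entry point and for InvokeWithVarArgs, and the "work" is reduced to summing int arguments.

```cpp
#include <cstdarg>
#include <cstdint>

// Hypothetical stand-in for InvokeWithVarArgs: it receives an already
// started va_list and consumes `count` int arguments from it.
int64_t SumFromVaList(int count, va_list ap) {
  int64_t sum = 0;
  for (int i = 0; i < count; ++i) {
    sum += va_arg(ap, int);
  }
  return sum;
}

// Hypothetical stand-in for the JNI entry point: it captures the "..."
// arguments in a va_list, forwards them, and then ends the list.
int64_t SumVarArgs(int count, ...) {
  va_list ap;
  va_start(ap, count);
  const int64_t result = SumFromVaList(count, ap);
  va_end(ap);
  return result;
}
```

As in the ART function, the variadic wrapper does no work itself. It only turns "..." into a va_list so that a non-variadic function can walk the arguments.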

The member function CallStaticVoidMethod of class JNI in turn calls the method specified by the parameter mid through the global function InvokeWithVarArgs, as shown below:

// ~/android-6.0.1_r62/art/runtime/reflection.cc
JValue InvokeWithVarArgs(const ScopedObjectAccessAlreadyRunnable& soa, jobject obj, jmethodID mid,
                         va_list args)
    SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
  // We want to make sure that the stack is not within a small distance from the
  // protected region in case we are calling into a leaf function whose stack
  // check has been elided.
  if (UNLIKELY(__builtin_frame_address(0) < soa.Self()->GetStackEnd())) {
    ThrowStackOverflowError(soa.Self());
    return JValue();
  }

  ArtMethod* method = soa.DecodeMethod(mid);
  bool is_string_init = method->GetDeclaringClass()->IsStringClass() && method->IsConstructor();
  if (is_string_init) {
    // Replace calls to String.<init> with equivalent StringFactory call.
    method = soa.DecodeMethod(WellKnownClasses::StringInitToStringFactoryMethodID(mid));
  }
  mirror::Object* receiver = method->IsStatic() ? nullptr : soa.Decode<mirror::Object*>(obj);
  uint32_t shorty_len = 0;
  const char* shorty = method->GetInterfaceMethodIfProxy(sizeof(void*))->GetShorty(&shorty_len);
  JValue result;
  ArgArray arg_array(shorty, shorty_len);
  arg_array.BuildArgArrayFromVarArgs(soa, receiver, args);
  InvokeWithArgArray(soa, method, &arg_array, &result, shorty);
  if (is_string_init) {
    // For string init, remap original receiver to StringFactory result.
    UpdateReference(soa.Self(), obj, result.GetL());
  }
  return result;
}

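The function InvokeWithVarArgs packs the call arguments into an array and then calls another function, InvokeWithArgArray, to invoke the method specified by the parameter mid. The parameter mid is in fact a pointer to an ArtMethod object, so it can be converted to an ArtMethod pointer (this is what soa.DecodeMethod does), which gives access to the information about the method being called, including its shorty, the compact string that encodes the return type followed by the parameter types.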
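To make the "pack the arguments into an array" step concrete, here is a small sketch of how a shorty determines the size of such an array. It is not ART's ArgArray: ArgSlots and CountArgSlots are hypothetical names, and the rules encoded here (one 32-bit slot per argument, two slots for 'J' and 'D', low word first, one extra slot for the receiver of a non-static method) are stated as assumptions of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not ART's ArgArray: collects arguments as 32-bit slots.
struct ArgSlots {
  std::vector<uint32_t> slots;

  // An int, the bits of a float, or a 32-bit reference takes one slot.
  void Append(uint32_t value) { slots.push_back(value); }

  // A long or the bits of a double takes two slots, low word first
  // (assumption of this sketch).
  void AppendWide(uint64_t value) {
    slots.push_back(static_cast<uint32_t>(value));
    slots.push_back(static_cast<uint32_t>(value >> 32));
  }

  std::size_t NumBytes() const { return slots.size() * sizeof(uint32_t); }
};

// Counts the slots needed for a call. shorty[0] is the return type and is
// skipped; every following character describes one parameter.
std::size_t CountArgSlots(const std::string& shorty, bool is_static) {
  std::size_t count = is_static ? 0 : 1;  // Receiver slot for instance methods.
  for (std::size_t i = 1; i < shorty.size(); ++i) {
    count += (shorty[i] == 'J' || shorty[i] == 'D') ? 2 : 1;
  }
  return count;
}
```

For the static method main(String[]) used in this article the shorty would be "VL", which gives a single slot holding the reference to the string array.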
The function InvokeWithArgArray is implemented as follows:

// ~/android-6.0.1_r62/art/runtime/reflection.cc
static void InvokeWithArgArray(const ScopedObjectAccessAlreadyRunnable& soa,
                               ArtMethod* method, ArgArray* arg_array, JValue* result,
                               const char* shorty)
    SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
  uint32_t* args = arg_array->GetArray();
  if (UNLIKELY(soa.Env()->check_jni)) {
    CheckMethodArguments(soa.Vm(), method->GetInterfaceMethodIfProxy(sizeof(void*)), args);
  }
  method->Invoke(soa.Self(), args, arg_array->GetNumBytes(), result, shorty);
}

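The function InvokeWithArgArray calls the class method specified by the parameter method through the member function Invoke of class ArtMethod.
The member function Invoke of class ArtMethod is implemented as follows: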

void ArtMethod::Invoke(Thread* self, uint32_t* args, uint32_t args_size, JValue* result,
                       const char* shorty) {
    ...
  // Push a transition back into managed code onto the linked list in thread.
  ManagedStack fragment;
  self->PushManagedStackFragment(&fragment);

  Runtime* runtime = Runtime::Current();
  // Call the invoke stub, passing everything as arguments.
  // If the runtime is not yet started or it is required by the debugger, then perform the
  // Invocation by the interpreter.
  if (UNLIKELY(!runtime->IsStarted() || Dbg::IsForcedInterpreterNeededForCalling(self, this))) {
    if (IsStatic()) {
      art::interpreter::EnterInterpreterFromInvoke(self, this, nullptr, args, result);
    } else {
      mirror::Object* receiver =
          reinterpret_cast<StackReference<mirror::Object>*>(&args[0])->AsMirrorPtr();
      art::interpreter::EnterInterpreterFromInvoke(self, this, receiver, args + 1, result);
    }
  } else {
    DCHECK_EQ(runtime->GetClassLinker()->GetImagePointerSize(), sizeof(void*));

    constexpr bool kLogInvocationStartAndReturn = false;
    bool have_quick_code = GetEntryPointFromQuickCompiledCode() != nullptr;
    if (LIKELY(have_quick_code)) {
      if (kLogInvocationStartAndReturn) {
        LOG(INFO) << StringPrintf(
            "Invoking '%s' quick code=%p static=%d", PrettyMethod(this).c_str(),
            GetEntryPointFromQuickCompiledCode(), static_cast<int>(IsStatic() ? 1 : 0));
      }

      // Ensure that we won't be accidentally calling quick compiled code when -Xint.
      if (kIsDebugBuild && runtime->GetInstrumentation()->IsForcedInterpretOnly()) {
        DCHECK(!runtime->UseJit());
        CHECK(IsEntrypointInterpreter())
            << "Don't call compiled code when -Xint " << PrettyMethod(this);
      }

#if defined(__LP64__) || defined(__arm__) || defined(__i386__)
      if (!IsStatic()) {
        (*art_quick_invoke_stub)(this, args, args_size, self, result, shorty);
      } else {
        (*art_quick_invoke_static_stub)(this, args, args_size, self, result, shorty);
      }
#else
      (*art_quick_invoke_stub)(this, args, args_size, self, result, shorty);
#endif
      if (UNLIKELY(self->GetException() == Thread::GetDeoptimizationException())) {
        // Unusual case where we were running generated code and an
        // exception was thrown to force the activations to be removed from the
        // stack. Continue execution in the interpreter.
        self->ClearException();
        ShadowFrame* shadow_frame =
            self->PopStackedShadowFrame(StackedShadowFrameType::kDeoptimizationShadowFrame);
        result->SetJ(self->PopDeoptimizationReturnValue().GetJ());
        self->SetTopOfStack(nullptr);
        self->SetTopOfShadowStack(shadow_frame);
        interpreter::EnterInterpreterFromDeoptimize(self, shadow_frame, result);
      }
      if (kLogInvocationStartAndReturn) {
        LOG(INFO) << StringPrintf("Returned '%s' quick code=%p", PrettyMethod(this).c_str(),
                                  GetEntryPointFromQuickCompiledCode());
      }
    } else {
      LOG(INFO) << "Not invoking '" << PrettyMethod(this) << "' code=null";
      if (result != nullptr) {
        result->SetJ(0);
      }
    }
  }

  // Pop transition.
  self->PopManagedStackFragment(fragment);
}

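The execution logic of the member function Invoke of class ArtMethod is as follows:
1. Construct a call frame of type ManagedStack. These frames are kept in a linked list in the current thread object and are used during garbage collection.
2. If the ART runtime has not started yet, or the debugger requires this call to go through the interpreter, the method is not run through its compiled-code entry point. It is executed by the interpreter via EnterInterpreterFromInvoke instead. Otherwise, execution continues with the next step.
3. From the function LinkCode discussed earlier we know that, whether a class method is executed by the interpreter or directly as native machine code, its entry point can be obtained through the member function GetEntryPointFromQuickCompiledCode of class ArtMethod, and that this entry point is normally not NULL (if it is NULL, Invoke only logs a message and sets the result to 0). The entry point is not called directly, however. It is called indirectly through a stub, because some special registers have to be set up first. On 64-bit, arm, or i386 targets, a non-static method is called through art_quick_invoke_stub and a static method through art_quick_invoke_static_stub; on all other targets art_quick_invoke_stub is always used. Since we are looking at the arm64 architecture, either art_quick_invoke_stub or art_quick_invoke_static_stub is called.
4. If, after the stub returns, the thread's pending exception is the special value returned by Thread::GetDeoptimizationException() (a sentinel with the value -1), the frames of the generated native machine code have been forced off the stack for deoptimization, and execution continues in the interpreter. Every time the interpreter executes a class method it needs a call frame of type ShadowFrame. These frames are likewise used during garbage collection.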
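The branch structure of steps 2 and 3 can be summarized as a small decision function. This is a hedged sketch, not ART code: InvokePath, InvokeState and ChooseInvokePath are hypothetical names, and the boolean fields stand in for the runtime queries (IsStarted, IsForcedInterpreterNeededForCalling, GetEntryPointFromQuickCompiledCode, IsStatic) that the real function performs.

```cpp
// Hypothetical summary of which path ArtMethod::Invoke takes.
enum class InvokePath { kInterpreter, kInstanceStub, kStaticStub, kNoCode };

// Hypothetical snapshot of the conditions Invoke checks.
struct InvokeState {
  bool runtime_started;
  bool debugger_forces_interpreter;
  bool has_quick_code;
  bool is_static;
};

// arch_has_static_stub models the #if in Invoke: true for LP64, arm and
// i386 builds, false for all other targets.
InvokePath ChooseInvokePath(const InvokeState& state, bool arch_has_static_stub = true) {
  if (!state.runtime_started || state.debugger_forces_interpreter) {
    return InvokePath::kInterpreter;  // Step 2: run in the interpreter.
  }
  if (!state.has_quick_code) {
    return InvokePath::kNoCode;  // Invoke only logs and zeroes the result.
  }
  if (arch_has_static_stub && state.is_static) {
    return InvokePath::kStaticStub;  // art_quick_invoke_static_stub
  }
  return InvokePath::kInstanceStub;  // art_quick_invoke_stub
}
```

For the call to main analyzed here, the method is static and the target is arm64, so with a started runtime and a valid entry point the sketch returns kStaticStub.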
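Next we focus on step 3 and assume the target CPU architecture is ARM64, so the stubs used in step 3 are the functions art_quick_invoke_stub and art_quick_invoke_static_stub. The frame-setup macro they share and the implementation of art_quick_invoke_stub are as follows: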

// art/runtime/arch/arm64/quick_entrypoints_arm64.S
.macro INVOKE_STUB_CREATE_FRAME

SAVE_SIZE=15*8   // x4, x5, x19, x20, x21, x22, x23, x24, x25, x26, x27, x28, SP, LR, FP saved.
SAVE_SIZE_AND_METHOD=SAVE_SIZE+8


    mov x9, sp                             // Save stack pointer.
    .cfi_register sp,x9

    add x10, x2, # SAVE_SIZE_AND_METHOD    // calculate size of frame.
    sub x10, sp, x10                       // Calculate SP position - saves + ArtMethod* + args
    and x10, x10, # ~0xf                   // Enforce 16 byte stack alignment.
    mov sp, x10                            // Set new SP.

    sub x10, x9, #SAVE_SIZE                // Calculate new FP (later). Done here as we must move SP
    .cfi_def_cfa_register x10              // before this.
    .cfi_adjust_cfa_offset SAVE_SIZE

    str x28, [x10, #112]
    .cfi_rel_offset x28, 112

    stp x26, x27, [x10, #96]
    .cfi_rel_offset x26, 96
    .cfi_rel_offset x27, 104

    stp x24, x25, [x10, #80]
    .cfi_rel_offset x24, 80
    .cfi_rel_offset x25, 88

    stp x22, x23, [x10, #64]
    .cfi_rel_offset x22, 64
    .cfi_rel_offset x23, 72

    stp x20, x21, [x10, #48]
    .cfi_rel_offset x20, 48
    .cfi_rel_offset x21, 56

    stp x9, x19, [x10, #32]                // Save old stack pointer and x19.
    .cfi_rel_offset sp, 32
    .cfi_rel_offset x19, 40

    stp x4, x5, [x10, #16]                 // Save result and shorty addresses.
    .cfi_rel_offset x4, 16
    .cfi_rel_offset x5, 24

    stp xFP, xLR, [x10]                    // Store LR & FP.
    .cfi_rel_offset x29, 0
    .cfi_rel_offset x30, 8

    mov xFP, x10                           // Use xFP now, as it's callee-saved.
    .cfi_def_cfa_register x29
    mov xSELF, x3                          // Move thread pointer into SELF register.

    // Copy arguments into stack frame.
    // Use simple copy routine for now.
    // 4 bytes per slot.
    // X1 - source address
    // W2 - args length
    // X9 - destination address.
    // W10 - temporary
    add x9, sp, #8                         // Destination address is bottom of stack + null.

    // Use \@ to differentiate between macro invocations.
.LcopyParams\@:
    cmp w2, #0
    beq .LendCopyParams\@
    sub w2, w2, #4      // Need 65536 bytes of range.
    ldr w10, [x1, x2]
    str w10, [x9, x2]

    b .LcopyParams\@

.LendCopyParams\@:

    // Store null into ArtMethod* at bottom of frame.
    str xzr, [sp]
.endm

...

/*
 *  extern"C" void art_quick_invoke_stub(ArtMethod *method,   x0
 *                                       uint32_t  *args,     x1
 *                                       uint32_t argsize,    w2
 *                                       Thread *self,        x3
 *                                       JValue *result,      x4
 *                                       char   *shorty);     x5
 *  +----------------------+
 *  |                      |
 *  |  C/C++ frame         |
 *  |       LR''           |
 *  |       FP''           | <- SP'
 *  +----------------------+
 *  +----------------------+
 *  |        x28           | <- TODO: Remove callee-saves.
 *  |         :            |
 *  |        x19           |
 *  |        SP'           |
 *  |        X5            |
 *  |        X4            |        Saved registers
 *  |        LR'           |
 *  |        FP'           | <- FP
 *  +----------------------+
 *  | uint32_t out[n-1]    |
 *  |    :      :          |        Outs
 *  | uint32_t out[0]      |
 *  | ArtMethod*           | <- SP  value=null
 *  +----------------------+
 *
 * Outgoing registers:
 *  x0    - Method*
 *  x1-x7 - integer parameters.
 *  d0-d7 - Floating point parameters.
 *  xSELF = self
 *  SP = & of ArtMethod*
 *  x1 = "this" pointer.
 *
 */
ENTRY art_quick_invoke_stub
    // Spill registers as per AACPS64 calling convention.
    INVOKE_STUB_CREATE_FRAME

    // Fill registers x/w1 to x/w7 and s/d0 to s/d7 with parameters.
    // Parse the passed shorty to determine which register to load.
    // Load addresses for routines that load WXSD registers.
    adr  x11, .LstoreW2
    adr  x12, .LstoreX2
    adr  x13, .LstoreS0
    adr  x14, .LstoreD0

    // Initialize routine offsets to 0 for integers and floats.
    // x8 for integers, x15 for floating point.
    mov x8, #0
    mov x15, #0

    add x10, x5, #1         // Load shorty address, plus one to skip return value.
    ldr w1, [x9],#4         // Load "this" parameter, and increment arg pointer.

    // Loop to fill registers.
.LfillRegisters:
    ldrb w17, [x10], #1       // Load next character in signature, and increment.
    cbz w17, .LcallFunction   // Exit at end of signature. Shorty 0 terminated.

    cmp  w17, #'F' // is this a float?
    bne .LisDouble

    cmp x15, # 8*12         // Skip this load if all registers full.
    beq .Ladvance4

    add x17, x13, x15       // Calculate subroutine to jump to.
    br  x17

.LisDouble:
    cmp w17, #'D'           // is this a double?
    bne .LisLong

    cmp x15, # 8*12         // Skip this load if all registers full.
    beq .Ladvance8

    add x17, x14, x15       // Calculate subroutine to jump to.
    br x17

.LisLong:
    cmp w17, #'J'           // is this a long?
    bne .LisOther

    cmp x8, # 6*12          // Skip this load if all registers full.
    beq .Ladvance8

    add x17, x12, x8        // Calculate subroutine to jump to.
    br x17

.LisOther:                  // Everything else takes one vReg.
    cmp x8, # 6*12          // Skip this load if all registers full.
    beq .Ladvance4

    add x17, x11, x8        // Calculate subroutine to jump to.
    br x17

.Ladvance4:
    add x9, x9, #4
    b .LfillRegisters

.Ladvance8:
    add x9, x9, #8
    b .LfillRegisters

// Macro for loading a parameter into a register.
//  counter - the register with offset into these tables
//  size - the size of the register - 4 or 8 bytes.
//  register - the name of the register to be loaded.
.macro LOADREG counter size register return
    ldr \register , [x9], #\size
    add \counter, \counter, 12
    b \return
.endm

// Store ints.
.LstoreW2:
    LOADREG x8 4 w2 .LfillRegisters
    LOADREG x8 4 w3 .LfillRegisters
    LOADREG x8 4 w4 .LfillRegisters
    LOADREG x8 4 w5 .LfillRegisters
    LOADREG x8 4 w6 .LfillRegisters
    LOADREG x8 4 w7 .LfillRegisters

// Store longs.
.LstoreX2:
    LOADREG x8 8 x2 .LfillRegisters
    LOADREG x8 8 x3 .LfillRegisters
    LOADREG x8 8 x4 .LfillRegisters
    LOADREG x8 8 x5 .LfillRegisters
    LOADREG x8 8 x6 .LfillRegisters
    LOADREG x8 8 x7 .LfillRegisters

// Store singles.
.LstoreS0:
    LOADREG x15 4 s0 .LfillRegisters
    LOADREG x15 4 s1 .LfillRegisters
    LOADREG x15 4 s2 .LfillRegisters
    LOADREG x15 4 s3 .LfillRegisters
    LOADREG x15 4 s4 .LfillRegisters
    LOADREG x15 4 s5 .LfillRegisters
    LOADREG x15 4 s6 .LfillRegisters
    LOADREG x15 4 s7 .LfillRegisters

// Store doubles.
.LstoreD0:
    LOADREG x15 8 d0 .LfillRegisters
    LOADREG x15 8 d1 .LfillRegisters
    LOADREG x15 8 d2 .LfillRegisters
    LOADREG x15 8 d3 .LfillRegisters
    LOADREG x15 8 d4 .LfillRegisters
    LOADREG x15 8 d5 .LfillRegisters
    LOADREG x15 8 d6 .LfillRegisters
    LOADREG x15 8 d7 .LfillRegisters


.LcallFunction:

    INVOKE_STUB_CALL_AND_RETURN

END art_quick_invoke_stub

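The comment in front of the function art_quick_invoke_stub lists the values of registers X0-X5 when the function is called, together with the layout of the stack frame that the stub builds. X0 points to the class method being called, X1 points to the argument array, W2 holds the size of the argument array in bytes, and X3 points to the current thread. X4 points to the JValue that receives the call result, and X5 points to the method's shorty, which describes the return type and the parameter types. Both X4 and X5 are passed in registers and spilled into the frame by the stub.
Whether a class method is executed by the interpreter or directly as native machine code, it is entered under a special calling convention. Register xSELF (x18) holds the address of the Thread object that describes the calling thread, so that native machine code can use it to locate thread-related information, such as the function jump tables described earlier. Register x0 holds the address of the ArtMethod object that describes the method being called. Descriptions of the 32-bit ARM stub also mention register r4 being initialized to a counter, with the thread checking whether it has been suspended when the counter reaches 0; the ARM64 macro listed above contains no corresponding instruction.
All arguments passed to the called method are stored on the call stack. Before entering the method's entry point, the stub therefore reserves enough space on the stack and copies all arguments into it (the ARM64 stub does this with a simple inline copy loop rather than a call to memcpy). In addition, the leading arguments are loaded into registers: for an instance method x1 holds the this pointer, further integer and reference arguments go into x2-x7, and floating-point arguments go into s0-s7 or d0-d7. A method whose arguments all fit into these registers can read them quickly without touching the stack.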
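The register-filling loop of art_quick_invoke_stub can be mirrored in a few lines of C++ to see where each argument ends up. This is only a sketch of the listing above: AssignInstanceArgRegisters is a hypothetical name, the receiver in x1 is left out of the result, and "stack-only" marks an argument that finds all registers of its class full and is therefore available only in the stack copy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// For an instance method, returns the register each declared parameter is
// loaded into, following the .LfillRegisters loop: 'F' uses s<n>, 'D' uses
// d<n> (both share one counter), 'J' uses x<n>, everything else uses w<n>
// (both share another counter that starts at register 2).
std::vector<std::string> AssignInstanceArgRegisters(const std::string& shorty) {
  constexpr int kIntRegs = 6;  // w2/x2 .. w7/x7; x1 already holds "this".
  constexpr int kFpRegs = 8;   // s0/d0 .. s7/d7.
  int next_int = 0;
  int next_fp = 0;
  std::vector<std::string> where;
  for (std::size_t i = 1; i < shorty.size(); ++i) {  // shorty[0] is the return type.
    const char c = shorty[i];
    const bool is_fp = (c == 'F' || c == 'D');
    int& next = is_fp ? next_fp : next_int;
    const int limit = is_fp ? kFpRegs : kIntRegs;
    if (next == limit) {
      where.push_back("stack-only");  // .Ladvance4 / .Ladvance8: skip the load.
      continue;
    }
    const char* prefix = (c == 'F') ? "s" : (c == 'D') ? "d" : (c == 'J') ? "x" : "w";
    const int number = is_fp ? next : next + 2;
    where.push_back(prefix + std::to_string(number));
    ++next;
  }
  return where;
}
```

For a shorty such as "VIJFD" the sketch yields w2, x3, s0, d1, which matches walking the jump tables .LstoreW2, .LstoreX2, .LstoreS0 and .LstoreD0 with offsets that grow by 12 bytes (three instructions) per loaded register.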
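Note that the arguments passed to the called method are not stored starting at the first slot at the top of the stack (one slot here is one machine word, that is, 8 bytes) but at the second slot, sp + 8. The first slot at the top of the stack is reserved for the address of the ArtMethod object that describes the calling class method (the caller). Because the function art_quick_invoke_stub is used to enter the ART runtime from outside, there is no calling class method, so this first slot is set to NULL.
Once the call frame is ready, the stub reads the value at offset ART_METHOD_QUICK_CODE_OFFSET_64 from the address of the ArtMethod object that describes the method being called, uses that value as the method's entry point, and jumps to it with the blr instruction. This happens in the INVOKE_STUB_CALL_AND_RETURN macro, whose body is not listed here.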

// In ~/android-6.0.1_r62/art/runtime/asm_support.h:
#define ART_METHOD_QUICK_CODE_OFFSET_64 48
ADD_TEST_EQ(ART_METHOD_QUICK_CODE_OFFSET_64,
            art::ArtMethod::EntryPointFromQuickCompiledCodeOffset(8).Int32Value())
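What "load the entry point from a fixed byte offset" means can be illustrated with a mock object. The struct below is not ART's ArtMethod: MockArtMethod, its field names and its layout are assumptions chosen only so that the last field lands at byte 48, the value of ART_METHOD_QUICK_CODE_OFFSET_64 shown above. LoadQuickEntryPoint is likewise a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical mock, laid out so that the quick-code entry point sits at
// byte offset 48. The real ArtMethod layout is not reproduced here.
struct MockArtMethod {
  uint32_t small_fields[7];               // 28 bytes of 32-bit fields (illustrative).
  uint32_t padding;                       // Pads the pointer-sized fields to offset 32.
  uint64_t entry_point_from_interpreter;  // Offset 32.
  uint64_t entry_point_from_jni;          // Offset 40.
  uint64_t entry_point_from_quick_code;   // Offset 48.
};

constexpr std::size_t kQuickCodeOffset64 = 48;
static_assert(offsetof(MockArtMethod, entry_point_from_quick_code) == kQuickCodeOffset64,
              "mock layout must place the quick entry point at offset 48");

// Reads the 8 bytes at the fixed offset, the way assembly code that only
// knows the numeric offset would.
uint64_t LoadQuickEntryPoint(const MockArtMethod* method) {
  uint64_t entry = 0;
  std::memcpy(&entry,
              reinterpret_cast<const unsigned char*>(method) + kQuickCodeOffset64,
              sizeof(entry));
  return entry;
}
```

The static_assert plays the role that ADD_TEST_EQ plays in asm_support.h: it fails the build if the constant used by the assembly and the C++ layout ever disagree.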

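Some notes on the INVOKE_STUB_CREATE_FRAME macro follow:
INVOKE_STUB_CREATE_FRAME is a macro that spills registers according to the AAPCS64 calling convention (the comment in the source spells it "AACPS64").
1. Determine the size of the data to save: SAVE_SIZE=15*8 and SAVE_SIZE_AND_METHOD=SAVE_SIZE+8.
2. Save the stack pointer by copying the contents of register sp into register x9.
Note: assembler directives are not executed by the CPU; they only guide assembly and linking. For example, the directives in the code that begin with ".cfi" help the assembler emit call frame information for the stack frame.
3. Compute the size of the stack frame: add SAVE_SIZE_AND_METHOD to the contents of register x2 (the argument size) and store the result in x10.
Compute the new position of sp (saves + ArtMethod* + args): subtract x10 from sp and store the result in x10.
Force 16-byte stack alignment by clearing the low four bits of x10.
Copy x10 into sp, which sets the new stack pointer.
Compute the new FP: subtract SAVE_SIZE from x9 (the original stack pointer) and store the result in x10.
4. Store the registers into the frame. All offsets are relative to x10, the value that becomes the new FP, not to sp:
x28 is stored at x10+112, x27 at x10+104, x26 at x10+96, ..., x19 at x10+40, x9 (the original sp) at x10+32, x5 (the shorty address) at x10+24, x4 (the result address) at x10+16, x30 (LR) at x10+8, and x29 (FP) at x10+0. Then x10 is copied into xFP (x29) and the value in x3 is copied into xSELF (x18).
5. Copy the arguments into the stack frame. Each slot is 4 bytes; x1 is the address of args, w2 is the length of args in bytes, x9 is the destination address, and w10 is a temporary.
Store sp + 8 in x9.
Enter the loop and test whether w2 is 0. If it is, jump to .LendCopyParams\@, where xzr (0) is stored at [sp].
Otherwise set w2 = w2 - 4, load the word at address x1 + x2 (the args address plus the remaining byte offset) into w10, and store w10 at address x9 + x2. This repeats until w2 is 0, so the arguments are copied from the last slot to the first.
The resulting layout looks like this: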

+----------------------+
|                      |
|  C/C++ frame         |
|       LR''           |
|       FP''           | <- SP'
+----------------------+
+----------------------+
|        x28           | <- TODO: Remove callee-saves.
|         :            |
|        x19           |
|        SP'           |
|        X5            |
|        X4            |        Saved registers
|        LR'           |
|        FP'           | <- FP
+----------------------+
| uint32_t out[n-1]    |
|    :      :          |        Outs
| uint32_t out[0]      |
| ArtMethod*           | <- SP  value=null
+----------------------+
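The address arithmetic of steps 3 and 5 can be checked with plain integers. The following sketch mirrors the instructions of INVOKE_STUB_CREATE_FRAME; StubFrame, ComputeStubFrame and CopyArgs are hypothetical names, and addresses are modeled as uint64_t values rather than real stack memory.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kSaveSize = 15 * 8;               // SAVE_SIZE
constexpr uint64_t kSaveSizeAndMethod = kSaveSize + 8;  // SAVE_SIZE_AND_METHOD

// Hypothetical result type: the addresses the macro ends up with.
struct StubFrame {
  uint64_t sp;          // New stack pointer; holds the null ArtMethod* slot.
  uint64_t fp;          // New frame pointer; base of the saved registers.
  uint64_t args_begin;  // First argument slot (sp + 8).
};

StubFrame ComputeStubFrame(uint64_t old_sp, uint32_t args_size_bytes) {
  const uint64_t frame_size = args_size_bytes + kSaveSizeAndMethod;  // add x10, x2, #...
  const uint64_t new_sp = (old_sp - frame_size) & ~uint64_t{0xf};    // sub, then and ~0xf
  const uint64_t new_fp = old_sp - kSaveSize;                        // sub x10, x9, #SAVE_SIZE
  return StubFrame{new_sp, new_fp, new_sp + 8};
}

// Mirrors the .LcopyParams loop: the byte offset counts down by 4, so the
// words are copied from the last slot to the first.
void CopyArgs(const uint32_t* src, uint32_t* dst, uint32_t size_bytes) {
  assert(size_bytes % 4 == 0);
  while (size_bytes != 0) {
    size_bytes -= 4;
    dst[size_bytes / 4] = src[size_bytes / 4];
  }
}
```

With an original stack pointer of 0x1000 and 12 bytes of arguments, the sketch gives a frame size of 140, a new sp of 3952 after alignment, a new FP of 3976 and a first argument slot at 3960, so the three argument words end at 3972, below the saved registers.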
