ART Exception Handling (1)

Source: Internet  Editor: 程序博客网  Time: 2024/05/17 06:27

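This article covers exception handling in ART: how ART intercepts and handles the SIGSEGV signal, how the implicit suspend check is implemented, and how ordinary Java exceptions are detected and thrown in ART. The detection and throwing of StackOverflowError and NullPointerException, and the implementation of throw/catch, are more involved. I started writing everything in one article, but it grew too long, so these three topics were split out into separate articles:

ART Exception Handling (2) - StackOverflowError implementation

ART Exception Handling (3) - NullPointerException implementation

ART Exception Handling (4) - throw & catch & finally implementation

In fact, two of the main kinds of exception in ART are implemented by deliberately raising a SIGSEGV signal and then intercepting it. So we first need to understand how ART intercepts and processes SIGSEGV.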

1. FaultManager initialization

ART handles Linux signals through FaultManager. A given signal is first handled by ART's signal handler art_fault_handler; when ART recognizes the situation, it converts the signal into a Java exception that Java developers can understand and handle.
Let's look at how FaultManager is implemented. FaultManager is initialized while the virtual machine starts up, so Java exceptions can be handled as soon as startup completes.
In the Runtime::Init function in runtime.cc:

    bool Runtime::Init(RuntimeArgumentMap&& runtime_options_in) {
      ...
      fault_manager.Init();

      if (implicit_suspend_checks_) {
        new SuspensionHandler(&fault_manager);
      }
      if (implicit_so_checks_) {
        new StackOverflowHandler(&fault_manager);
      }
      if (implicit_null_checks_) {
        new NullPointerHandler(&fault_manager);
      }
      if (kEnableJavaStackTraceHandler) {
        new JavaStackTraceHandler(&fault_manager);
      }
      ...
    }

fault_manager.Init() installs ART's handler through the signal chain (the code below registers it for SIGSEGV only); the special handling of this signal is what implements the Java exceptions discussed here. Each of the handlers created afterwards adds itself, in its constructor, to fault_manager's set of handlers so that it can process specific faults later.

    void FaultManager::Init() {
      CHECK(!initialized_);
      sigset_t mask;
      sigfillset(&mask);
      sigdelset(&mask, SIGABRT);
      sigdelset(&mask, SIGBUS);
      sigdelset(&mask, SIGFPE);
      sigdelset(&mask, SIGILL);
      sigdelset(&mask, SIGSEGV);

      SigchainAction sa = {
        .sc_sigaction = art_fault_handler,
        .sc_mask = mask,
        .sc_flags = 0UL,
      };

      AddSpecialSignalHandlerFn(SIGSEGV, &sa);
      initialized_ = true;
    }

At this point mask contains every signal except SIGABRT, SIGBUS, SIGFPE, SIGILL and SIGSEGV. It is stored in sc_mask, which SignalChain::Handler (shown later) applies with sigprocmask while art_fault_handler runs.
AddSpecialSignalHandlerFn, however, is called only for SIGSEGV.
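To make the mask concrete, here is a minimal sketch that rebuilds the same signal set with the POSIX sigset API and checks membership. The helper names BuildFaultHandlerMask and IsBlockedDuringHandler are made up for this illustration and are not part of ART.

```cpp
#include <signal.h>

// Hypothetical helper: builds the same mask that FaultManager::Init() stores in sc_mask.
// Every signal is in the set except the five that are removed below.
sigset_t BuildFaultHandlerMask() {
  sigset_t mask;
  sigfillset(&mask);
  sigdelset(&mask, SIGABRT);
  sigdelset(&mask, SIGBUS);
  sigdelset(&mask, SIGFPE);
  sigdelset(&mask, SIGILL);
  sigdelset(&mask, SIGSEGV);
  return mask;
}

// Hypothetical helper: true if `signo` is a member of the mask above.
bool IsBlockedDuringHandler(int signo) {
  sigset_t mask = BuildFaultHandlerMask();
  return sigismember(&mask, signo) == 1;
}
```

The five signals left out of the mask are the synchronous fault signals plus SIGABRT, so a second fault raised while the handler is running is still delivered instead of being blocked.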
    extern "C" void AddSpecialSignalHandlerFn(int signal, SigchainAction* sa) {
      InitializeSignalChain();

      if (signal <= 0 || signal >= _NSIG) {
        fatal("Invalid signal %d", signal);
      }

      // Set the managed_handler.
      chains[signal].AddSpecialHandler(sa);
      chains[signal].Claim(signal);
    }

chains is an array with one entry per Linux signal: static SignalChain chains[_NSIG];
InitializeSignalChain looks up the next definitions of sigaction and sigprocmask in link order (that is, libc's, not libsigchain's own) for later use:

    __attribute__((constructor)) static void InitializeSignalChain() {
      ...
      void* linked_sigaction = dlsym(RTLD_NEXT, "sigaction");
      void* linked_sigprocmask = dlsym(RTLD_NEXT, "sigprocmask");
      ...
    }

If I remember correctly, earlier Android versions added libsigchain.so to the load path through LD_PRELOAD in init.rc. This seems to have changed later, and I have not looked into the new arrangement.
The sigchain lib implements its own sigaction and sigprocmask. By loading libsigchain.so ahead of libc through something like LD_PRELOAD, calls to sigaction and sigprocmask resolve to libsigchain's versions instead of libc's.
dlsym(RTLD_NEXT, "***") is used here to obtain pointers to libc's versions of these two functions for later use.
Put simply, this hooks the two libc functions, so every caller ends up in the sigaction and sigprocmask implemented by libsigchain.

    extern "C" int sigaction(int signal, const struct sigaction* new_action,
                             struct sigaction* old_action) {
      InitializeSignalChain();

      if (signal < 0 || signal >= _NSIG) {
        errno = EINVAL;
        return -1;
      }

      if (chains[signal].IsClaimed()) {
        struct sigaction saved_action = chains[signal].GetAction();
        if (new_action != nullptr) {
          chains[signal].SetAction(new_action);
        }
        if (old_action != nullptr) {
          *old_action = saved_action;
        }
        return 0;
      }

      // Will only get here if the signal chain has not been claimed.  We want
      // to pass the sigaction on to the kernel via the real sigaction in libc.
      return linked_sigaction(signal, new_action, old_action);
    }

As the code shows, any handler installed through sigaction goes through libsigchain's sigaction. If the signal is one we care about (a claimed signal), the new action is not installed in the kernel. It is stored in the action_ member of that signal's SignalChain, and the previously saved action is returned through old_action. If the signal is not one we care about, the call goes on to libc's sigaction, which really installs the new action in the kernel.
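The control flow of the interposed sigaction can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: FakeKernel, SignalChainModel and SigchainModel are invented names, handlers are plain function pointers rather than struct sigaction, and nothing here touches real signal state.

```cpp
#include <array>
#include <cstddef>

using Handler = void (*)(int);

// Example application handler used to exercise the model.
void AppHandler(int) {}

// Stand-in for the kernel's per-signal handler table (assumption for illustration).
struct FakeKernel {
  std::array<Handler, 64> installed{};
};

// Stand-in for libsigchain's SignalChain: a claimed flag plus the saved action.
class SignalChainModel {
 public:
  void Claim() { claimed_ = true; }
  bool IsClaimed() const { return claimed_; }
  Handler GetAction() const { return action_; }
  void SetAction(Handler handler) { action_ = handler; }

 private:
  bool claimed_ = false;
  Handler action_ = nullptr;
};

struct SigchainModel {
  FakeKernel kernel;
  std::array<SignalChainModel, 64> chains;

  // Mirrors the branch structure of the hooked sigaction(): returns the previous handler.
  Handler Sigaction(std::size_t signo, Handler new_handler) {
    SignalChainModel& chain = chains[signo];
    if (chain.IsClaimed()) {
      Handler saved = chain.GetAction();
      chain.SetAction(new_handler);  // recorded only; the kernel table is untouched
      return saved;
    }
    Handler old = kernel.installed[signo];
    kernel.installed[signo] = new_handler;  // unclaimed signals go to the "real" sigaction
    return old;
  }
};
```

After a signal is claimed, an application can keep calling Sigaction and see consistent old/new handlers, yet the handler actually installed for that signal never changes.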
signal() is hooked as well, for the same purpose as sigaction. The only difference is that on the default path it does not call libc's signal() but again uses libc's sigaction.
The sigprocmask implemented in sigchainlib is similar: when a program calls sigprocmask with SIG_BLOCK to block a signal we care about, that signal is removed from the mask so that our functionality is not affected.
Next come chains[signal].AddSpecialHandler(sa) and Claim(signal):

    SigchainAction special_handlers_[2];

    void AddSpecialHandler(SigchainAction* sa) {
      for (SigchainAction& slot : special_handlers_) {
        if (slot.sc_sigaction == nullptr) {
          slot = *sa;
          return;
        }
      }

      fatal("too many special signal handlers");
    }

This copies the SigchainAction initialized in FaultManager::Init (with sc_sigaction = art_fault_handler) into the first free slot of special_handlers_; the array has room for two special handlers.
Claim(SIGSEGV):

    void Claim(int signo) {
      if (!claimed_) {
        Register(signo);
        claimed_ = true;
      }
    }

    void Register(int signo) {
      struct sigaction handler_action = {};
      handler_action.sa_sigaction = SignalChain::Handler;
      handler_action.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK;
      sigfillset(&handler_action.sa_mask);
      linked_sigaction(signo, &handler_action, &action_);
    }

When Claim registers the signal, it calls the linked_sigaction obtained earlier, which is libc's sigaction(). This installs void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) as the handler for SIGSEGV.

To sum up, fault_manager.Init() uses the sigchain functions to make SignalChain::Handler the handler for SIGSEGV, and libsigchain hooks four libc functions, sigaction(), sigprocmask(), signal() and (on 32-bit) bsd_signal(), so that other code cannot undo this setup.

SignalChain::Handler() first calls art_fault_handler to try to handle SIGSEGV. If that fails, it falls back to the saved sigaction (the default one, or the one installed by the application).
Back in Runtime::Init(): the statements after fault_manager.Init(), such as new SuspensionHandler(&fault_manager), register each handler with FaultManager in its constructor. The handlers for generated code go into the generated_code_handlers_ collection (JavaStackTraceHandler goes into other_handlers_, as shown later). art_fault_handler later uses these handlers to try to handle SIGSEGV.
For example:

    NullPointerHandler::NullPointerHandler(FaultManager* manager) : FaultHandler(manager) {
      manager_->AddHandler(this, true);
    }

The second argument passed is true:

    void FaultManager::AddHandler(FaultHandler* handler, bool generated_code) {
      DCHECK(initialized_);
      if (generated_code) {
        generated_code_handlers_.push_back(handler);
      } else {
        other_handlers_.push_back(handler);
      }
    }

At this point NullPointerHandler has been added to generated_code_handlers_.
Now look at art_fault_handler:

    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      ...
      if (IsInGeneratedCode(info, context, true)) {
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            return true;
          }
        }
      ...
    }

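Summary: FaultManager initialization accomplishes three things:
  1. It makes sure SIGSEGV is always handled by ART first.
  2. When ART handles SIGSEGV, art_fault_handler tries the generated_code_handlers_ first.
  3. It adds NullPointerHandler and the other handlers to generated_code_handlers_.
So, overall, the Java exceptions covered by these handlers (such as StackOverflowError and NullPointerException) are implemented in ART through the SIGSEGV signal. Exceptions detected by explicit checks, such as the ArrayIndexOutOfBoundsException in section 7, do not involve a signal.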

2. How ART handles the SIGSEGV signal

As shown above, SIGSEGV is handled by SignalChain::Handler:

    void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) {
      if (!GetHandlingSignal()) {
        for (const auto& handler : chains[signo].special_handlers_) {
          if (handler.sc_sigaction == nullptr) {
            break;
          }

          bool handler_noreturn = (handler.sc_flags & SIGCHAIN_ALLOW_NORETURN);
          sigset_t previous_mask;
          linked_sigprocmask(SIG_SETMASK, &handler.sc_mask, &previous_mask);

          ScopedHandlingSignal restorer;
          if (!handler_noreturn) {
            SetHandlingSignal(true);
          }

          if (handler.sc_sigaction(signo, siginfo, ucontext_raw)) {
            return;
          }

          linked_sigprocmask(SIG_SETMASK, &previous_mask, nullptr);
        }
      }

      // Forward to the user's signal handler.
      int handler_flags = chains[signo].action_.sa_flags;
      ucontext_t* ucontext = static_cast<ucontext_t*>(ucontext_raw);
      sigset_t mask;
      sigorset(&mask, &ucontext->uc_sigmask, &chains[signo].action_.sa_mask);
      if (!(handler_flags & SA_NODEFER)) {
        sigaddset(&mask, signo);
      }
      linked_sigprocmask(SIG_SETMASK, &mask, nullptr);

      if ((handler_flags & SA_SIGINFO)) {
        chains[signo].action_.sa_sigaction(signo, siginfo, ucontext_raw);
      } else {
        auto handler = chains[signo].action_.sa_handler;
        if (handler == SIG_IGN) {
          return;
        } else if (handler == SIG_DFL) {
          fatal("exiting due to SIG_DFL handler for signal %d", signo);
        } else {
          handler(signo);
        }
      }
    }

What this function does:
  1. If the current thread is not already handling a signal, it tries the sc_sigaction of the special handlers, i.e. it lets art_fault_handler try to handle SIGSEGV.
  2. If art_fault_handler can handle the signal, the function returns once it is done.
  3. Otherwise it invokes the saved action (action_) stored in the SignalChain for SIGSEGV. If the application installed a handler for this signal, that handler is called; if not, the sigaction set up by the linker should be used, and the signal eventually reaches debuggerd.
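The three steps above can be sketched as a tiny dispatcher. This is a simplified model and not the libsigchain code: ChainModel, Dispatch and g_handling_signal are names invented here, std::function replaces the real sigaction structures, and signal masks are left out entirely.

```cpp
#include <functional>
#include <string>
#include <vector>

// Simplified stand-in for one SignalChain entry.
struct ChainModel {
  std::vector<std::function<bool(int)>> special_handlers;  // e.g. art_fault_handler
  std::function<void(int)> saved_action;                   // handler saved from the app, may be empty
};

// Per-thread flag that prevents re-entering the special handlers.
thread_local bool g_handling_signal = false;

// Returns which path dealt with the signal: "special", "saved" or "default".
std::string Dispatch(const ChainModel& chain, int signo) {
  if (!g_handling_signal) {
    g_handling_signal = true;
    bool handled = false;
    for (const auto& special : chain.special_handlers) {
      if (special(signo)) {
        handled = true;
        break;
      }
    }
    g_handling_signal = false;
    if (handled) {
      return "special";
    }
  }
  if (chain.saved_action) {
    chain.saved_action(signo);  // forward to the application's handler
    return "saved";
  }
  return "default";  // in the source above, a SIG_DFL action ends in fatal()
}
```

The point of the ordering is that the application's handler only ever sees the faults that ART declined to handle.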

So normally, when SIGSEGV arrives, control first reaches this function and then art_fault_handler:

    // Signal handler called on SIGSEGV.
    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      VLOG(signals) << "Handling fault";

    #ifdef TEST_NESTED_SIGNAL
      // Simulate a crash in a handler.
      raise(SIGSEGV);
    #endif

      if (IsInGeneratedCode(info, context, true)) {
        VLOG(signals) << "in generated code, looking for handler";
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            // We have handled a signal so it's time to return from the
            // signal handler to the appropriate place.
            return true;
          }
        }

        // We hit a signal we didn't handle.  This might be something for which
        // we can give more information about so call all registered handlers to
        // see if it is.
        if (HandleFaultByOtherHandlers(sig, info, context)) {
          return true;
        }
      }

      // Set a breakpoint in this function to catch unhandled signals.
      art_sigsegv_fault();
      return false;
    }

As shown, HandleFault first uses IsInGeneratedCode() to decide whether the SIGSEGV occurred in generated code, that is, in native code compiled from Java code. Only if it did are generated_code_handlers_ and then the other handlers tried, in order.

    bool FaultManager::IsInGeneratedCode(siginfo_t* siginfo, void* context, bool check_dex_pc) {
      ...
      ThreadState state = thread->GetState();
      if (state != kRunnable) {
        return false;
      }

      if (!Locks::mutator_lock_->IsSharedHeld(thread)) {
        return false;
      }

      GetMethodAndReturnPcAndSp(siginfo, context, &method_obj, &return_pc, &sp);
      const OatQuickMethodHeader* method_header = method_obj->GetOatQuickMethodHeader(return_pc);
      uint32_t dexpc = method_header->ToDexPc(method_obj, return_pc, false);
      return !check_dex_pc || dexpc != DexFile::kDexNoIndex;
    }
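Only the key code of this function is shown here. To decide whether the fault is in generated code:
  1. If the SIGSEGV occurred in generated code, the current thread must be in the kRunnable state and must hold mutator_lock_.
  2. The function tries to find the ArtMethod for the current context; if the fault occurred in generated code, this must succeed.
  3. From the ArtMethod it looks up the dex_pc for the faulting location; if the fault occurred in generated code, this should succeed as well.
Assume the SIGSEGV being handled did occur in generated code. Next, the Action functions of generated_code_handlers_ and then of the other handlers are tried in order.
generated_code_handlers_ contains these handlers, in this order:

    if (implicit_suspend_checks_) {
      new SuspensionHandler(&fault_manager);
    }
    if (implicit_so_checks_) {
      new StackOverflowHandler(&fault_manager);
    }
    if (implicit_null_checks_) {
      new NullPointerHandler(&fault_manager);
    }

other_handlers_ contains just one handler:

    if (kEnableJavaStackTraceHandler) {
      new JavaStackTraceHandler(&fault_manager);
    }

So when a SIGSEGV arrives, these four handlers are tried with a priority, which is simply their order.
Each handler's Action function reads some information from the current context and checks whether it matches what that handler expects. If it matches, the handler can process this SIGSEGV and returns true, and the later handlers are skipped; otherwise the next handler is tried. If none of them can handle it, the signal goes to the default handler and eventually reaches debuggerd.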
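The registration and priority order described above can be modeled in a few lines. This is a sketch under assumptions: FaultModel, CauseHandler and FaultManagerModel are invented for illustration, and a string "cause" stands in for the register and instruction inspection that the real Action functions perform.

```cpp
#include <string>
#include <utility>
#include <vector>

// Simplified description of a fault; the real handlers inspect siginfo_t and the ucontext.
struct FaultModel {
  bool in_generated_code;
  std::string cause;
};

class FaultHandlerModel {
 public:
  virtual ~FaultHandlerModel() = default;
  virtual bool Action(const FaultModel& fault) = 0;
};

// Hypothetical handler that claims exactly one kind of fault.
class CauseHandler : public FaultHandlerModel {
 public:
  explicit CauseHandler(std::string cause) : cause_(std::move(cause)) {}
  bool Action(const FaultModel& fault) override { return fault.cause == cause_; }

 private:
  std::string cause_;
};

class FaultManagerModel {
 public:
  void AddHandler(FaultHandlerModel* handler, bool generated_code) {
    (generated_code ? generated_code_handlers_ : other_handlers_).push_back(handler);
  }

  // Handlers run in registration order; the first one returning true wins.
  bool HandleFault(const FaultModel& fault) const {
    if (!fault.in_generated_code) {
      return false;
    }
    for (FaultHandlerModel* handler : generated_code_handlers_) {
      if (handler->Action(fault)) return true;
    }
    for (FaultHandlerModel* handler : other_handlers_) {
      if (handler->Action(fault)) return true;
    }
    return false;  // unhandled: the signal chain forwards it to the saved action
  }

 private:
  std::vector<FaultHandlerModel*> generated_code_handlers_;
  std::vector<FaultHandlerModel*> other_handlers_;
};
```

A fault outside generated code is rejected before any handler runs, which is why a crash in native library code never turns into a Java exception.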
Note also that each of these handlers is added conditionally. On a phone running Android 7.0, the switches have the following values:

    (gdb) p 'art::Runtime'::instance_->implicit_suspend_checks_
    $2 = false
    (gdb) p 'art::Runtime'::instance_->implicit_so_checks_
    $3 = true
    (gdb) p 'art::Runtime'::instance_->implicit_null_checks_
    $4 = true

    static constexpr bool kEnableJavaStackTraceHandler = false;

So in a real runtime environment, a SIGSEGV only needs to be tried by StackOverflowHandler and NullPointerHandler; if neither can handle it, it goes to the handler set up by the linker.

3. SuspensionHandler implementation

Each handler is implemented mainly in its Action() function. Because Action must read the current context, it is platform specific: context handling on x86 differs from ARM, and on ARM the 32-bit and 64-bit versions differ too. Here we mainly look at the 32-bit ARM implementation.

    // A suspend check is done using the following instruction sequence:
    // 0xf723c0b2: f8d902c0  ldr.w   r0, [r9, #704]  ; suspend_trigger_
    // .. some intervening instruction
    // 0xf723c0b6: 6800      ldr     r0, [r0, #0]

These comment lines describe the principle behind SuspensionHandler.

To make a thread perform a suspend check while it executes generated code, the thread's suspend_trigger_ is set to nullptr. With the sequence above, the generated code first loads suspend_trigger_ with ldr.w r0, [r9, #704] (r9 holds the Thread pointer, and 704 is the offset of suspend_trigger_ within Thread), then executes ldr r0, [r0, #0] to read through r0. When the trigger is armed, suspend_trigger_ is 0, so this load raises a SIGSEGV. SuspensionHandler::Action gets the first chance to handle it, finds a match, and jumps into the suspend check.

Conversely, when no suspend check is needed, suspend_trigger_ is set to its own address, and no SIGSEGV is raised.

Enabling and disabling the suspend trigger:

    void TriggerSuspend() {
      tlsPtr_.suspend_trigger = nullptr;
    }

    void RemoveSuspendTrigger() {
      tlsPtr_.suspend_trigger = reinterpret_cast<uintptr_t*>(&tlsPtr_.suspend_trigger);
    }

It is enabled in three cases:
  1. At the end of bool Thread::ModifySuspendCountInternal(), if the updated suspend_count is greater than 0, a suspend has been requested for the thread, and the sooner it happens the better. TriggerSuspend() is called so that the thread performs a suspend check while executing generated code and enters the suspended state.
  2. After bool Thread::RequestCheckpoint(Closure* function) successfully sets a checkpoint function on a thread, it calls TriggerSuspend(), because a checkpoint function should also run as soon as possible. Once triggered, the suspend check first looks for a checkpoint function and, if one exists, runs it immediately.
  3. bool Thread::RequestEmptyCheckpoint() also calls TriggerSuspend() on success; I have not looked into what EmptyCheckpoint is for.
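The self-pointer trick and the first arming case can be illustrated with a small model. ThreadModel, ModifySuspendCount and SecondLoadWouldFault are simplified names assumed for this sketch; the real art::Thread keeps the trigger in tlsPtr_ and does considerably more bookkeeping.

```cpp
#include <cstdint>

// Simplified model of the suspend trigger; not the real art::Thread.
class ThreadModel {
 public:
  ThreadModel() { RemoveSuspendTrigger(); }
  ThreadModel(const ThreadModel&) = delete;  // a copied self-pointer would point at the wrong object
  ThreadModel& operator=(const ThreadModel&) = delete;

  // Arm the trigger: the next implicit check dereferences a null pointer.
  void TriggerSuspend() { suspend_trigger_ = nullptr; }

  // Disarm the trigger: the field points at itself, so dereferencing it is harmless.
  void RemoveSuspendTrigger() {
    suspend_trigger_ = reinterpret_cast<uintptr_t*>(&suspend_trigger_);
  }

  // Case 1 from the list above: a positive suspend count arms the trigger.
  void ModifySuspendCount(int delta) {
    suspend_count_ += delta;
    if (suspend_count_ > 0) {
      TriggerSuspend();
    }
  }

  // Models the second load of the check, ldr r0, [r0, #0]: a null trigger faults,
  // the self-pointer reads valid memory.
  bool SecondLoadWouldFault() const { return suspend_trigger_ == nullptr; }

 private:
  uintptr_t* suspend_trigger_;
  int suspend_count_ = 0;
};
```

The design keeps the fast path to two plain loads with no compare or branch; the cost of the check is paid only when a suspend was actually requested.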

With this background, the implementation of SuspensionHandler::Action is easy to follow:

    bool SuspensionHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* info ATTRIBUTE_UNUSED,
                                   void* context) {
      // These are the instructions to check for.  The first one is the ldr r0,[r9,#xxx]
      // where xxx is the offset of the suspend trigger.
      uint32_t checkinst1 = 0xf8d90000
          + Thread::ThreadSuspendTriggerOffset<PointerSize::k32>().Int32Value();
      uint16_t checkinst2 = 0x6800;

      struct ucontext* uc = reinterpret_cast<struct ucontext*>(context);
      struct sigcontext *sc = reinterpret_cast<struct sigcontext*>(&uc->uc_mcontext);
      uint8_t* ptr2 = reinterpret_cast<uint8_t*>(sc->arm_pc);
      uint8_t* ptr1 = ptr2 - 4;
      VLOG(signals) << "checking suspend";

      uint16_t inst2 = ptr2[0] | ptr2[1] << 8;
      VLOG(signals) << "inst2: " << std::hex << inst2 << " checkinst2: " << checkinst2;
      if (inst2 != checkinst2) {
        // Second instruction is not good, not ours.
        return false;
      }

      uint8_t* limit = ptr1 - 40;   // Compiler will hoist to a max of 20 instructions.
      bool found = false;
      while (ptr1 > limit) {
        uint32_t inst1 = ((ptr1[0] | ptr1[1] << 8) << 16) | (ptr1[2] | ptr1[3] << 8);
        VLOG(signals) << "inst1: " << std::hex << inst1 << " checkinst1: " << checkinst1;
        if (inst1 == checkinst1) {
          found = true;
          break;
        }
        ptr1 -= 2;      // Min instruction size is 2 bytes.
      }
      if (found) {
        sc->arm_lr = sc->arm_pc + 3;      // +2 + 1 (for thumb)
        sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);

        // Now remove the suspend trigger that caused this fault.
        Thread::Current()->RemoveSuspendTrigger();
        VLOG(signals) << "removed suspend trigger invoking test suspend";
        return true;
      }
      return false;
    }

Action first checks whether the instruction that triggered the SIGSEGV is 0x6800 (ldr r0, [r0, #0]); only then can the fault be a suspend check. It then looks for the instruction 0xf8d902c0 (ldr.w r0, [r9, #704]) within the 40 bytes before it. As for why 40 bytes: the comments in the code say the compiler hoists the first load by at most 20 instructions and the minimum instruction size is 2 bytes. How this relates to the Thumb encoding and to the code generator that ART uses to compile Java code is something I have not studied.
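The instruction matching can be tried out on an ordinary byte buffer. The sketch below reimplements only the decoding and the backwards scan; LooksLikeSuspendCheck, ReadHalfword and ReadThumb2 are names invented here, the trigger offset is passed in as a parameter instead of coming from Thread, and the extra guard against reading before the start of the buffer is an addition that the original (which reads process memory directly) does not have.

```cpp
#include <cstddef>
#include <cstdint>

// Reads one 16-bit Thumb halfword stored little-endian.
uint16_t ReadHalfword(const uint8_t* p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Reads a 32-bit Thumb-2 instruction: the first halfword holds the high 16 bits.
uint32_t ReadThumb2(const uint8_t* p) {
  return (static_cast<uint32_t>(ReadHalfword(p)) << 16) | ReadHalfword(p + 2);
}

// Hypothetical helper. `code` is the start of a code buffer and `pc_offset` the offset
// of the faulting instruction inside it.
bool LooksLikeSuspendCheck(const uint8_t* code, size_t pc_offset, uint32_t trigger_offset) {
  const uint32_t checkinst1 = 0xf8d90000u + trigger_offset;  // ldr.w r0, [r9, #trigger_offset]
  const uint16_t checkinst2 = 0x6800;                        // ldr r0, [r0, #0]

  if (ReadHalfword(code + pc_offset) != checkinst2) {
    return false;  // the faulting instruction is not the second load
  }
  // Start 4 bytes before the fault and step back 2 bytes at a time, covering the same
  // 40-byte window as the original loop.
  for (size_t back = 4; back < 4 + 40 && back <= pc_offset; back += 2) {
    if (ReadThumb2(code + pc_offset - back) == checkinst1) {
      return true;
    }
  }
  return false;
}
```

With an offset of 704 (0x2c0) the expected first instruction is 0xf8d902c0, which matches the encoding shown in the comment of the ART source above.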

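Setting that question aside, suppose the check matches and the SIGSEGV really was caused by TriggerSuspend(). We now have to handle the signal. This is done by setting arm_pc so that execution jumps to the implicit suspend check function. Before the jump, lr is set to pc + 2 + 1 (+2 because the instruction that pc points to is 2 bytes long, +1 because execution must continue in Thumb mode after returning from the suspend check):

    sc->arm_lr = sc->arm_pc + 3;      // +2 + 1 (for thumb)
    sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);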
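The arithmetic behind the +3 can be spelled out as a one-line helper. ResumeAddressAfterSuspendCheck and the two constants are names assumed for this sketch; the example address is the one from the instruction sequence quoted earlier.

```cpp
#include <cstdint>

constexpr uint32_t kThumb16InstructionSize = 2;  // the faulting ldr r0, [r0, #0] is 2 bytes long
constexpr uint32_t kThumbModeBit = 1;            // bit 0 set means "continue in Thumb state"

// Hypothetical helper mirroring `sc->arm_lr = sc->arm_pc + 3`.
constexpr uint32_t ResumeAddressAfterSuspendCheck(uint32_t faulting_pc) {
  return faulting_pc + kThumb16InstructionSize + kThumbModeBit;
}

// For the faulting load at 0xf723c0b6 the return address is 0xf723c0b9:
// the next instruction at 0xf723c0b8, with the Thumb bit set.
static_assert(ResumeAddressAfterSuspendCheck(0xf723c0b6u) == 0xf723c0b9u,
              "return address skips the faulting load and keeps Thumb state");
```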
Next we enter the suspend check function, which is also platform specific:

    ENTRY art_quick_implicit_suspend
        mov    r0, rSELF
        SETUP_SAVE_REFS_ONLY_FRAME r1             @ save callee saves for stack crawl
        bl     artTestSuspendFromCode             @ (Thread*)
        RESTORE_SAVE_REFS_ONLY_FRAME_AND_RETURN
    END art_quick_implicit_suspend

As we can see, this jumps into artTestSuspendFromCode:

    extern "C" void artTestSuspendFromCode(Thread* self) REQUIRES_SHARED(Locks::mutator_lock_) {
      // Called when suspend count check value is 0 and thread->suspend_count_ != 0
      ScopedQuickEntrypointChecks sqec(self);
      self->CheckSuspend();
    }

which leads to Thread::CheckSuspend():

    inline void Thread::CheckSuspend() {
      DCHECK_EQ(Thread::Current(), this);
      for (;;) {
        if (ReadFlag(kCheckpointRequest)) {
          RunCheckpointFunction();
        } else if (ReadFlag(kSuspendRequest)) {
          FullSuspendCheck();
        } else if (ReadFlag(kEmptyCheckpointRequest)) {
          RunEmptyCheckpoint();
        } else {
          break;
        }
      }
    }

This function runs the checkpoint function, the suspend check and the empty checkpoint in that priority order, matching the three TriggerSuspend() cases described above.
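The priority loop is easy to exercise in isolation. In the sketch below the flag values and the ThreadFlagsModel type are assumptions made for illustration; each branch just records its name and clears its own flag, where the real functions do the actual work.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Flag values are placeholders chosen for this sketch, not ART's real bit assignments.
enum Flag : uint32_t {
  kSuspendRequest = 1,
  kCheckpointRequest = 2,
  kEmptyCheckpointRequest = 4,
};

struct ThreadFlagsModel {
  uint32_t flags = 0;
  std::vector<std::string> log;  // order in which the requests were served

  bool ReadFlag(Flag flag) const { return (flags & flag) != 0; }

  void Serve(Flag flag, const char* name) {
    log.push_back(name);
    flags &= ~static_cast<uint32_t>(flag);
  }

  // Same structure as Thread::CheckSuspend(): loop until no request flag is left,
  // always serving the highest-priority pending request first.
  void CheckSuspend() {
    for (;;) {
      if (ReadFlag(kCheckpointRequest)) {
        Serve(kCheckpointRequest, "checkpoint");
      } else if (ReadFlag(kSuspendRequest)) {
        Serve(kSuspendRequest, "suspend");
      } else if (ReadFlag(kEmptyCheckpointRequest)) {
        Serve(kEmptyCheckpointRequest, "empty_checkpoint");
      } else {
        break;
      }
    }
  }
};
```

Because the loop re-reads the flags after every step, a request that arrives while another one is being served is picked up before the thread returns to generated code.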

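That covers the overall flow of SuspensionHandler, but one question remains:

Where in generated code does this implicit suspend check sit, and when does it run? Answering that requires studying the design requirements of the implicit suspend check and how the code generator emits it. Since implicit suspend checks are not enabled, I have not pursued this for now.

Because generated code contains no implicit suspend check of this kind, the suspend checks actually in use must be explicit ones. Here is a brief note on where suspend checks occur:

Suspend checks are needed when a Java method returns, when a thread transitions to the kRunnable state, and when a kRunnable thread executes a loop (goto), a compare (if-ge) or a switch (packed-switch). In short: 1. a check is needed when a thread switches from another state to kRunnable; 2. a check is needed when a kRunnable thread executes a branch.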

1. Suspend check in interpreter mode:

   At each suspend check point, MterpSuspendCheck is called to check whether the thread needs to enter the suspended state.

    extern "C" size_t MterpSuspendCheck(Thread* self)
        REQUIRES_SHARED(Locks::mutator_lock_) {
      self->AllowThreadSuspension();
      return MterpShouldSwitchInterpreters();
    }

    inline void Thread::AllowThreadSuspension() {
      DCHECK_EQ(Thread::Current(), this);
      if (UNLIKELY(TestAllFlags())) {
        CheckSuspend();
      }
      // Invalidate the current thread's object pointers (ObjPtr) to catch possible moving GC bugs due
      // to missing handles.
      PoisonObjectPointers();
    }

CheckSuspend() was covered above.

2. Explicit suspend check in generated code:

   The insertion points serve the same purpose but differ slightly. The check code in generated code is inserted by the compiler when it compiles a Java method:

  4: void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap, java.lang.ThreadLocal$ThreadLocalMap) (dex_method_idx=3662)
    DEX CODE:
      0x0000: 7020 4d0e 1000            | invoke-direct {v0, v1}, void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap) // method@3661
      0x0003: 0e00                      | return-void
    CODE: (code_offset=0x0060aae4 size_offset=0x0060aae0 size=100)...
      0x0060aae4: d1400bf0  sub x16, sp, #0x2000 (8192)
      0x0060aae8: b940021f  ldr wzr, [x16]
        StackMap [native_pc=0x60aaec] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000000000000)
      0x0060aaec: f81b0fe0  str x0, [sp, #-80]!
      0x0060aaf0: a90357f4  stp x20, x21, [sp, #48]
      0x0060aaf4: a9047bf6  stp x22, lr, [sp, #64]
      0x0060aaf8: 79400270  ldrh w16, [tr] ; state_and_flags
      0x0060aafc: 35000190  cbnz w16, #+0x30 (addr 0x60ab2c)
      0x0060ab00: aa0303f4  mov x20, x3
      0x0060ab04: aa0103f5  mov x21, x1
      0x0060ab08: aa0203f6  mov x22, x2
      0x0060ab0c: d0ff6ac0  adrp x0, #-0x12a6000 (addr -0xc9c000)
      0x0060ab10: f9428c00  ldr x0, [x0, #1304]
      0x0060ab14: f940181e  ldr lr, [x0, #48]
      0x0060ab18: d63f03c0  blr lr
        StackMap [native_pc=0x60ab1c] (dex_pc=0x0, native_pc_offset=0x38, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x700000, stack_mask=0b00000000000000)
          v0: in register (21)  [entry 0]
          v1: in register (22)  [entry 1]
          v2: in register (20)  [entry 2]
      0x0060ab1c: a94357f4  ldp x20, x21, [sp, #48]
      0x0060ab20: a9447bf6  ldp x22, lr, [sp, #64]
      0x0060ab24: 910143ff  add sp, sp, #0x50 (80)
      0x0060ab28: d65f03c0  ret
      0x0060ab2c: a9010be1  stp x1, x2, [sp, #16]
      0x0060ab30: f90013e3  str x3, [sp, #32]
      0x0060ab34: f9426a7e  ldr lr, [tr, #1232] ; pTestSuspend
      0x0060ab38: d63f03c0  blr lr
        StackMap [native_pc=0x60ab3c] (dex_pc=0x0, native_pc_offset=0x58, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000101010000)
          v0: in stack (16) [entry 3]
          v1: in stack (24) [entry 4]
          v2: in stack (32) [entry 5]
      0x0060ab3c: a9410be1  ldp x1, x2, [sp, #16]
      0x0060ab40: f94013e3  ldr x3, [sp, #32]
      0x0060ab44: 17ffffef  b #-0x44 (addr 0x60ab00)
In this function's generated code, the thread-private state_and_flags is checked near the function entry, at 0x0060aaf8. If it is not 0, control branches to 0x0060ab2c to perform the suspend check, which calls through [tr, #1232]:

    (gdb) p (('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $2 = (void (*)(void)) 0x7f878bab10 <art_quick_test_suspend>
    (gdb) p &(('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $3 = (void (**)(void)) 0x7f87e4eed0
    (gdb) p 0x7f87e4eed0-0x7f87e4ea00
    $4 = 1232

So the call actually goes to the thread's tlsPtr_.quick_entrypoints->pTestSuspend, whose value points to the entry of art_quick_test_suspend:

    ENTRY art_quick_test_suspend
    #ifdef ARM_R4_SUSPEND_FLAG
        ldrh   rSUSPEND, [rSELF, #THREAD_FLAGS_OFFSET]
        cbnz   rSUSPEND, 1f                         @ check Thread::Current()->suspend_count_ == 0
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
        bx     lr                                   @ return if suspend_count_ == 0
    1:
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
    #endif
        SETUP_SAVE_EVERYTHING_FRAME r0              @ save everything for GC stack crawl
        mov    r0, rSELF
        bl     artTestSuspendFromCode               @ (Thread*)
        RESTORE_SAVE_EVERYTHING_FRAME
        bx     lr
    END art_quick_test_suspend

This finally reaches artTestSuspendFromCode, and from there the flow is essentially the same as for art_quick_implicit_suspend.

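4. StackOverflowHandler implementation

The existence of this handler tells us that Java stack overflow detection on Android is also implemented through SIGSEGV.

For the detailed analysis see: ART Exception Handling (2) - StackOverflowError implementation


5. NullPointerHandler implementation

If StackOverflowHandler cannot handle the SIGSEGV, NullPointerHandler tries next.

For the detailed analysis see: ART Exception Handling (3) - NullPointerException implementation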


6. JavaStackTraceHandler implementation

Here is its code:

    bool JavaStackTraceHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* siginfo, void* context) {
      // Make sure that we are in the generated code, but we may not have a dex pc.
      bool in_generated_code = manager_->IsInGeneratedCode(siginfo, context, false);
      if (in_generated_code) {
        LOG(ERROR) << "Dumping java stack trace for crash in generated code";
        ArtMethod* method = nullptr;
        uintptr_t return_pc = 0;
        uintptr_t sp = 0;
        Thread* self = Thread::Current();

        manager_->GetMethodAndReturnPcAndSp(siginfo, context, &method, &return_pc, &sp);
        // Inside of generated code, sp[0] is the method, so sp is the frame.
        self->SetTopOfStack(reinterpret_cast<ArtMethod**>(sp));
        self->DumpJavaStack(LOG_STREAM(ERROR));
      }

      return false;  // Return false since we want to propagate the fault to the main signal handler.
    }

From the implementation:
  1. Whenever a SIGSEGV occurs in generated code, it dumps the Java stack to help with analysis.
  2. Whether or not the Java stack was dumped, it returns false, so it does not consume the SIGSEGV, which is still passed on to the main signal handler.
This handler is simple. Its purpose: when a SIGSEGV occurs in generated code and none of the preceding handlers could handle it, print the Java stack trace to provide more direct information.

7. Implementation of other kinds of Java exceptions

7.1 ArrayIndexOutOfBoundsException

Here is a piece of code that checks for an out-of-bounds index when accessing array data, used to analyze how this exception is implemented.
Java code:

    public void setPropertyName(@NonNull String propertyName) {
        // mValues could be null if this is being constructed piecemeal. Just record the
        // propertyName to be used later when setValues() is called if so.
        if (mValues != null) {
            PropertyValuesHolder valuesHolder = mValues[0];
            String oldName = valuesHolder.getPropertyName();
            valuesHolder.setPropertyName(propertyName);
            mValuesMap.remove(oldName);
            mValuesMap.put(propertyName, valuesHolder);
        }
        mPropertyName = propertyName;
        // New property/values/target should cause re-initialization prior to starting
        mInitialized = false;
    }

DEX CODE:

  40: void android.animation.ObjectAnimator.setPropertyName(java.lang.String) (dex_method_idx=1461)
    DEX CODE:
      0x0000: 1203                     | const/4 v3, #+0
      0x0001: 5442 5316                | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
      0x0003: 3802 1700                | if-eqz v2, +23
      0x0005: 5442 5316                | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
      0x0007: 4601 0203                | aget-object v1, v2, v3
      0x0009: 6e10 4406 0100           | invoke-virtual {v1}, java.lang.String android.animation.PropertyValuesHolder.getPropertyName() // method@1604
      0x000c: 0c00                     | move-result-object v0
      0x000d: 6e20 7106 5100           | invoke-virtual {v1, v5}, void android.animation.PropertyValuesHolder.setPropertyName(java.lang.String) // method@1649
      0x0010: 5442 5416                | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
      0x0012: 6e20 82fc 0200           | invoke-virtual {v2, v0}, java.lang.Object java.util.HashMap.remove(java.lang.Object) // method@64642
      0x0015: 5442 5416                | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
      0x0017: 6e30 80fc 5201           | invoke-virtual {v2, v5, v1}, java.lang.Object java.util.HashMap.put(java.lang.Object, java.lang.Object) // method@64640
      0x001a: 5b45 5116                | iput-object v5, v4, Ljava/lang/String; android.animation.ObjectAnimator.mPropertyName // field@5713
      0x001c: 5c43 4f16                | iput-boolean v3, v4, Z android.animation.ObjectAnimator.mInitialized // field@5711
      0x001e: 0e00                     | return-void

QUICK CODE:

    CODE: (code_offset=0x01aef425 size_offset=0x01aef420 size=176)...
      0x01aef424: f5ad5c00  sub     r12, sp, #8192
      0x01aef428: f8dcc000  ldr.w   r12, [r12, #0]
        StackMap [native_pc=0x1aef42d] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
      0x01aef42c: e92d4de0  push    {r5, r6, r7, r8, r10, r11, lr}
      0x01aef430: b089      sub     sp, sp, #36
      0x01aef432: 9000      str     r0, [sp, #0]
      0x01aef434: f8b9c000  ldrh.w  r12, [r9, #0]  ; state_and_flags
      0x01aef438: f1bc0f00  cmp.w   r12, #0
      0x01aef43c: d13d      bne     +122 (0x01aef4ba)
      0x01aef43e: 6a4d      ldr     r5, [r1, #36]  ; r1 is this; r5 here is mValues
      0x01aef440: 2d00      cmp     r5, #0
      0x01aef442: d02a      beq     +84 (0x01aef49a)
      0x01aef444: 460f      mov     r7, r1
      0x01aef446: 4690      mov     r8, r2
      0x01aef448: 2600      movs    r6, #0  ; this 0 is the index in mValues[0]
      0x01aef44a: 68a8      ldr     r0, [r5, #8]  ; this should load the length of mValues
        StackMap [native_pc=0x1aef44d] (dex_pc=0x7, native_pc_offset=0x28, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef44c: 4286      cmp     r6, r0  ; compare index (0) with the length of mValues
      0x01aef44e: d23c      bcs     +120 (0x01aef4ca)  ; if index (0) >= length of mValues, branch to 0x01aef4ca to throw the exception
      0x01aef450: 68e9      ldr     r1, [r5, #12]
      0x01aef452: 468a      mov     r10, r1
      0x01aef454: 6808      ldr     r0, [r1, #0]
        StackMap [native_pc=0x1aef457] (dex_pc=0x9, native_pc_offset=0x32, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v1: in register (1)[entry 4]  v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef456: f8d000c0  ldr.w   r0, [r0, #192]
      0x01aef45a: f8d0e020  ldr.w   lr, [r0, #32]
      0x01aef45e: 47f0      blx     lr
        StackMap [native_pc=0x1aef461] (dex_pc=0x9, native_pc_offset=0x3c, dex_register_map_offset=0x7, inline_info_offset=0xffffffff, register_mask=0x5a0, stack_mask=0b0000000000)
          v1: in register (10)[entry 5]  v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef460: 4642      mov     r2, r8
      0x01aef462: 4651      mov     r1, r10
      0x01aef464: 4683      mov     r11, r0
      0x01aef466: 6808      ldr     r0, [r1, #0]
      0x01aef468: f8d000f0  ldr.w   r0, [r0, #240]
      0x01aef46c: f8d0e020  ldr.w   lr, [r0, #32]
      0x01aef470: 47f0      blx     lr
        StackMap [native_pc=0x1aef473] (dex_pc=0xd, native_pc_offset=0x4e, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11)[entry 6]  v1: in register (10)[entry 5]  v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef472: 6ab9      ldr     r1, [r7, #40]
      0x01aef474: 465a      mov     r2, r11
      0x01aef476: 460d      mov     r5, r1
      0x01aef478: 6808      ldr     r0, [r1, #0]
        StackMap [native_pc=0x1aef47b] (dex_pc=0x12, native_pc_offset=0x56, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v0: in register (11)[entry 6]  v1: in register (10)[entry 5]  v2: in register (1)[entry 4]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef47a: f8d000d8  ldr.w   r0, [r0, #216]
      0x01aef47e: f8d0e020  ldr.w   lr, [r0, #32]
      0x01aef482: 47f0      blx     lr
        StackMap [native_pc=0x1aef485] (dex_pc=0x12, native_pc_offset=0x60, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11)[entry 6]  v1: in register (10)[entry 5]  v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef484: 6ab9      ldr     r1, [r7, #40]
      0x01aef486: 4642      mov     r2, r8
      0x01aef488: 4653      mov     r3, r10
      0x01aef48a: 460d      mov     r5, r1
      0x01aef48c: 6808      ldr     r0, [r1, #0]
        StackMap [native_pc=0x1aef48f] (dex_pc=0x17, native_pc_offset=0x6a, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v0: in register (11)[entry 6]  v1: in register (10)[entry 5]  v2: in register (1)[entry 4]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef48e: f8d000d0  ldr.w   r0, [r0, #208]
      0x01aef492: f8d0e020  ldr.w   lr, [r0, #32]
      0x01aef496: 47f0      blx     lr
        StackMap [native_pc=0x1aef499] (dex_pc=0x17, native_pc_offset=0x74, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11)[entry 6]  v1: in register (10)[entry 5]  v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
      0x01aef498: e002      b       +4 (0x01aef4a0)
      0x01aef49a: 460f      mov     r7, r1
      0x01aef49c: 4690      mov     r8, r2
      0x01aef49e: 2600      movs    r6, #0
      0x01aef4a0: f8c78074  str.w   r8, [r7, #116]
      0x01aef4a4: f1b80f00  cmp.w   r8, #0
      0x01aef4a8: d003      beq     +6 (0x01aef4b2)
      0x01aef4aa: f8d90080  ldr.w   r0, [r9, #128]  ; card_table
      0x01aef4ae: 09f9      lsrs    r1, r7, #7
      0x01aef4b0: 5440      strb    r0, [r0, r1]
      0x01aef4b2: 767e      strb    r6, [r7, #25]
      0x01aef4b4: b009      add     sp, sp, #36
      0x01aef4b6: e8bd8de0  pop     {r5, r6, r7, r8, r10, r11, pc}
      0x01aef4ba: 9104      str     r1, [sp, #16]
      0x01aef4bc: 9205      str     r2, [sp, #20]
      0x01aef4be: f8d9e2a8  ldr.w   lr, [r9, #680]  ; pTestSuspend
      0x01aef4c2: 47f0      blx     lr
        StackMap [native_pc=0x1aef4c5] (dex_pc=0x0, native_pc_offset=0xa0, dex_register_map_offset=0x13, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000110000)
          v4: in stack (16)[entry 7]  v5: in stack (20)[entry 8]
      0x01aef4c4: 9904      ldr     r1, [sp, #16]
      0x01aef4c6: 9a05      ldr     r2, [sp, #20]
      0x01aef4c8: e7b9      b       -142 (0x01aef43e)
      0x01aef4ca: 4601      mov     r1, r0  ; pass the length of mValues as the second argument
      0x01aef4cc: 4630      mov     r0, r6  ; pass index (0) as the first argument
      0x01aef4ce: f8d9e2b0  ldr.w   lr, [r9, #688]  ; pThrowArrayBounds
      0x01aef4d2: 47f0      blx     lr  ; call pThrowArrayBounds (artThrowArrayBoundsFromCode) to throw ArrayIndexOutOfBoundsException
        StackMap [native_pc=0x1aef4d5] (dex_pc=0x7, native_pc_offset=0xb0, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v2: in register (5)[entry 0]  v3: in register (6)[entry 1]  v4: in register (7)[entry 2]  v5: in register (8)[entry 3]
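Before branching to pThrowArrayBounds, the compiled code sets up two arguments: r0 (the index) and r1 (the array size). The entrypoint itself is registered as follows: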
  qpoints->pThrowArrayBounds = art_quick_throw_array_bounds;
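The pair `ldr.w lr, [r9, #688]` / `blx lr` in the listing loads a function pointer from a fixed offset in the current thread object and calls it. The following is a minimal C++ sketch of that idea, not ART's actual data structures: `ThreadSketch`, `QuickEntryPointsSketch`, and every function name ending in `Sketch` are hypothetical, and the exception message format is chosen only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the per-thread table of quick entrypoints.
struct QuickEntryPointsSketch {
  void (*pTestSuspend)();
  void (*pThrowArrayBounds)(int index, int length);
};

// Hypothetical stand-in for the thread object that compiled ARM code reaches through r9.
struct ThreadSketch {
  int placeholder_state = 0;
  QuickEntryPointsSketch entrypoints{};
};

// Plays the role of art_quick_throw_array_bounds: it never returns normally.
[[noreturn]] inline void ThrowArrayBoundsSketch(int index, int length) {
  throw std::out_of_range("length=" + std::to_string(length) +
                          "; index=" + std::to_string(index));
}

inline void TestSuspendSketch() {}

// Mirrors `qpoints->pThrowArrayBounds = art_quick_throw_array_bounds;`.
inline void InitEntryPointsSketch(QuickEntryPointsSketch* qpoints) {
  qpoints->pTestSuspend = TestSuspendSketch;
  qpoints->pThrowArrayBounds = ThrowArrayBoundsSketch;
}

// Mirrors `ldr.w lr, [r9, #688]` then `blx lr`: fetch the slot from the
// thread object and call through it with index and length as arguments.
inline void CallThrowArrayBounds(ThreadSketch* self, int index, int length) {
  self->entrypoints.pThrowArrayBounds(index, length);
}

// Byte offset of the slot inside this sketch (688 in the ARM listing above).
inline std::size_t ThrowArrayBoundsOffsetSketch() {
  return offsetof(ThreadSketch, entrypoints) +
         offsetof(QuickEntryPointsSketch, pThrowArrayBounds);
}
```

Because the slot sits at a constant offset from the thread pointer, compiled code can reach the throw helper with a single load, without embedding an absolute address.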
    /*
     * Called by managed code to create and deliver an ArrayIndexOutOfBoundsException. Arg1 holds
     * index, arg2 holds limit.
     */
TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_array_bounds, artThrowArrayBoundsFromCode
Here is the macro:
.macro TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING c_name, cxx_name
    .extern \cxx_name
ENTRY \c_name
    SETUP_SAVE_EVERYTHING_FRAME r2  @ save all registers as basis for long jump context
    mov r2, r9                      @ pass Thread::Current
    bl  \cxx_name                   @ \cxx_name(Thread*)
END \c_name
.endm
On top of the two existing arguments, the macro adds a third one in r2, the Thread* self copied from r9, and then branches to artThrowArrayBoundsFromCode:
// Called by generated code to throw an array index out of bounds exception.
extern "C" NO_RETURN void artThrowArrayBoundsFromCode(int index, int length, Thread* self)
    REQUIRES_SHARED(Locks::mutator_lock_) {
  ScopedQuickEntrypointChecks sqec(self);
  ThrowArrayIndexOutOfBoundsException(index, length);
  self->QuickDeliverException();
}
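To tie the pieces together, here is a rough C++ model of the whole path: an inline bounds check in the "compiled" code, and a slow path that receives (index, length, self), records the exception on the thread, and never returns. It assumes the check is a single unsigned comparison; `MiniThread`, `ThrowArrayBoundsFromCodeSketch`, and `LoadElementChecked` are hypothetical names, and a C++ exception stands in for QuickDeliverException.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the thread that owns the pending exception.
struct MiniThread {
  std::string pending_exception;
};

// Models artThrowArrayBoundsFromCode(index, length, self): record the
// exception on the thread, then unwind. The C++ throw plays the role of
// self->QuickDeliverException(); the message format is illustrative.
[[noreturn]] inline void ThrowArrayBoundsFromCodeSketch(int index, int length,
                                                        MiniThread* self) {
  self->pending_exception = "ArrayIndexOutOfBoundsException: length=" +
                            std::to_string(length) +
                            "; index=" + std::to_string(index);
  throw std::out_of_range(self->pending_exception);
}

// Models the compiled fast path for an array load.
inline int LoadElementChecked(const std::vector<int>& array, int index,
                              MiniThread* self) {
  const int length = static_cast<int>(array.size());
  // One unsigned compare rejects both index < 0 and index >= length,
  // because a negative index becomes a very large unsigned value.
  if (static_cast<std::uint32_t>(index) >= static_cast<std::uint32_t>(length)) {
    ThrowArrayBoundsFromCodeSketch(index, length, self);  // slow path
  }
  return array[static_cast<std::size_t>(index)];
}
```

The fast path costs one compare and one branch that is normally not taken; all the expensive work (building the exception, unwinding) is pushed into the out-of-line helper.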

7.2 ArithmeticException

The comment in the source reads:
    /*
     * Called by managed code to create and deliver an ArithmeticException.
     */
NO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_div_zero, artThrowDivZeroFromCode
Here is an example:
Java CODE:
    public static int floorDiv(int x, int y) {
        int r = x / y;
        // if the signs are different and modulo not zero, round down
        if ((x ^ y) < 0 && (r * y != x)) {
            r--;
        }
        return r;
    }
DEX CODE:
  24: int java.lang.Math.floorDiv(int, int) (dex_method_idx=2574)
    DEX CODE:
      0x0000: 9300 0203                 | div-int v0, v2, v3
      0x0002: 9701 0203                 | xor-int v1, v2, v3
      0x0004: 3b01 0800                 | if-gez v1, +8
      0x0006: 9201 0003                 | mul-int v1, v0, v3
      0x0008: 3221 0400                 | if-eq v1, v2, +4
      0x000a: d800 00ff                 | add-int/lit8 v0, v0, #-1
      0x000c: 0f00                      | return v0
QUICK CODE:
    CODE: (code_offset=0x005d0024 size_offset=0x005d0020 size=72)...
      0x005d0024: f81f0fe0  str x0, [sp, #-16]!
      0x005d0028: f90007fe  str lr, [sp, #8]
      0x005d002c: 340001c2  cbz w2, #+0x38 (addr 0x5d0064)
      0x005d0030: 1ac20c20  sdiv w0, w1, w2
      0x005d0034: 4a020023  eor w3, w1, w2
      0x005d0038: 36f80103  tbz w3, #31, #+0x20 (addr 0x5d0058)
      0x005d003c: 1b007c42  mul w2, w2, w0
      0x005d0040: 6b02003f  cmp w1, w2
      0x005d0044: 1a9f17e1  cset w1, eq
      0x005d0048: 51000402  sub w2, w0, #0x1 (1)
      0x005d004c: 7100003f  cmp w1, #0x0 (0)
      0x005d0050: 1a821003  csel w3, w0, w2, ne
      0x005d0054: aa0303e0  mov x0, x3
      0x005d0058: f94007fe  ldr lr, [sp, #8]
      0x005d005c: 910043ff  add sp, sp, #0x10 (16)
      0x005d0060: d65f03c0  ret
      0x005d0064: f942767e  ldr lr, [tr, #1256] ; pThrowDivZero
      0x005d0068: d63f03c0  blr lr
        StackMap [native_pc=0x5d006c] (dex_pc=0x0, native_pc_offset=0x48, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b000000)
          v2: in register (1)   [entry 0]
          v3: in register (2)   [entry 1]
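As the listing shows, the compiler inserts a zero-divisor check in front of the division: cbz w2 tests the divisor and, if it is 0, branches to 0x5d0064, where pThrowDivZero is loaded from the thread register and called; otherwise execution falls through to sdiv. The equivalent check in interpreter mode is not covered here.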
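The same control flow can be written out in C++ as a minimal sketch. `ThrowDivZeroSketch` is a hypothetical stand-in for the pThrowDivZero slow path, the exception type and message are illustrative only, and the INT_MIN / -1 special case is added because that division is undefined in C++ whereas Java defines its result.

```cpp
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for the pThrowDivZero slow path; never returns.
[[noreturn]] inline void ThrowDivZeroSketch() {
  throw std::domain_error("divide by zero");
}

// Models the compiled floorDiv: explicit zero test first, then the division.
inline int FloorDivChecked(int x, int y) {
  if (y == 0) {  // corresponds to: cbz w2, <slow path>
    ThrowDivZeroSketch();
  }
  // Java defines Integer.MIN_VALUE / -1 as Integer.MIN_VALUE; in C++ this
  // division overflows, so handle it before dividing.
  if (x == std::numeric_limits<int>::min() && y == -1) {
    return x;
  }
  int r = x / y;  // corresponds to: sdiv w0, w1, w2
  // If the signs differ and the division was inexact, round down.
  if ((x ^ y) < 0 && r * y != x) {
    r--;
  }
  return r;
}
```

For instance, FloorDivChecked(-7, 2) truncates to -3 and is then rounded down to -4, while FloorDivChecked(1, 0) takes the slow path and throws.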

7.3 StringIndexOutOfBoundsException

This works the same way as ArrayIndexOutOfBoundsException above:
    /*
     * Called by managed code to create and deliver a StringIndexOutOfBoundsException
     * as if thrown from a call to String.charAt(). Arg1 holds index, arg2 holds limit.
     */
TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_string_bounds, artThrowStringBoundsFromCode

9. Implementation of Throw & Catch

throw-catch-finally: see ART Exception Handling Mechanism (4) - Implementation of throw & catch & finally