JVM Internal

Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 10:47

Starting with threads

In the HotSpot JVM there is a direct mapping between a Java thread and a native operating system thread. After all of the state for a Java thread has been prepared (thread-local storage, allocation buffers, synchronization objects, stacks and the program counter), the native thread is created. Once the native thread has initialized, it invokes the run() method of the Java thread. When run() returns, uncaught exceptions are handled, then the native thread checks whether the JVM needs to be terminated as a result of the thread terminating (i.e. whether it was the last non-daemon thread). The native thread is reclaimed once the Java thread terminates.
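The lifecycle described above can be observed from plain Java. The following is a minimal sketch, not HotSpot's internal logic: the class name ThreadLifecycleDemo and the thread name "demo-worker" are made up for illustration. It starts a daemon thread whose run() throws, and records what the uncaught-exception handler sees before the thread terminates.

```java
import java.util.concurrent.atomic.AtomicReference;

class ThreadLifecycleDemo {
    /** Starts a thread whose run() throws and returns what the uncaught-exception handler saw. */
    static String runAndCapture() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "demo-worker");
        // A daemon thread does not keep the JVM alive after the last non-daemon thread ends.
        worker.setDaemon(true);
        // Called on the dying thread after run() exits with an uncaught exception.
        worker.setUncaughtExceptionHandler(
                (thread, error) -> seen.set(thread.getName() + ": " + error.getMessage()));
        worker.start(); // the JVM creates the underlying native thread here
        try {
            worker.join(); // returns once the thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCapture()); // prints "demo-worker: boom"
    }
}
```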


JVM System Threads

1 VM thread
This thread performs operations that require the JVM to reach a safe-point, such as "stop-the-world" garbage collections, thread stack dumps, thread suspension and biased locking revocation.

2 Periodic task thread
This thread is responsible for timer events (i.e. interrupts) that are used to schedule execution of periodic operations.

3 GC threads
These threads support the different types of garbage collection activities that occur in the JVM.

4 Compiler threads
These threads compile byte code to native code at runtime.

5 Signal dispatcher thread
This thread receives signals sent to the JVM process and handles them inside the JVM by calling the appropriate JVM methods.
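Some of these system threads can be seen from inside a running program. The sketch below (the class name VisibleThreads is hypothetical) lists the threads that the standard Thread.getAllStackTraces() call reports. Which names appear depends on the JVM and its version; on HotSpot a "Signal Dispatcher" entry is commonly listed, while the VM thread and GC threads are generally not exposed through this API and are easier to find in a thread dump.

```java
import java.util.Set;
import java.util.TreeSet;

class VisibleThreads {
    /** Returns the names of all live threads visible to Java code, sorted alphabetically. */
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName() + (t.isDaemon() ? " (daemon)" : ""));
        }
        return names;
    }

    public static void main(String[] args) {
        // The exact list is JVM-specific, so no expected output is given here.
        liveThreadNames().forEach(System.out::println);
    }
}
```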

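Each thread has its own program counter (PC), thread-local storage and stack; each frame on the stack holds a local variable array, an operand stack and a reference used for dynamic linking.

1 Program counter (PC)

The PC holds the address of the current instruction (or opcode) unless the current method is native, in which case the PC is undefined.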

2 Stack

Each thread has its own stack that holds a frame for each method executing on that thread.


3 Native stack

Only JVMs that support native methods (for example through JNI calls) give their threads a native stack. When a native method calls back into a Java method, a frame is created on the thread's normal Java stack.

If a JVM has been implemented using a C-linkage model for the Java Native Interface (JNI), then the native stack will be a C stack. A native method can typically (depending on the JVM implementation) call back into the JVM and invoke a Java method. Such a native-to-Java invocation occurs on the normal Java stack: the thread leaves the native stack and creates a new frame on the Java stack.


4 Stack limits

If a thread requires a larger stack than allowed, a StackOverflowError is thrown. If a thread requires a new frame and there isn't enough memory to allocate it, an OutOfMemoryError is thrown.
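The first of these limits is easy to provoke. This is a rough sketch (the class name StackDepthProbe is made up) that recurses until the JVM throws StackOverflowError and reports how many frames were pushed. The number varies with the JVM, the configured thread stack size and the size of each frame, so no particular value should be expected.

```java
class StackDepthProbe {
    private static int depth;

    // Each call pushes one more frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until the stack limit is hit and returns the depth that was reached. */
    static int maxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The thread needed a larger stack than it was allowed to have.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames pushed before StackOverflowError: " + maxDepth());
    }
}
```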


5 Frames

A new frame is created and pushed onto the top of the stack each time a method is invoked.

Each frame contains:

  • 5.1 Local variable array
  • The local variable array holds a reference to this, all method parameters and other locally defined variables.
  • A local variable can be of type int, long, float, double, boolean, char, short, byte, reference or returnAddress.

  • Return value

  • 5.2 Operand stack
  • The operand stack is used during the execution of byte code instructions in much the same way that general-purpose registers are used in a native CPU. Most byte code manipulates the operand stack by pushing, popping, duplicating, swapping, or executing operations that produce or consume values. As a result, instructions that move values between the array of local variables and the operand stack are very frequent in byte code.
For example:

int i = 0;

Gets compiled to the following byte code:

 0: iconst_0   // Push 0 to top of the operand stack
 1: istore_1   // Pop value from top of operand stack and store as local variable 1
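To make the traffic between the operand stack and the local variable array concrete, here is a toy model of a frame. It is a sketch for illustration only: TinyFrame and its execute method are invented names, it understands just five int opcodes, and it ignores everything else a real JVM frame does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyFrame {
    final int[] locals = new int[4];                          // local variable array
    final Deque<Integer> operandStack = new ArrayDeque<>();   // operand stack

    /** Interprets a handful of int opcodes, given by mnemonic. */
    void execute(String... instructions) {
        for (String insn : instructions) {
            switch (insn) {
                case "iconst_0": operandStack.push(0); break;
                case "iconst_1": operandStack.push(1); break;
                case "iload_1":  operandStack.push(locals[1]); break;
                case "istore_1": locals[1] = operandStack.pop(); break;
                case "iadd":     operandStack.push(operandStack.pop() + operandStack.pop()); break;
                default: throw new IllegalArgumentException("unsupported opcode: " + insn);
            }
        }
    }

    public static void main(String[] args) {
        TinyFrame frame = new TinyFrame();
        frame.execute("iconst_0", "istore_1");                     // int i = 0;
        frame.execute("iload_1", "iconst_1", "iadd", "istore_1");  // roughly: i = i + 1;
        System.out.println(frame.locals[1]);                       // prints 1
    }
}
```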

5.3 Dynamic linking

Each frame contains a reference to the runtime constant pool. The reference points to the constant pool for the class of the method being executed for that frame. This reference helps to support dynamic linking.

  • C/C++ code is typically compiled to an object file, then multiple object files are linked together to produce a usable artifact such as an executable or dll. During the linking phase, symbolic references in each object file are replaced with an actual memory address relative to the final executable. In Java this linking phase is done dynamically at runtime.

When a Java class is compiled, all references to variables and methods are stored in the class's constant pool as symbolic references. A symbolic reference is a logical reference, not one that actually points to a physical memory location. The JVM implementation can choose when to resolve symbolic references: either when the class file is verified after being loaded (eager or static resolution), or when the symbolic reference is used for the first time (lazy or late resolution). However, the JVM has to behave as if the resolution occurred when each reference is first used, and throw any resolution errors at that point. Binding is the process by which the field, method or class identified by the symbolic reference is replaced by a direct reference; this happens only once because the symbolic reference is completely replaced. If the symbolic reference refers to a class that has not yet been resolved, then that class will be loaded. Each direct reference is stored as an offset against the storage structure associated with the runtime location of the variable or method.
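Resolution happens inside the JVM and cannot be watched directly from Java code, but the idea of "resolve lazily, bind once" can be modelled. The sketch below is an analogy only: SymbolicMethodRef is an invented class that uses reflection as a stand-in for the JVM's own lookup, and a java.lang.reflect.Method as a stand-in for a direct reference.

```java
import java.lang.reflect.Method;

class SymbolicMethodRef {
    private final String className;   // symbolic: just names, no memory location
    private final String methodName;
    private Method direct;            // stand-in for the direct reference; null until first use
    private int resolutions;

    SymbolicMethodRef(String className, String methodName) {
        this.className = className;
        this.methodName = methodName;
    }

    /** Resolves on first use (loading the class if necessary) and reuses the result afterwards. */
    synchronized Method resolve() {
        if (direct == null) {
            try {
                direct = Class.forName(className).getMethod(methodName);
            } catch (ReflectiveOperationException e) {
                // Resolution errors surface at the point of first use.
                throw new LinkageError("cannot resolve " + className + "." + methodName, e);
            }
            resolutions++;
        }
        return direct;
    }

    synchronized int resolutionCount() {
        return resolutions;
    }

    public static void main(String[] args) throws Exception {
        SymbolicMethodRef ref = new SymbolicMethodRef("java.lang.String", "length");
        System.out.println(ref.resolve().invoke("hello")); // prints 5
        ref.resolve();
        System.out.println(ref.resolutionCount());         // prints 1
    }
}
```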


Shared between threads

Heap

The heap is used to allocate class instances and arrays at runtime. Arrays and objects can never be stored on the stack because a frame is not designed to change in size after it has been created. The frame only stores references that point to objects or arrays on the heap.
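A small example shows what "the frame only stores references" means in practice. In this sketch (ReferenceInFrame is a made-up name) two local variables and a method parameter all refer to the same array on the heap, so a write through one of them is visible through the others.

```java
class ReferenceInFrame {
    // 'target' receives a copy of the reference, not a copy of the array.
    static void fill(int[] target) {
        target[0] = 42;
    }

    static int demo() {
        int[] data = new int[1];   // the array lives on the heap; 'data' holds a reference
        int[] alias = data;        // a second local variable pointing at the same array
        fill(alias);
        return data[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```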

To support garbage collection the heap is divided into three sections:

  • Young Generation
    • Often split between Eden and Survivor
  • Old Generation (also called Tenured Generation)
  • Permanent Generation

Garbage collection typically works as follows:

  1. New objects and arrays are created in the young generation.
  2. Minor garbage collection operates in the young generation. Objects that are still alive are moved from the eden space to the survivor space.
  3. Major garbage collection, which typically causes the application threads to pause, moves objects between generations. Objects that are still alive are moved from the young generation to the old (tenured) generation.
  4. The permanent generation is collected every time the old generation is collected. They are both collected when either becomes full.
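The collectors behind the minor and major collections can be inspected through the standard java.lang.management API. This is a minimal sketch (GcReport is an invented name); the collector names, the pools they manage and the counts all depend on the JVM and on which garbage collector is selected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcReport {
    /** Returns one line per collector: its name, collections so far, and the pools it manages. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", pools=" + String.join(", ", gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Output is JVM- and collector-specific.
        describeCollectors().forEach(System.out::println);
    }
}
```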

Non-Heap Memory

The non-heap memory includes:

  • Permanent Generation that contains
    • the method area
    • interned strings
  • Code Cache used for compilation and storage of methods that have been compiled to native code by the JIT compiler
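The split between heap and non-heap memory is visible through the memory pool beans of java.lang.management. The sketch below (MemoryPools is a made-up name) groups the pool names by type. Pool names are implementation-specific; on HotSpot releases from Java 8 onward you would likely see a Metaspace pool where older releases reported a permanent generation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class MemoryPools {
    /** Groups the JVM's memory pool names by HEAP / NON_HEAP. */
    static Map<MemoryType, List<String>> poolsByType() {
        Map<MemoryType, List<String>> byType = new EnumMap<>(MemoryType.class);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            byType.computeIfAbsent(pool.getType(), k -> new ArrayList<>()).add(pool.getName());
        }
        return byType;
    }

    public static void main(String[] args) {
        // Output is JVM-specific, e.g. eden/survivor/old pools under HEAP and a code cache under NON_HEAP.
        poolsByType().forEach((type, names) -> System.out.println(type + ": " + names));
    }
}
```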

JIT compilation

Java byte code is interpreted, which is not very efficient. To improve performance, the Oracle HotSpot VM looks for "hot" areas of byte code that are executed regularly and compiles these to native code. The native code is then stored in the code cache in non-heap memory. In this way the HotSpot VM tries to choose the most appropriate way to trade off the extra time it takes to compile code versus the extra time it takes to execute interpreted code.
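Whether a JIT compiler is present, and roughly how much time it has spent compiling, can be queried through the standard CompilationMXBean. The sketch below (JitProbe is an invented name) runs a small method many times to make it "hot" and then prints the compiler's summary. The loop count is an arbitrary assumption, and the reported name and time vary between JVMs.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitProbe {
    // A small, frequently called method is a typical candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    /** Describes the JIT compiler, if the JVM reports one. */
    static String jitSummary() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean(); // null if there is no compiler
        if (jit == null) {
            return "no JIT compiler reported";
        }
        String time = jit.isCompilationTimeMonitoringSupported()
                ? jit.getTotalCompilationTime() + " ms"
                : "n/a";
        return jit.getName() + ", total compilation time: " + time;
    }

    public static void main(String[] args) {
        long checksum = 0;
        for (int i = 0; i < 20_000; i++) {
            checksum += sumOfSquares(1_000);
        }
        System.out.println("checksum: " + checksum);
        System.out.println(jitSummary());
    }
}
```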


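"Method area" is a slightly misleading name, because what it stores is not just methods but per-class information: the class loader reference, the runtime constant pool, field data, method data and method code.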

Method Area

The method area stores per-class information such as:

  • Classloader Reference
  • Run Time Constant Pool
    • Numeric constants
    • Field references
    • Method References
    • Attributes
  • Field data
    • Per field
      • Name
      • Type
      • Modifiers
      • Attributes
  • Method data
    • Per method
      • Name
      • Return Type
      • Parameter Types (in order)
      • Modifiers
      • Attributes
  • Method code
    • Per method
      • Bytecodes
      • Operand stack size
      • Local variable size
      • Local variable table
      • Exception table
        • Per exception handler
          • Start point
          • End point
          • PC offset for handler code
          • Constant pool index for exception class being caught

All threads share the same method area, so access to the method area data and the process of dynamic linking must be thread safe. If two threads attempt to access a field or method on a class that has not yet been loaded, it must only be loaded once and both threads must not continue execution until it has been loaded.
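A closely related guarantee can be checked from Java code: a class's static initializer runs exactly once, even when several threads touch the class at the same moment. The sketch below (InitOnceDemo and Lazy are made-up names) demonstrates class initialization rather than loading itself, which is not directly observable, but it relies on the same once-only, thread-safe behaviour.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class InitOnceDemo {
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    static class Lazy {
        // Assigned in the static block, so it is not a compile-time constant
        // and reading it triggers class initialization.
        static final int VALUE;
        static {
            INIT_COUNT.incrementAndGet();
            VALUE = 7;
        }
    }

    /** Lets two threads race to use Lazy and returns how often its initializer ran. */
    static int raceToInitialize() {
        CountDownLatch start = new CountDownLatch(1);
        Runnable touch = () -> {
            try {
                start.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            int ignored = Lazy.VALUE; // first use by either thread initializes the class
        };
        Thread a = new Thread(touch);
        Thread b = new Thread(touch);
        a.start();
        b.start();
        start.countDown();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return INIT_COUNT.get();
    }

    public static void main(String[] args) {
        System.out.println(raceToInitialize()); // prints 1
    }
}
```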

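Class File Structure

A compiled class file has the following structure. I touched on this in an earlier post (linked in the original); here it is covered in more detail.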

ClassFile {
    u4              magic;
    u2              minor_version;
    u2              major_version;    // JDK version number
    u2              constant_pool_count;
    cp_info         constant_pool[constant_pool_count - 1];
    u2              access_flags;
    u2              this_class;       // providing the fully qualified name of this class
    u2              super_class;
    u2              interfaces_count;
    u2              interfaces[interfaces_count];
    u2              fields_count;
    field_info      fields[fields_count];
    u2              methods_count;
    method_info     methods[methods_count];
    u2              attributes_count;
    attribute_info  attributes[attributes_count];
}
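The first few fields of this structure can be read with nothing more than a DataInputStream, since class files are big-endian. The following sketch (ClassFileHeader is an invented helper, not part of any library) reads magic, minor_version, major_version and constant_pool_count from a class file found on the class path or in the JDK's own classes; it assumes the resource path passed in resolves through the system class loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    final int magic;
    final int minor;
    final int major;
    final int constantPoolCount;

    private ClassFileHeader(int magic, int minor, int major, int constantPoolCount) {
        this.magic = magic;
        this.minor = minor;
        this.major = major;
        this.constantPoolCount = constantPoolCount;
    }

    /** Reads the leading fields of a class file, e.g. "java/lang/Object.class". */
    static ClassFileHeader read(String resourcePath) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (raw == null) {
                throw new IllegalArgumentException("class file not found: " + resourcePath);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();             // u4 magic
            int minor = in.readUnsignedShort();   // u2 minor_version
            int major = in.readUnsignedShort();   // u2 major_version
            int cpCount = in.readUnsignedShort(); // u2 constant_pool_count
            return new ClassFileHeader(magic, minor, major, cpCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = read("java/lang/Object.class");
        System.out.println("magic: " + Integer.toHexString(h.magic)); // prints "magic: cafebabe"
        System.out.println("version: " + h.major + "." + h.minor);    // depends on the JDK in use
        System.out.println("constant_pool_count: " + h.constantPoolCount);
    }
}
```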


javap shows the byte code of a compiled class file.

For example, if you compile the following simple class:

package org.jvminternals;

public class SimpleClass {

    public void sayHello() {
        System.out.println("Hello");
    }

}

Then you get the following output if you run:

javap -v -p -s -sysinfo -constants classes/org/jvminternals/SimpleClass.class

public class org.jvminternals.SimpleClass
  SourceFile: "SimpleClass.java"
  minor version: 0
  major version: 51
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #6.#17         //  java/lang/Object."<init>":()V
   #2 = Fieldref           #18.#19        //  java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #20            //  "Hello"
   #4 = Methodref          #21.#22        //  java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #23            //  org/jvminternals/SimpleClass
   #6 = Class              #24            //  java/lang/Object
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               Lorg/jvminternals/SimpleClass;
  #14 = Utf8               sayHello
  #15 = Utf8               SourceFile
  #16 = Utf8               SimpleClass.java
  #17 = NameAndType        #7:#8          //  "<init>":()V
  #18 = Class              #25            //  java/lang/System
  #19 = NameAndType        #26:#27        //  out:Ljava/io/PrintStream;
  #20 = Utf8               Hello
  #21 = Class              #28            //  java/io/PrintStream
  #22 = NameAndType        #29:#30        //  println:(Ljava/lang/String;)V
  #23 = Utf8               org/jvminternals/SimpleClass
  #24 = Utf8               java/lang/Object
  #25 = Utf8               java/lang/System
  #26 = Utf8               out
  #27 = Utf8               Ljava/io/PrintStream;
  #28 = Utf8               java/io/PrintStream
  #29 = Utf8               println
  #30 = Utf8               (Ljava/lang/String;)V
{
  public org.jvminternals.SimpleClass();
    Signature: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
        0: aload_0
        1: invokespecial #1    // Method java/lang/Object."<init>":()V
        4: return
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
          0      5      0    this   Lorg/jvminternals/SimpleClass;

  public void sayHello();
    Signature: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
        0: getstatic      #2    // Field java/lang/System.out:Ljava/io/PrintStream;
        3: ldc            #3    // String "Hello"
        5: invokevirtual  #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        8: return
      LineNumberTable:
        line 6: 0
        line 7: 8
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
          0      9      0    this   Lorg/jvminternals/SimpleClass;
}

This class file shows three main sections: the constant pool, the constructor and the sayHello method.
  • Methods – each containing four areas:
    • signature and access flags
    • byte code
    • LineNumberTable – this provides information to a debugger to indicate which line corresponds to which byte code instruction, for example line 6 in the Java code corresponds to byte code 0 in the sayHello method and line 7 corresponds to byte code 8.
    • LocalVariableTable – this lists all local variables provided in the frame, in both examples the only local variable is this.
aload_0
This opcode is one of a group of opcodes with the format aload_<n>. They all load an object reference onto the operand stack. The <n> refers to the location in the local variable array that is being accessed and can only be 0, 1, 2 or 3. There are similar opcodes for loading values that are not object references: iload_<n>, lload_<n>, fload_<n> and dload_<n>, where i is for int, l is for long, f is for float and d is for double. Local variables with an index higher than 3 can be loaded using iload, lload, fload, dload and aload. These opcodes all take a single operand that specifies the index of the local variable to load.
ldc
This opcode is used to push a constant from the run time constant pool onto the operand stack.
getstatic
This opcode is used to push a static value from a static field listed in the run time constant pool onto the operand stack.
invokespecial, invokevirtual
These opcodes belong to a group of opcodes that invoke methods: invokedynamic, invokeinterface, invokespecial, invokestatic and invokevirtual. In this class file invokespecial and invokevirtual are both used. The difference between them is that invokevirtual invokes a method based on the class of the object, whereas invokespecial is used to invoke instance initialization methods as well as private methods and methods of a superclass of the current class.
return
This opcode is one of a group of opcodes: ireturn, lreturn, freturn, dreturn, areturn and return. Each of these opcodes is a typed return statement that returns a different type, where i is for int, l is for long, f is for float, d is for double and a is for an object reference. The opcode with no leading type letter, return, only returns void.
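To connect the invoke opcodes with source code, the sketch below (InvokeKinds is a made-up class) contains one call of each common kind. The opcode named in each comment is what javac typically emits for that construct; the exact output depends on the compiler version, so it is worth confirming with javap -c on your own build.

```java
import java.util.function.IntUnaryOperator;

class InvokeKinds {
    static int twice(int x) {
        return 2 * x;
    }

    int base() {
        return 1;
    }

    static class Child extends InvokeKinds {
        @Override
        int base() {
            return super.base() + 1;                   // superclass method: invokespecial
        }
    }

    static int demo() {
        InvokeKinds obj = new Child();                 // constructor <init>: invokespecial
        int viaVirtual = obj.base();                   // invokevirtual, dispatches to Child.base, gives 2
        int viaStatic = twice(viaVirtual);             // invokestatic, gives 4
        IntUnaryOperator inc = x -> x + 1;             // lambda creation: invokedynamic (javac 8 and later)
        int viaInterface = inc.applyAsInt(viaStatic);  // invokeinterface, gives 5
        return viaInterface;                           // ireturn, because the method returns an int
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```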

Reference: see the link in the original post
To be continued...