Compressed oops in the HotSpot JVM
What's an oop, and why should they be compressed?
An "oop" in HotSpot parlance is a managed pointer to an object. It is normally the same size as a native machine pointer, which means 64 bits on an LP64 system. On an ILP32 system, the maximum heap size is about 4 GB, which is not enough for many applications. On an LP64 system, though, the heap used by any given run is almost twice as big as on the corresponding ILP32 system (assuming the run fits both modes). This is due to the expanded size of managed pointers. Memory is pretty cheap, but these days bandwidth and cache are in short supply, so doubling the size of the heap, just to get over the 4 GB limit, is painful.
(Additionally, on x86 chips, the ILP32 mode provides half the usable registers that the LP64 mode does. SPARC is not affected this way; RISC chips start out with lots of registers and just widen them for LP64 mode.)
Compressed oops represent managed pointers (in many but not all places in the JVM) as 32-bit values which must be scaled by a factor of 8 and added to a 64-bit base address to find the object they refer to. This allows applications to address up to four billion objects (not bytes), or a heap size of up to about 32 GB (2^32 object slots of 8 bytes each). At the same time, data structure compactness is competitive with ILP32 mode.
We use the term decode to express the operation by which a 32-bit compressed oop is converted into a 64-bit native address into the managed heap. The inverse operation is encoding.
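The decode and encode operations described above can be sketched in C++ as follows. This is a minimal illustration, not HotSpot's actual code: the names NarrowOop, Address, kShift, encode and decode are hypothetical, the heap base is passed in explicitly, and null handling is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not HotSpot's real declarations.
using NarrowOop = std::uint32_t;  // 32-bit compressed form
using Address   = std::uint64_t;  // 64-bit native address

constexpr unsigned kShift = 3;    // a scale factor of 8 is a left shift by 3

// Largest span reachable from the base: 2^32 slots * 8 bytes = 32 GiB.
constexpr std::uint64_t kMaxHeapBytes = (std::uint64_t{1} << 32) << kShift;

// Encode: native address -> compressed oop.
// Assumes addr is 8-byte aligned relative to heap_base and within range.
inline NarrowOop encode(Address heap_base, Address addr) {
    assert(addr >= heap_base);
    assert((addr - heap_base) % 8 == 0);
    assert(addr - heap_base < kMaxHeapBytes);
    return static_cast<NarrowOop>((addr - heap_base) >> kShift);
}

// Decode: compressed oop -> native address (scale by 8, add the base).
inline Address decode(Address heap_base, NarrowOop narrow) {
    return heap_base + (static_cast<Address>(narrow) << kShift);
}
```

With an assumed base of 0x800000000, an object 64 bytes into the heap encodes to 8, and decoding 8 yields the original address again.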
Which oops are compressed?
In an ILP32-mode JVM, or if the UseCompressedOops flag is turned off in LP64 mode, all oops are the native machine word size.
If UseCompressedOops is true, the following oops in the heap will be compressed:
- the klass field of every object
- every instance field
- every element of an oop array (objArray)
- in a constant pool... (what is the rule here??)
- (what else is compressed??)
The following oops in the heap are never compressed:
- type profile information (methodDataOops, "MDO's")
- (what else is native??)
In the interpreter, oops are never compressed. These include JVM locals and stack elements, outgoing call arguments, and return values. The interpreter eagerly decodes oops loaded from the heap, and encodes them before storing them to the heap.
Likewise, method calling sequences, either interpreted or compiled, do not deal with compressed oops.
In compiled code, oops are compressed or not according to the outcome of various optimizations. Optimized code may succeed in moving a compressed oop from one location in the managed heap to another without ever decoding it. Likewise, if the chip (e.g., x86) supports addressing modes which can be used for the decode operation, compressed oops might not be decoded even if they are used to address object fields or array elements.
Therefore, the following structures in compiled code can refer to either compressed oops or native heap addresses:
- register or spill slot contents
- oop maps (GC maps)
- debugging information (linked to oop maps)
- oops embedded directly in machine code (on non-RISC chips like x86 which allow this)
- nmethod constant section entries (including those used by relocations affecting machine code)
In the C++ code of the HotSpot JVM, the distinction between compressed and native oops is reflected in the C++ static type system. In general, oops handled by C++ code are uncompressed. In particular, C++ member functions operate as usual on receivers (this) represented by native machine words. A few functions in the JVM are overloaded to handle either compressed or native oops.
Important C++ values which are never compressed:
- C++ object pointers (this)
- handles to managed pointers (type Handle, etc.)
- JNI handles (type jobject)
The C++ code has a type called narrowOop to mark places where compressed oops are being manipulated (usually, loaded or stored).
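To illustrate how a distinct static type keeps compressed and native values apart, here is a small sketch. WideOop, SlimOop, load_oop, store_oop and kDemoHeapBase are stand-ins invented for this example; HotSpot's real oop and narrowOop declarations and accessors differ.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-ins only; not HotSpot's real types.
struct WideOop { std::uint64_t addr; };  // native-sized managed pointer
struct SlimOop { std::uint32_t bits; };  // compressed form (cf. narrowOop)

constexpr std::uint64_t kDemoHeapBase = 0x100000000ULL;  // assumed heap base

// Overloaded loads: the compiler selects by the static type of the slot,
// so a compressed value cannot be mistaken for a native address.
inline WideOop load_oop(const WideOop* slot) { return *slot; }

inline WideOop load_oop(const SlimOop* slot) {
    if (slot->bits == 0) return WideOop{0};  // compressed null -> native null
    return WideOop{kDemoHeapBase + (std::uint64_t{slot->bits} << 3)};
}

// Storing into a compressed slot encodes on the way out.
inline void store_oop(SlimOop* slot, WideOop value) {
    if (value.addr == 0) { slot->bits = 0; return; }
    assert(value.addr > kDemoHeapBase);
    assert((value.addr - kDemoHeapBase) % 8 == 0);
    slot->bits = static_cast<std::uint32_t>((value.addr - kDemoHeapBase) >> 3);
}
```

Callers always receive a WideOop, which mirrors the idea that most C++ code works on uncompressed values and only the load and store sites see the narrow type.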
Using addressing modes for decompression
Here is an example of an x86 instruction sequence that uses compressed oops:
! int R8; oop[] R9;  // R9 is 64 bits
! oop R10 = R9[R8];  // R10 is 32 bits
! load compressed ptr from wide base ptr:
movl R10, [R9 + R8<<2 + 16]
! klassOop R11 = R10._klass;  // R11 is 32 bits
! void* const R12 = GetHeapBase();
! load compressed klass ptr from compressed base ptr:
movl R11, [R12 + R10<<3 + 8]
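The effective addresses formed by the two movl instructions above can be written out as plain arithmetic. The function names are hypothetical; the constants 16 and 8 are simply the displacements that appear in the listing (array element base and klass field offset), and the input values in the demo are made up.

```cpp
#include <cassert>
#include <cstdint>

// [R9 + R8<<2 + 16]: address of a 32-bit compressed element in an oop array.
// array_base is a native (already decoded) 64-bit pointer to the array.
inline std::uint64_t element_address(std::uint64_t array_base,
                                     std::uint64_t index) {
    return array_base + (index << 2) + 16;
}

// [R12 + R10<<3 + 8]: address of the klass field of the object named by a
// compressed oop. The scaled-index addressing mode performs the decode.
inline std::uint64_t klass_field_address(std::uint64_t heap_base,
                                         std::uint32_t narrow_oop) {
    return heap_base + (static_cast<std::uint64_t>(narrow_oop) << 3) + 8;
}

// Small worked example with made-up inputs.
inline void addressing_demo() {
    assert(element_address(0x1000, 3) == 0x101C);  // 0x1000 + 12 + 16
    assert(klass_field_address(0x800000000ULL, 2) == 0x800000018ULL);
}
```

Note that the second load never materializes the decoded 64-bit address of the object in a register; the shift and add happen inside the address computation.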
Null processing
A 32-bit zero value decodes into a 64-bit native null value. This requires an awkward special path in the decoding logic, to the point where it is profitable to statically note which compressed oops (like klass fields) are guaranteed never to be null, and use a simpler version of the full decode or encode operation.
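The difference between the full and the simplified operations can be sketched as below. The function names are hypothetical and the heap base is an explicit parameter. The sketch also assumes that no object is placed exactly at the heap base, since such an object would otherwise encode to zero and collide with null.

```cpp
#include <cassert>
#include <cstdint>

// Full decode: needs a branch so that compressed 0 becomes native null.
inline std::uint64_t decode_maybe_null(std::uint64_t heap_base,
                                       std::uint32_t narrow) {
    if (narrow == 0) return 0;  // the awkward special path
    return heap_base + (std::uint64_t{narrow} << 3);
}

// Simplified decode for values statically known to be non-null
// (for example a klass field): just a shift and an add.
inline std::uint64_t decode_not_null(std::uint64_t heap_base,
                                     std::uint32_t narrow) {
    assert(narrow != 0);
    return heap_base + (std::uint64_t{narrow} << 3);
}

// Full encode: native null must map to compressed 0.
// Assumption of this sketch: no object lives exactly at heap_base.
inline std::uint32_t encode_maybe_null(std::uint64_t heap_base,
                                       std::uint64_t addr) {
    if (addr == 0) return 0;
    assert(addr > heap_base && (addr - heap_base) % 8 == 0);
    return static_cast<std::uint32_t>((addr - heap_base) >> 3);
}
```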
Implicit null checks are crucial to JVM performance, in both interpreted and compiled bytecodes. A memory reference which uses a short-enough offset on a base pointer is sure to provoke a trap or signal of some sort if the base pointer is null, because the first page or so of virtual address space is not mapped.
We can sometimes use a similar trick with compressed oops, by unmapping the first page or so of the virtual addresses used by the managed heap. The idea is that, if a compressed null is ever decoded (by shifting and adding to the heap base), it can be used for a load or store operation, and the code still enjoys an implicit null check.
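A rough model of this trick: if the decode skips the null test, a compressed null turns into the heap base itself, and any access at a small offset from it lands in the unmapped region. The sketch below only models the address arithmetic. The 4096-byte protected size and all names are assumptions made for illustration, and no real memory protection is involved.

```cpp
#include <cassert>
#include <cstdint>

// Assumed size of the unmapped region at the start of the heap's address range.
constexpr std::uint64_t kProtectedBytes = 4096;

// Decode without a null test: compressed 0 becomes heap_base, not native null.
inline std::uint64_t decode_unchecked(std::uint64_t heap_base,
                                      std::uint32_t narrow) {
    return heap_base + (std::uint64_t{narrow} << 3);
}

// Models whether a load or store at (decoded oop + field_offset) would hit
// the unmapped region and therefore raise a trap or signal.
inline bool access_traps(std::uint64_t heap_base, std::uint32_t narrow,
                         std::uint64_t field_offset) {
    const std::uint64_t addr = decode_unchecked(heap_base, narrow) + field_offset;
    return addr >= heap_base && addr < heap_base + kProtectedBytes;
}

// Worked example with a made-up heap base.
inline void implicit_null_check_demo() {
    const std::uint64_t base = 0x800000000ULL;
    assert(access_traps(base, 0, 8));      // null + small offset: traps
    assert(!access_traps(base, 0, 8192));  // offset too large: no trap
    assert(!access_traps(base, 1024, 8));  // ordinary object: no trap
}
```

The second case in the demo shows why the trick only works "sometimes": an offset beyond the protected region would slip past the implicit check.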
Object header layout
An object header consists of a native-sized mark word, a klass word, a 32-bit length word (if the object is an array), and a 32-bit gap (if required by alignment rules). The header is followed by zero or more instance fields, array elements, or metadata fields. (Interesting Trivia: Klass metaobjects contain a C++ vtable immediately after the klass word.)
The gap field, if it exists, is often available to store instance fields.
If UseCompressedOops is false (and always on ILP32 systems), the mark and klass are both native machine words. For arrays, the gap is always present on LP64 systems, and only on arrays with 64-bit elements on ILP32 systems.
If UseCompressedOops is true, the klass is 32 bits. Non-arrays have a gap field immediately after the klass, while arrays store the length field immediately after the klass.
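The layout rules above can be summarized as a small offset calculator. This is only a sketch of the rules as stated in this section: HeaderLayout and header_layout are hypothetical names, offsets are in bytes, and -1 marks a word that is absent.

```cpp
#include <cassert>

struct HeaderLayout {
    int klass_offset;   // the klass word follows the mark word
    int length_offset;  // -1 when the object is not an array
    int gap_offset;     // -1 when there is no gap
    int end_offset;     // first byte past the header, gap included
};

// lp64: 64-bit native words; compressed: UseCompressedOops (LP64 only);
// wide_elements: array of 64-bit elements (matters only on ILP32).
inline HeaderLayout header_layout(bool lp64, bool compressed,
                                  bool is_array, bool wide_elements) {
    assert(lp64 || !compressed);  // compressed oops exist only in LP64 mode
    const int word = lp64 ? 8 : 4;
    const int klass_size = compressed ? 4 : word;
    HeaderLayout h{word, -1, -1, word + klass_size};
    if (is_array) {               // 32-bit length right after the klass
        h.length_offset = h.end_offset;
        h.end_offset += 4;
    }
    bool needs_gap;
    if (compressed)  needs_gap = !is_array;                // after 32-bit klass
    else if (lp64)   needs_gap = is_array;                 // after length word
    else             needs_gap = is_array && wide_elements;
    if (needs_gap) {
        h.gap_offset = h.end_offset;
        h.end_offset += 4;
    }
    return h;
}
```

For example, with compressed oops a non-array object gets its klass at offset 8 and a gap at offset 12, while an array stores its length at offset 12 and its elements start at offset 16, which matches the +8 and +16 displacements in the x86 listing earlier.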