Direct method handles

A direct method handle represents a particular named method, field, or constructor, with no transformations.

Direct method handles are obtained in one of two ways: from the constant pool (via a CONSTANT_MethodHandle CP entry), or from a corresponding factory method on MethodHandles.Lookup. Every kind of DMH can be obtained either way.

As defined by the JVMS, a CONSTANT_MethodHandle CP entry can refer to a CONSTANT_Methodref as if it were to be invoked via an instruction of type invokestatic, invokevirtual, invokeinterface, or invokespecial.

The invokespecial mode cannot refer to a constructor, because this would be an unsafe use of that constructor. However, a special kind of CONSTANT_MethodHandle entry can refer to a new instruction followed by a dup and an invokespecial of the constructor.

A CONSTANT_MethodHandle CP entry can also refer to a CONSTANT_Fieldref as if it were to be invoked via an instruction of type getfield, getstatic, putfield, or putstatic.

Thus there are nine different "reference kinds" that a CONSTANT_MethodHandle constant may have, and these are the nine different kinds of direct method handle. Each kind of method handle encapsulates an access-checked, resolved symbolic reference to a method or a field. The capabilities and semantics of the method handle are identical to those of the corresponding JVM instructions, as applied to the underlying symbolic reference.
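The correspondence between the nine reference kinds and the Lookup factory methods can be observed from ordinary Java code. The following is a minimal sketch, not part of the original notes: the class RefKindDemo and its members are hypothetical, and it relies on the public Lookup.revealDirect and MethodHandleInfo.referenceKindToString calls to report the kind of each handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleInfo;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; its members exist only to give each lookup a target.
class RefKindDemo {
    int field;
    static int staticField;

    RefKindDemo() {}
    String virtualMethod() { return "v"; }
    static String staticMethod() { return "s"; }
    @Override public String toString() { return "RefKindDemo"; }

    // Creates one direct method handle per reference kind and reports
    // "number:name" for each, in the order the handles are created.
    static List<String> kindNames() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            Class<?> c = RefKindDemo.class;
            MethodType str = MethodType.methodType(String.class);
            MethodType unit = MethodType.methodType(void.class);
            MethodHandle[] handles = {
                l.findGetter(c, "field", int.class),              // getfield
                l.findStaticGetter(c, "staticField", int.class),  // getstatic
                l.findSetter(c, "field", int.class),              // putfield
                l.findStaticSetter(c, "staticField", int.class),  // putstatic
                l.findVirtual(c, "virtualMethod", str),           // invokevirtual
                l.findStatic(c, "staticMethod", str),             // invokestatic
                l.findSpecial(Object.class, "toString", str, c),  // invokespecial
                l.findConstructor(c, unit),                       // new + dup + invokespecial
                l.findVirtual(Runnable.class, "run", unit),       // invokeinterface
            };
            List<String> names = new ArrayList<>();
            for (MethodHandle mh : handles) {
                int kind = l.revealDirect(mh).getReferenceKind();
                names.add(kind + ":" + MethodHandleInfo.referenceKindToString(kind));
            }
            return names;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling RefKindDemo.kindNames() should list the kinds in JVMS numeric order, from getField through invokeInterface.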

The class MethodHandles.Lookup provides a reflective API for constructing the same kinds of direct method handles on the fly, based on programmatically supplied symbolic reference data. Each lookup operation for a symbolic reference requires four parameters:

the class containing the member, as a Class mirror
the name of the member, as a String
the type signature of the member, as a MethodType (or a Class if a field)
the scope from which the lookup is performed, as a Lookup object

The last argument provides the needed starting point for the lookup, and is used to verify that the caller has permission to access the desired member.

For example, the public method String.valueOf(int) may be accessed from any lookup object using String.class as the containing class, "valueOf" as a name, and MethodType.methodType(String.class,int.class) as a type signature. (The first component of the method type is the return type, and subsequent types are arguments, omitting the "self" type.)

import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.*;
MethodType MT_valueOf = methodType(String.class, int.class);
MethodHandle MH_valueOf = lookup().findStatic(String.class, "valueOf", MT_valueOf);

// CP equiv = MethodHandle:(ref_kind:6, Methodref:(Class:String,
//      NameAndType:(UTF8:"valueOf", UTF8:"(I)Ljava/lang/String;")))

// behavioral equivalence to a CONSTANT_Methodref:
String testval_from_CP = /*invokestatic*/ String.valueOf(42);
String testval_from_MH = (String) MH_valueOf.invokeExact(42);
assert(testval_from_CP.equals(testval_from_MH));

The private field String.hash may be accessed from a lookup object with private capability on String using String.class as the containing class, "hash" as a name, and int.class as a field type. This lookup will fail with a linkage error unless the lookup object is truly derived (in a trustable manner) from the String class.

The last time anyone looked, the String class didn't have code to create any such lookup object. More realistically, if a class wants to access one of its own private fields via a direct method handle, it can create a lookup object on itself (this is always allowed) and then introspect into itself to obtain a method handle on that field, either for getting or setting. It would be up to that class to take care not to let this method handle escape to untrusted code.
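As a sketch of the "class introspects into itself" pattern just described, the following hypothetical class Counter creates a lookup object on itself and uses it to obtain getter and setter handles for its own private field. The class and method names are illustrative assumptions. The handles are kept in private fields so that they do not escape to untrusted code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical class; the handles never leave it.
final class Counter {
    private int count;

    private static final MethodHandle GET_COUNT;
    private static final MethodHandle SET_COUNT;
    static {
        try {
            // A lookup created here has full access to Counter's private members.
            MethodHandles.Lookup self = MethodHandles.lookup();
            GET_COUNT = self.findGetter(Counter.class, "count", int.class);
            SET_COUNT = self.findSetter(Counter.class, "count", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Reads and writes the private field only through the direct method handles.
    int incrementViaHandles() {
        try {
            int old = (int) GET_COUNT.invokeExact(this);
            SET_COUNT.invokeExact(this, old + 1);
            return old + 1;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```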

Structure of a direct method handle

Every direct method handle is an instance of the non-public class DirectMethodHandle or a subclass. Static fields are represented by DirectMethodHandle.StaticAccessor, other fields are represented by DirectMethodHandle.Accessor, constructor references are represented by DirectMethodHandle.Constructor, references via invokespecial are represented by DirectMethodHandle.Special, and references to other methods are represented by DirectMethodHandle itself.

A DMH is an immutable object, in that all of its fields are final.

The private field DirectMethodHandle.member contains a constant MemberName, which is a low-level, unchecked symbolic reference. When a DMH is created it is given a MemberName, which must already have been resolved.

This means that the JVM has checked the reference and filled in any information needed to execute against it one or more times. The information is very much like a constant pool cache entry, and consists of a JVM-level offset and/or a Klass pointer and/or a Method pointer.

A DMH for a field contains a MemberName which has been resolved to an offset, and that offset is used with a method like Unsafe.getByte or Unsafe.putInt to get or set the field value. If the field is static, the DMH contains a staticBase parameter which will be handed to the unsafe data access; otherwise, the DMH uses the leading parameter (the nominal receiver object this) as the base address.

A DMH for a constructor contains both an instanceClass field (which is handed to Unsafe.allocateInstance) and an initMethod field (which is the subject of an unchecked invokespecial operation). The code for a constructor creates a blank instance, marshals the incoming arguments, calls the init method (a JVM-level constructor method), and returns the filled-in instance.

A DMH for a regular method simply pushes the required arguments, and performs a suitable invocation (static, virtual, interface, or special) of the method referred to by the member field.

In common with all method handles, a DMH also has constant type and form fields, specifying the MethodType and the LambdaForm behavior.

Lambda form for a method call

The behavioral differences alluded to above are all controlled by the lambda form of the DMH. For every one of the nine distinct reference kinds, and for each possible method or field type, there is a lambda form of the required type and behavior.

These lambda forms are created on the fly and cached. The cache is keyed by lambda form kind and method or field type. They could also be precomputed, at least in many cases. As in other parts of the system, to reduce combinatorial explosion, the cache is keyed by basic type, in which all reference types are erased to Object.

(There are a few other stray bits used to key the cache, such as whether a field is volatile or whether a static reference is to a class that has not yet been initialized. These stray bits could be represented by conditional code in the DMH behavior, except that it seems more efficient to customize.)
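The effect of keying on erased types can be approximated with the public MethodType.erase method. This is only a sketch: the internal basic-type computation is not public API and is assumed here to go somewhat further than erase does, and the class name BasicTypeSketch is hypothetical. Two distinct method types collapse to the same erased shape, so they could share one cache entry.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical helper; uses only the public MethodType API.
class BasicTypeSketch {
    // Two different method types that share one erased shape.
    static boolean sameErasedShape() {
        MethodType a = MethodType.methodType(int.class, String.class, List.class, double.class);
        MethodType b = MethodType.methodType(int.class, Object.class, Thread.class, double.class);
        return !a.equals(b) && a.erase().equals(b.erase());
    }

    // The erased shape as a JVM descriptor: references become Object, primitives stay.
    static String erasedDescriptor() {
        return MethodType.methodType(int.class, String.class, List.class, double.class)
                .erase()
                .toMethodDescriptorString();
    }
}
```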

The resulting lambda form code would be very type-unsafe, except that all reference values of uncertain type are explicitly check-casted. Such casts are relatively rare, since the DMH "knows" that it can only be invoked on values consistent with its advertised method-type. One place where a cast is needed is

Here is an example (taken from an actual application) of the lambda form of a DMH which performs an invokestatic call on two references and a double:

LambdaForm(a0:L,a1:L,a2:L,a3:D)=>{
    t4:L=DirectMethodHandle.internalMemberName(a0:L);
    t5:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:D,t4:L);
    t5:I}

The leading argument a0 is the DMH itself. Its member-name field is extracted into t4 and then the DMH variable a0 goes dead. The remaining arguments feed the three parameters of the static method.

The static method is invoked via a low-level, unchecked, privileged primitive called linkToStatic. The three arguments are stacked first, in the positions expected by the static method, and then the member-name t4 is appended to the argument list.

The linkToStatic method is a native function which pops off the trailing MemberName argument, consults the VM-level linkage information inside (which is of type Method* in this case), and jumps to the indicated target method.

When the target method returns, the return value (which must be an int) is captured in t5 and then returned to the invoker of the DMH.

Two strange things happen here. First, the linkToStatic method performs a tail call to its target method. While the target method executes, there may be a visible stack frame for the DMH invocation, but there will never be a stack frame for the linkToStatic method. Second, before doing the tail call, linkToStatic pops its last argument, so that the target method will not have to deal with it.

There is one other strange thing about this whole scenario: The target method is never named. It is simply jumped to, indirectly, using the Method* pointer stuffed by the JVM into the MemberName, when it was resolved. Because of this, the lambda form above works equally well with (and can be shared by) any static method that takes two references and a double, and returns an int.
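From the Java side, handles of this shape look like the following sketch. The class StaticShapeDemo and its two target methods are hypothetical. Both targets take two references and a double and return an int, so by the reasoning above their direct method handles could share one lambda form, although that sharing is internal and not observable through the public API used here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical targets with the same basic shape: (reference, reference, double) -> int.
class StaticShapeDemo {
    static int lengths(String a, CharSequence b, double scale) {
        return (int) ((a.length() + b.length()) * scale);
    }

    static int compare(Integer a, Integer b, double bias) {
        return Integer.compare(a, b) + (int) bias;
    }

    // Invokes both targets through direct method handles and returns the two results.
    static int[] callBoth() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodHandle mhLengths = l.findStatic(StaticShapeDemo.class, "lengths",
                    MethodType.methodType(int.class, String.class, CharSequence.class, double.class));
            MethodHandle mhCompare = l.findStatic(StaticShapeDemo.class, "compare",
                    MethodType.methodType(int.class, Integer.class, Integer.class, double.class));
            // invokeExact requires the static argument types to match the handle type exactly.
            int r1 = (int) mhLengths.invokeExact("ab", (CharSequence) "cde", 2.0);
            int r2 = (int) mhCompare.invokeExact((Integer) 7, (Integer) 3, 10.0);
            return new int[] { r1, r2 };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```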

Edge cases for method calls

As an edge case, if the static method is in a class that is not yet initialized, a specialized lambda form is used which tweaks the class before jumping into the method, so as to be sure the class is initialized first. This case is handled by a call to DirectMethodHandle.ensureInitialized from a variation of internalMemberName which makes the extra checks:

LambdaForm(a0:L,a1:L,a2:L)=>{
    t3:L=DirectMethodHandle.internalMemberNameEnsureInit(a0:L);
    t4:I=MethodHandle.linkToStatic(a1:L,a2:L,t3:L);
    t4:I}

The code-patching hack performed by the compilers is emulated here by patching the DMH's lambda form field after the class is safely initialized, to a version that no longer calls internalMemberNameEnsureInit.
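The visible consequence of this edge case is that creating a handle on a static method does not initialize the class, while the first invocation does. The following sketch records the order of events. The names LazyInitDemo, Target, and answer are hypothetical, and the ordering is what I would expect on HotSpot rather than something this document specifies.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: logs when the nested class is initialized relative to the lookup.
class LazyInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Target {
        static { EVENTS.add("Target initialized"); }
        static int answer() { EVENTS.add("answer called"); return 42; }
    }

    // Looks up Target.answer, then invokes it, and returns the event log.
    static List<String> run() {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    Target.class, "answer", MethodType.methodType(int.class));
            EVENTS.add("handle created");        // Target should still be uninitialized here
            int ignored = (int) mh.invokeExact(); // first call triggers initialization
            return EVENTS;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

On the first call to run(), the log should begin with "handle created", followed by "Target initialized" and then "answer called".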

If the DMH were for an invokespecial instruction, or a devirtualized invokevirtual, the lambda form would be precisely identical to the example given above, except that the final call would be to linkToSpecial instead of linkToStatic. The only difference between the two "linker" routines is that linkToSpecial performs an extra null check on the leading parameter (which must of course be a reference).

Similarly, if the DMH were for an invokevirtual or invokeinterface instruction, the "linker" routine would be linkToVirtual or linkToInterface, respectively. The structure of the LF would be the same.

The linkToVirtual method is slightly more complicated than linkToSpecial. Instead of a Method*, it pulls a vtable index out of the MemberName argument. It then null-checks the first argument (a reference, again), pulls out its klass, and indexes into the vtable to find a Method*. After this, the sequence of events is identical to linkToSpecial or linkToStatic.

The linkToInterface method performs a similar lookup, but starts by pulling both a Klass* and an itable index out of the MemberName. Both linkToVirtual and linkToInterface are closely similar to the vtable and itable stubs defined in vtableStubs_$arch.cpp.
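The receiver-based dispatch and the null check described for linkToVirtual and linkToInterface are visible through the public API. This is a minimal sketch with a hypothetical class name, DispatchDemo. One handle on Object.toString selects different implementations depending on the receiver's class, a handle on CharSequence.length dispatches through an interface, and a null receiver is rejected.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo of virtual and interface dispatch through direct method handles.
class DispatchDemo {
    static String describe() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodHandle toStr = l.findVirtual(Object.class, "toString",
                    MethodType.methodType(String.class));
            MethodHandle len = l.findVirtual(CharSequence.class, "length",
                    MethodType.methodType(int.class));

            // Same handle, different receivers: the receiver's class picks the method.
            String a = (String) toStr.invokeExact((Object) Integer.valueOf(7));
            String b = (String) toStr.invokeExact((Object) new StringBuilder("sb"));
            // Interface dispatch on a String receiver.
            int n = (int) len.invokeExact((CharSequence) "hello");

            // A null receiver fails the null check before any dispatch happens.
            boolean npe = false;
            try {
                String unused = (String) toStr.invokeExact((Object) null);
            } catch (NullPointerException e) {
                npe = true;
            }
            return a + "," + b + "," + n + "," + npe;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```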

One might ask why the laborious distinction between invocation modes is made in the first place. The answer is simple: They use different low-level code sequences, which must be distinguished one way or another. There would be no benefit to putting a dynamic mode-switch into every DMH invocation.

Constructor calls

Here is an actual example of a lambda form for a constructor DMH:

LambdaForm(a0:L,a1:L,a2:L,a3:I)=>{
    t4:L=DirectMethodHandle.allocateInstance(a0:L);
    t5:L=DirectMethodHandle.constructorMethod(a0:L);
    t6:V=MethodHandle.linkToSpecial(t4:L,a1:L,a2:L,a3:I,t5:L);
    t4:L}

Again, after the DMH argument a0, there are two reference arguments and an int. The code is simple. First a blank instance is allocated, using a helper method DirectMethodHandle.allocateInstance which calls Unsafe.allocateInstance on the instanceClass field value of the DMH. The blank instance is saved in t4.

Next, the member-name of the constructor method is dug out of the DMH and put in t5. (As noted above, this is kept handy in the initMethod field of Constructor.) At this point the DMH variable a0 goes dead.

Next, the constructor is called, using linkToSpecial, just as if an invokespecial instruction were executed against the named constructor. (But as noted above, it is not actually named, just indirectly loaded as a Method* from the member-name in t5.)

There is no result value from the invocation, and we formally bind the result to t6, of "type" void. The value t4 now contains a fully constructed instance of the desired type, and it is returned.

It may be instructive to consider the type invariants required in order to make this constructor DMH safe to use:

The selected constructor must have been accessible to the original creator of the DMH.
The method type of the DMH must take two references plus a 32-bit value (int, short, char, byte, or boolean).
The method type of the init-method in the DMH must correspond to those basic types also.
If the init-method takes narrow, non-basic arguments (say, String or char) those must match the type of the DMH.
The instance-class of the DMH must exactly match the class of the init-method.
The instance-class of the DMH must also match the return type of the DMH.

All of these invariants are enforced by the privileged code which constructs the DMH.
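A constructor handle with the same shape (two references and an int) can be obtained through Lookup.findConstructor, as in the following sketch. The classes ConstructorHandleDemo and Entry are hypothetical. Note that the lookup is given a void-returning method type, while the resulting handle returns the instance class, which reflects the invariant that the instance class matches the DMH return type.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo of a constructor handle taking two references and an int.
class ConstructorHandleDemo {
    static final class Entry {
        final String key;
        final Object value;
        final int weight;

        Entry(String key, Object value, int weight) {
            this.key = key;
            this.value = value;
            this.weight = weight;
        }
    }

    // The symbolic type of a constructor has a void return type.
    static MethodHandle entryConstructor() {
        try {
            return MethodHandles.lookup().findConstructor(Entry.class,
                    MethodType.methodType(void.class, String.class, Object.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Behaves like "new Entry(key, value, weight)".
    static Entry newEntry(String key, Object value, int weight) {
        try {
            return (Entry) entryConstructor().invokeExact(key, value, weight);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```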

Special vs. virtual calls

Note that if a DMH is created for a virtual invocation (using Lookup.findVirtual) but the selected method can be devirtualized, the DMH produced has internal behavior identical to one produced by Lookup.findSpecial (if such access were allowed). This is the same "opt-virtual" or "vfinal" optimization found in the JVM when working with virtual calls.

In order to distinguish between DMHs produced via the two Lookup calls, the DirectMethodHandle.Special class is used to represent the results of all calls to Lookup.findSpecial (or the corresponding invokespecial CP constants). Thus, a devirtualized method can be represented equivalently by either of the two classes, and they are behaviorally identical. The class difference allows reflection (the MethodHandleInfo API) to disambiguate in such a case between the two possible Lookup calls.
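The disambiguation can be observed with Lookup.revealDirect. In this sketch the class DevirtualizedKindDemo and its final method id are hypothetical. A final method can always be devirtualized, so the two handles behave identically, yet they should still be reported with different reference kinds.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleInfo;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo: one final method, reached via findVirtual and via findSpecial.
class DevirtualizedKindDemo {
    final String id() { return "id"; }

    static String kindName(MethodHandles.Lookup l, MethodHandle mh) {
        return MethodHandleInfo.referenceKindToString(l.revealDirect(mh).getReferenceKind());
    }

    static String revealedKinds() {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle viaVirtual = l.findVirtual(DevirtualizedKindDemo.class, "id", type);
            MethodHandle viaSpecial = l.findSpecial(DevirtualizedKindDemo.class, "id", type,
                    DevirtualizedKindDemo.class);

            // Both handles call the same method with the same result.
            DevirtualizedKindDemo receiver = new DevirtualizedKindDemo();
            String a = (String) viaVirtual.invokeExact(receiver);
            String b = (String) viaSpecial.invokeExact(receiver);
            if (!a.equals(b)) {
                throw new AssertionError("handles disagree");
            }
            return kindName(l, viaVirtual) + "/" + kindName(l, viaSpecial);
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }
}
```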

Field references

Here is an example of a lambda form for a DMH which performs a non-volatile getfield of a long-valued field.

LambdaForm(a0:L,a1:L)=>{
    t2:J=DirectMethodHandle.fieldOffset(a0:L);
    t3:L=DirectMethodHandle.checkBase(a1:L);
    t4:J=Unsafe.getLong((sun.misc.Unsafe@6a9a4f8e),t3:L,t2:J);
    t4:J}

As before, a0 is the DMH, which is of type Accessor in this case. The only other argument, a1, is the receiver object, whose field is being loaded.

The unsafe field offset is extracted from the DMH. Perhaps surprisingly, this is the last thing the DMH contributes to the operation. This corresponds to the operation of CP caches for getfield, which also contribute just a simple offset.

Next, the receiver object is checked for nullness, calling the subroutine checkBase. Finally, the desired value is extracted using Unsafe.getLong, and returned.

Note that the capability value for the Unsafe API is embedded into the lambda form as a constant operand to Unsafe.getLong. This is analogous to privileged code which performs unsafe operations against a private static final reference to Unsafe.
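The same operation, seen from the public API, is a getter handle of type (receiver)long. The sketch below uses a hypothetical class FieldReadDemo with a long field named stamp. It reads the field through the handle and also shows that a null receiver is rejected, corresponding to the checkBase step.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical demo of a getfield-style handle on a long-valued field.
class FieldReadDemo {
    long stamp = 123456789012L;

    static long read(FieldReadDemo receiver) {
        try {
            MethodHandle getter = MethodHandles.lookup()
                    .findGetter(FieldReadDemo.class, "stamp", long.class);
            return (long) getter.invokeExact(receiver);
        } catch (RuntimeException | Error e) {
            throw e; // let NullPointerException and friends propagate unchanged
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // True if reading through a null receiver raises NullPointerException.
    static boolean nullReceiverThrows() {
        try {
            read(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```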

Direct method handle optimization

Because direct method handles are immutable, the compilers can constant-fold the fields of a DMH whenever that DMH is itself a constant.

The offsets or member-name references inside the DMH are processed by the compiler into code which is similar to that produced by the equivalent hard-coded instructions operating on symbolic references.

For example, in C2, linkToStatic and the other "linkers" are special-cased in CallGenerator::for_method_handle_inline. Given a known invocation mode and a constant member-name, the method call can be inlined as readily as a normal (symbolically specified) method call.

Unsafe getters and setters are special-cased in LibraryCallKit::inline_unsafe_access. When the type of the base reference is known at compile time, the actual field can be recovered from the numerical offset handed to the getter or setter.
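In application code, the usual way to make a DMH "itself a constant" is to hold it in a static final field. The following sketch shows that idiom. The class name ConstantHandleDemo is hypothetical, and whether the call is actually inlined depends on the VM and compiler in use.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical demo of the static-final idiom for constant method handles.
class ConstantHandleDemo {
    // A static final handle is a candidate for constant folding by the JIT.
    private static final MethodHandle MH_ABS;
    static {
        try {
            MH_ABS = MethodHandles.lookup().findStatic(
                    Math.class, "abs", MethodType.methodType(int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Sums absolute values, calling Math.abs only through the constant handle.
    static int sumAbs(int[] values) {
        try {
            int total = 0;
            for (int v : values) {
                total += (int) MH_ABS.invokeExact(v);
            }
            return total;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```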

Non-constant invocation

For non-constant DMH invocations, the compiler must emit a call to the linker method. However, this is not terribly slow, since linkToStatic and its fellows have tightly hand-coded fast paths for compiled calls.

Unlike in the interpreter calling sequence, the trailing MemberName argument cannot simply be popped from a stack. But there is another trick available. Compiled calling sequences are assumed to allow trailing arguments to be popped without data motion. This allows the trailing member-name to be consulted and then ignored during the tail-call of the real target method.

In other words, where the interpreted version of linkToVirtual or its fellows would have to pop the trailing argument to pick up the MemberName, the compiled version simply has to pick it up from the Nth compiled argument position, and then treat that argument position as irrelevant in the subsequent tail-call.

(The trailing argument ignorability invariant for compiled calls is verified in SharedRuntime::check_member_name_argument_is_last_argument. See further discussion in Linker methods for direct method handles.)

For any given basic-type, the JVM spins (as needed) a compiled version of linkToStatic (etc.) that loads the trailing MemberName from the appropriate register or stack location.

For interpreted calls, there is only one version of each of the four "linker" methods. This works because there is only one way to pop the trailing argument off the stack, in the interpreter.