JSR 292 introduces a flexible invokedynamic instruction which is bound to a user-defined graph of method handles.


Method handle fundamentals

Abstractly, a method handle is simply a type and some behavior that conforms to that type. As befits an object-oriented system, the behavior may include data.

Concretely, a method handle can refer to any JVM method, field, or constructor, or else it can be a transform of any previously specified method handle. Transforms include partial application (binding), filtering, and various forms of argument shuffling.
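The kinds of method handle listed above can be sketched with the public java.lang.invoke API. This is only an illustration: the class name HandleKinds, its counter field, and the describe method are hypothetical, and the particular transforms chosen (bindTo, filterReturnValue, permuteArguments) are just three of the combinators available.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class HandleKinds {
    static int counter = 7; // hypothetical static field, read through a handle below

    static String describe() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();

            // Direct handles: a virtual method, a constructor, and a field getter.
            MethodHandle length = lookup.findVirtual(
                    String.class, "length", MethodType.methodType(int.class));
            MethodHandle newBuilder = lookup.findConstructor(
                    StringBuilder.class, MethodType.methodType(void.class, String.class));
            MethodHandle getCounter = lookup.findStaticGetter(
                    HandleKinds.class, "counter", int.class);

            // Transforms of previously specified handles.
            MethodHandle concat = lookup.findVirtual(
                    String.class, "concat", MethodType.methodType(String.class, String.class));
            MethodHandle upper = lookup.findVirtual(
                    String.class, "toUpperCase", MethodType.methodType(String.class));
            MethodHandle greet = concat.bindTo("hello, ");                      // partial application
            MethodHandle shout = MethodHandles.filterReturnValue(greet, upper); // filtering
            MethodHandle swapped =
                    MethodHandles.permuteArguments(concat, concat.type(), 1, 0); // argument shuffling

            int len = (int) length.invokeExact("abc");
            StringBuilder sb = (StringBuilder) newBuilder.invokeExact("x");
            int c = (int) getCounter.invokeExact();
            String s1 = (String) shout.invokeExact("world");
            String s2 = (String) swapped.invokeExact("a", "b");
            return len + " " + sb + " " + c + " " + s1 + " " + s2;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```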

The method handle's type is expressed as a sequence of zero or more parameter types, and an optional return type (or the non-type void). Concretely, this is a MethodType reference, and can be extracted from any method handle using MethodHandle.type.
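As a small illustration of MethodHandle.type, the sketch below looks up String.indexOf(String) and returns its MethodType. The class and method names are hypothetical; note that the receiver becomes the first parameter type of a virtual method's handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class TypeDemo {
    static MethodType typeOfIndexOf() {
        try {
            MethodHandle indexOf = MethodHandles.lookup().findVirtual(
                    String.class, "indexOf", MethodType.methodType(int.class, String.class));
            // The receiver (String) is prepended to the declared parameter list.
            return indexOf.type();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MethodType type = typeOfIndexOf();
        System.out.println(type.parameterList() + " -> " + type.returnType());
    }
}
```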

The behavior is what happens when the method handle is invoked, using the method MethodHandle.invokeExact. The special capability of method handles is that invokeExact accepts any number of arguments of any types, and can return any type or void. The call itself is made by a regular invokevirtual instruction. (It is rewritten secretly to invokehandle, as discussed below, but this can be ignored except by HotSpot implementors.)

Uniquely to method handles, the invokevirtual instruction can specify any structurally valid type signature, and the call site will link. Technically, we say that invokeExact is signature polymorphic. Practically speaking, when linking such a call site, the JVM must be ready to deal with any type signature, which means it will have to generate adapters of various sorts. From the user's point of view, a method handle is a magic thing which can wrap and/or invoke any method, of any type.
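From Java source, signature polymorphism shows up as follows: javac derives the call site's type signature from the static types of the arguments and the cast on the result, and invokeExact requires that signature to match the handle's type exactly. The sketch below uses Math.max as an arbitrary target; the class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

final class ExactInvocation {
    static MethodHandle max() {
        try {
            return MethodHandles.lookup().findStatic(
                    Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The call site is compiled with signature (II)I, which matches the handle's type. */
    static int exactCall() {
        try {
            return (int) max().invokeExact(3, 4);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Without the cast, the call site's signature is (II)Ljava/lang/Object;, a mismatch. */
    static boolean mismatchedCallFails() {
        try {
            Object ignored = max().invokeExact(3, 4);
            return false;
        } catch (WrongMethodTypeException expected) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exactCall() + " " + mismatchedCallFails());
    }
}
```

Both call sites link; the mismatch is detected by the type check when the second one executes.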

Concretely, the behavior of a method handle depends on an object called a LambdaForm, which is a low-level description of step-by-step operations. A method handle's lambda form is stored in its form field, just as its type is stored in its type field.

A method handle's lambda form may ignore the method handle completely and do something context-independent, like throw an exception or return zero. More generally, it can consult the method handle for information. For example, it can examine the method handle's return type and convert some value to that type before returning it.

More interestingly, if a method handle's class is a subclass which contains additional data fields, the lambda form can refer to those fields as it executes.

Since method handles express behavior more than state, their fields are typically immutable. But, method handles can easily be bound to arbitrary Java objects, producing closures.
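A minimal sketch of such a closure, assuming an arbitrary List as the bound object: binding the receiver of List.add yields a handle that carries the list with it. The class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

final class BoundHandle {
    static List<String> demo() {
        try {
            List<String> sink = new ArrayList<>();
            MethodHandle add = MethodHandles.lookup()
                    .findVirtual(List.class, "add",
                            MethodType.methodType(boolean.class, Object.class))
                    .bindTo(sink); // type changes from (List,Object)boolean to (Object)boolean

            // The bound handle is immutable, but the object it closes over is not.
            boolean first = (boolean) add.invokeExact((Object) "first");
            boolean second = (boolean) add.invokeExact((Object) "second");
            return sink;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```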

The "basic type" system

In order to implement signature polymorphism more simply, method handles internally operate in terms of basic types. A basic type is a JVM type in which many inconvenient distinctions have been "erased", so that the remaining distinctions (such as reference vs. primitive and int vs. long) can be attended to.

For starters, in the basic type system, all primitive types of 32 bits or fewer, except float, are erased to simple int: that is, boolean, byte, char, short, and int itself. If a byte value is required somewhere, it must be masked down from a full int. Thus, there are only four primitive types to worry about.

Under basic typing rules, all reference types are represented by java.lang.Object. Thus, there are a total of five basic types, represented by their JVM signature characters: L, I, J, F, D. To these we add V for the non-type void.
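The erasure rules above can be written down as a small mapping from Class objects to signature characters. This is an illustrative sketch, not the runtime's own code: the helper names and the underscore separating parameters from the return type are assumptions made for this example.

```java
import java.lang.invoke.MethodType;

final class BasicTypes {
    /** Maps a full JVM type to one of L, I, J, F, D, or V. */
    static char basicTypeChar(Class<?> type) {
        if (!type.isPrimitive()) return 'L'; // every reference type erases to Object
        if (type == void.class) return 'V';
        if (type == long.class) return 'J';
        if (type == float.class) return 'F';
        if (type == double.class) return 'D';
        return 'I';                          // boolean, byte, char, short, int
    }

    /** Renders a method type as basic-type characters, e.g. "LJI_I" (hypothetical format). */
    static String basicSignature(MethodType type) {
        StringBuilder sb = new StringBuilder();
        for (Class<?> parameter : type.parameterList()) {
            sb.append(basicTypeChar(parameter));
        }
        return sb.append('_').append(basicTypeChar(type.returnType())).toString();
    }

    public static void main(String[] args) {
        MethodType type = MethodType.methodType(byte.class, String.class, long.class, char.class);
        System.out.println(type + " erases to " + basicSignature(type));
    }
}
```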

In the bulk of Java code, the full type system is in force. In order to name reference types, a system of class loaders and type constraints must be consulted and honored. From the perspective of the JSR 292 runtime, this type system is a complex mix of names and scopes. Inside the runtime, where basic types are used, there are no names to worry about, except Object and other types on the boot class path.

If a reference of a narrower type is required somewhere, an explicit checkcast must be issued before the reference is used. In fact, the checkcast is in general a call to Class.cast, with the specialized type being a constant Class reference rather than a symbolic reference name.

Normally, all extra conversions (such as int to byte and Object to a named reference type) disappear in the optimizer, which keeps track of full type information from context.
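The two kinds of explicit conversion mentioned above look like this when written out in Java. The helper names are hypothetical; the point is only that the reference case goes through Class.cast on a constant Class object, and the primitive case is an ordinary narrowing of an int.

```java
final class Narrowing {
    /** Recovers a named reference type from a value typed as Object. */
    static String asString(Object erased) {
        return String.class.cast(erased); // throws ClassCastException on a mismatch
    }

    /** Recovers a byte from a value carried as a full int. */
    static byte asByte(int erased) {
        return (byte) erased; // keeps only the low 8 bits
    }

    public static void main(String[] args) {
        System.out.println(asString("text") + " " + asByte(0x1FF));
    }
}
```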

Lambda form basics

In brief, a lambda form is a classic lambda expression with zero or more formal parameters, plus zero or more body expressions. The types of parameters and expression values are drawn from the basic type system.

Each expression is simply the application of a method handle to zero or more arguments. Each argument is either a constant value or a previously specified parameter or expression value.

When a lambda form is used as a method handle behavior, the first parameter (a0) is always the method handle itself. (But there are other uses for lambda forms.)
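To make the shape of a lambda form concrete, here is a toy model with a naive evaluator. It is a sketch under several assumptions: every name in it (ToyLambdaForm, Ref, Expr, interpret, demo) is hypothetical, it works on boxed values rather than basic types, it takes the last value as the result, and it does not model the convention that the first parameter is the method handle itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ToyLambdaForm {
    /** Refers to a previously specified parameter or expression value by index. */
    static final class Ref {
        final int index;
        Ref(int index) { this.index = index; }
    }

    /** One body expression: a method handle applied to constants and/or Refs. */
    static final class Expr {
        final MethodHandle function;
        final Object[] args;
        Expr(MethodHandle function, Object... args) {
            this.function = function;
            this.args = args;
        }
    }

    final int arity;
    final Expr[] body;

    ToyLambdaForm(int arity, Expr... body) {
        this.arity = arity;
        this.body = body;
    }

    /** Evaluates the body expressions in order; the last value is the result. */
    Object interpret(Object... params) throws Throwable {
        Object[] values = new Object[arity + body.length];
        System.arraycopy(params, 0, values, 0, arity);
        for (int i = 0; i < body.length; i++) {
            Expr e = body[i];
            Object[] actual = new Object[e.args.length];
            for (int j = 0; j < actual.length; j++) {
                Object arg = e.args[j];
                actual[j] = (arg instanceof Ref) ? values[((Ref) arg).index] : arg;
            }
            values[arity + i] = e.function.invokeWithArguments(actual);
        }
        return values[values.length - 1];
    }

    /** Builds and runs (a0, a1) => { t2 = addExact(a0, a1); t3 = multiplyExact(t2, 10); t3 }. */
    static Object demo(int a, int b) {
        try {
            MethodType binaryInt = MethodType.methodType(int.class, int.class, int.class);
            MethodHandle add = MethodHandles.lookup().findStatic(Math.class, "addExact", binaryInt);
            MethodHandle mul = MethodHandles.lookup().findStatic(Math.class, "multiplyExact", binaryInt);
            ToyLambdaForm form = new ToyLambdaForm(2,
                    new Expr(add, new Ref(0), new Ref(1)),
                    new Expr(mul, new Ref(2), 10));
            return form.interpret(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(2, 3));
    }
}
```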

When a method handle is invoked, after any initial type checking, the JVM executes the lambda form of the method handle to complete the method handle invocation. This leads to some bootstrapping challenges, since the lambda form executes by evaluating additional method handle invocations.

Lambda forms are described in detail elsewhere: http://wiki.jvmlangsummit.com/Lambda_Forms:_IR_for_Method_Handles

Lambda forms will be introduced by example as various behaviors are described.

Lambda form optimization

There is one more indirection in lambda form execution which allows the system to optimize itself: A lambda form has a field called vmentry which (at long last) provides a Method* pointer for the JVM to jump into, in order to evaluate the lambda form.

(Note: Since Java cannot directly represent JVM metadata pointers, this vmentry is actually of type MemberName, which is a low-level wrapper for a Method*. So there is one more indirection after all, to hide the metadata.)

When a lambda form is first created, this vmentry pointer is initialized to a method called the lambda form interpreter, which can execute any lambda form. (Actually it has a thin wrapper which is specialized to the arity and basic types of the arguments.) The lambda form interpreter is very simple and slow. After it executes a given lambda form a few dozen times, the interpreter fetches or generates bytecode for the lambda form, which is customized (at least partially) to the lambda form body. In the steady state, all "hot" method handles and their "hot" lambda forms have bytecode generated, and eventually JIT-compiled.
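The switch from interpretation to generated code can be pictured as a mutable entry point that replaces itself after enough invocations. The toy below is not HotSpot or JDK code: the class, the threshold of 30 (standing in for "a few dozen"), and the use of IntUnaryOperator in place of a Method* are all assumptions chosen to keep the sketch small.

```java
import java.util.function.IntUnaryOperator;

final class ToyEntryPoint {
    static final int COMPILE_THRESHOLD = 30; // assumed value for this sketch

    private final IntUnaryOperator slowPath; // stands in for interpreting the form
    private final IntUnaryOperator fastPath; // stands in for generated bytecode
    private IntUnaryOperator vmentry;        // current entry point, replaced once hot
    private int invocationCount;

    ToyEntryPoint(IntUnaryOperator slowPath, IntUnaryOperator fastPath) {
        this.slowPath = slowPath;
        this.fastPath = fastPath;
        this.vmentry = this::interpret;      // every form starts out interpreted
    }

    private int interpret(int x) {
        if (++invocationCount >= COMPILE_THRESHOLD) {
            vmentry = fastPath;              // later calls bypass the interpreter
        }
        return slowPath.applyAsInt(x);
    }

    /** Callers always jump through vmentry, whatever it currently points to. */
    int invoke(int x) {
        return vmentry.applyAsInt(x);
    }

    boolean isCompiled() {
        return vmentry == fastPath;
    }

    public static void main(String[] args) {
        ToyEntryPoint form = new ToyEntryPoint(x -> x + 1, x -> x + 1);
        for (int i = 0; i < 100; i++) {
            form.invoke(i);
        }
        System.out.println("compiled: " + form.isCompiled());
    }
}
```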

Thus, in the steady state, a hot method handle is executed without the lambda form interpreter. The low-level JVM steps are as follows:

  • Fetch MethodHandle.form.
  • Fetch LambdaForm.vmentry.
  • Fetch MemberName.vmtarget, a hidden Method* pointer.
  • Fetch Method::from_compiled_entry.
  • Jump to optimized code.

As noted elsewhere, if the method handle (or the lambda form, or the member name) is a compile-time constant, all the usual inlining can be done.

Invokedynamic

As defined in the JVMS, invokedynamic consists of a name, a method type signature, and a bootstrap specifier.
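Java source code cannot emit an invokedynamic instruction directly, so the sketch below only simulates one: it calls a bootstrap method by hand with a name and a method type, then invokes the resulting call site through its dynamic invoker. The class, the greet target, and the choice of ConstantCallSite are assumptions for illustration; the three leading bootstrap parameters (lookup, name, type) follow the standard bootstrap method shape.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class BootstrapDemo {
    /** A bootstrap method: given the call site's name and type, it chooses a target. */
    static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = caller.findStatic(BootstrapDemo.class, name, type);
        return new ConstantCallSite(target); // the binding never changes afterwards
    }

    /** Hypothetical target method that the call site is bound to. */
    static String greet(String who) {
        return "hello, " + who;
    }

    /** Does by hand what the JVM does when linking and then executing the instruction. */
    static String simulateCallSite(String who) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "greet",
                    MethodType.methodType(String.class, String.class));
            return (String) site.dynamicInvoker().invokeExact(who);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(simulateCallSite("world"));
    }
}
```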

...

Second, it allows loose typing of its arguments and return value, according to the basic type scheme used in the JSR 292 runtime. Under basic typing rules, all reference types are represented by java.lang.Object. If a reference of a narrower type is required somewhere, an explicit checkcast must be issued before the reference is used. Also, in the basic type system, all 32-bit types except float are erased to simple int. Thus, if a byte value is required somewhere, it must be masked down from a full int. Normally, these extra conversions disappear in the optimizer. (See above.)

Method handle invocation

Internally to HotSpot (in rewriter.cpp), method handle invocations are rewritten to use a special instruction called invokehandle. This instruction is in many ways parallel to invokedynamic. It resolves to an adapter method pointer and an appendix. The appendix (if not null) is pushed after the explicit arguments to invoke or invokeExact.

...