- Loading...
Patch name: tailc.patch
Prototype
Work in progress.
Tail call optimization guarantees that a series of tail calls executes in bounded stack space. No StackOverflow exception happens.
After applying the tail call patches Hotspot supports tail call optimization for all method invocations that are marked as tail call. The optimization is performed by the VM if the invocation is marked as a 'tail call' and the necessary conditions for tail calls are met. That is the VM supports hard tail calls. If the VM can not guarantee that the innvocation will be optimized it throws an exception. Note that the VM will not recognize normal calls that could be tail call optimized and perform the optimization on them.
To enable tail call optimization the VM is started with the flag -XX:+TailCalls.
java -XX:+TailCalls TailCallExample
Tail call optimization is implemented by replacing the caller's frame with the callee's frame if the java security mechanism allows it. If removing the caller's frame is not possible, the VM proceeds in two possible ways depending on whether the 'TailCallStackCompressions' flag is set or not.
To mark a method call as tail call the invocation bytecode instruction has to be prefixed with a 'wide' (196) bytecode.
public static int tailcaller(int); Code: 0: iload_0 1: ifne 8 4: iload_0 5: iconst_1 6: iadd 7: ireturn 8: iload_0 9: iconst_1 10: isub 11: wide 12: invokestatic #2; //Method tailcaller:(I)I 15: ireturn }
After applying the langtools-tailcall-goto.patch (to the langtools directory) we can mark calls as tail calls using the 'goto' prefix before a method invocation in java. Above bytecode example is generated by javac from following java code.
public class Simple {
  public static int tailcaller(int x) {
    if (x==0) return x+1;
    return goto tailcaller(x-1);
  }
}
There are two bytecode verifiers in the jdk. The old type-inference verifier and the new type-checking verifier. The type-checking verifier is used if the class file format major version is >= 50. To use this verifier, a valid stackmaptable attribute has to be emitted. Tail calls are currently only supported if the new verifier is used. Hence, to use tail calls the generated class file must have a version number >= 50.0 and must contain a valid stackmaptable attribute.
The default behaviour of the jvm is to fall back to the old verifier if the new verifier fails (e.g because the stackmaptable attribute is invalid). The jvm has an option to disable this behaviour.
java -XX:-FailOverToOldVerifier -XX:+TailCalls GeneratedClass
Some implementation details about the prototype can be found in Arnold Schwaighofer's thesis:
http://www.ssw.uni-linz.ac.at/Research/Papers/Schwaighofer09Master/
There currently exist two branches
A successful tail call removes the frame of the caller C1 and replaces it with the callee frame C2. This process can repeat an arbitrary number of times, with each successive frame C(k) performing recursive calls before it tail-calls the next frame C(k+1). Unless the thread loops infinitely, the sequence of tail-called frames ends eventually when one callee C(Max) returns to its caller P0 instead of performing a tail call. The frame P0 returned to may be called the parent, and is second-youngest frame on the stack at the point of every tail call in the sequence.
If some C(k) is interrupted and the stack is inspected in the debugger, the earlier frames C(1)..C(k-1) will not be visible unless the system has taken special care to record their historical presence for the benefit of the debugger. It might be reasonable to define debugging parameters which request the JVM to record (at extra cost) the presence of some of those historical frames.
Thread stacks can also be viewed with the API call Throwable.getStackTrace.  The same considerations apply as for debugger display of frame, except that there must be a default setting in production mode (with no debugging) which is generally agreed upon.  Recording no historic frames at all is almost certainly the correct default setting.  If historic frames are recorded at all in production modes, then they would probably require special presentation in the stack trace.  It is unlikely that historic frames will be retained or displayed in any initial version of tail call support.
A very specialized version of stack walking is performed for a limited number of system methods like Class.forName, which are sensitive to their immediate caller, and behave as if that caller (as a Class object) were passed as an extra invisible parameter.  In recent versions of the JDK, these methods are clearly marked using the internal annotation @CallerSensitive, and can be reliably processed as special cases.  An attempt to link to one of these methods from a tail call site should fail with a linkage error. 
The API call AccessController.getContext also gives a view derived from the control stack.  This view cannot omit the effects of historical frames, if the presence of those frames would affect access control checks.  The information content of an access control context is a set of protection domains, each one derived from the class loader of a method in a pending frame.  (There are other details, such as user-supplied protection domains and stack boundaries, but they are irrelevant to this discussion.)  The set of protection domains is unordered and indifferent to repetitions. (That is, it is really a set.)  The most important property of this set is that, if a stack frame is contributing to the current computation, the protection domain from that stack frame must be present in the set.  Crucially, this must be true even of the historical frames C(1)..C(k-1).
Suppose the protection domains for the various frames C(1)..C(k) are PD(1)..PD(k). Each new call to C(k+1) must be inspected to see if it contributes a new protection domain PD(k+1) to the existing set {PD(1), ... PD(k)}. One simple way to do this is to detect when PD(k) and PD(k+1) differ; if they do not differ, then normal stack walking will detect the protection domain even after C(k) has been removed from the stack. Other optimizations are possible, with the normal application of "fast path" and "slow path" techniques. The collected set must be maintained somewhere. A likely place would be an extra hidden frame between the parent frame P0 and the first tail-called child C(2).
Further discussion of the problem of access control may be found in a paper by Clements and Felleisen, linked here: http://mail.openjdk.java.net/pipermail/mlvm-dev/2013-July/005417.html