Publications of the Johannes Kepler University

In close collaboration with Sun Microsystems, several research projects of the Institute for System Software at the Johannes Kepler University Linz (JKU) are based on the Java HotSpot™ VM. Parts of the research are already included in the product version of the VM. This page lists all publications that describe the research projects. Some of them might be of general interest because they describe the current production version, especially of the client compiler. The publications of the projects are ordered by relevance, i.e. the most complete and most recent publications are first.

SSA Form and Register Allocation for the Client Compiler

We changed the high-level intermediate representation of the client compiler to use static single assignment (SSA) form, which simplifies global optimizations. Additionally, we implemented a global register allocator that uses the linear scan algorithm. This work is part of the production version since Java 6.

Thomas Kotzmann, Christian Wimmer, Hanspeter Mössenböck, Thomas Rodriguez, Kenneth Russell, David Cox: Design of the Java HotSpot™ Client Compiler for Java 6. In ACM Transactions on Architecture and Code Optimization, volume 5, issue 1, article 7. ACM Press, 2008. doi:10.1145/1369396.1370017
Explains the structure of the client compiler, its intermediate representations, and the optimization algorithms. It describes the client compiler of the Java HotSpot™ VM of Java 6. Java 7 and OpenJDK contain no significant changes, so all information up-to-date.

Christian Wimmer: Linear Scan Register Allocation for the Java HotSpot™ Client Compiler. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2004.
Extended description of the client compiler, with a focus on the register allocator. Read Chapter 4 for if you are interested in the details of the compiler and the intermediate representations. The term "research compiler" in this thesis refers to the product version of Java 6 (which was not released at the time of writing).

Christian Wimmer, Hanspeter Mössenböck: Optimized Interval Splitting in a Linear Scan Register Allocator. In Proceedings of the ACM/USENIX International Conference on Virtual Execution Environments, pages 132-141. ACM Press, 2005. doi:10.1145/1064979.1064998
Description of the register allocation algorithm of the current client compiler.

Hanspeter Mössenböck, Michael Pfeiffer: Linear Scan Register Allocation in the Context of SSA Form and Register Constraints. In Proceedings of the International Conference on Compiler Construction, LNCS 2304, pages 229-246. Springer-Verlag, 2002.
Describes an early version of the work, so it does not reflect the current product version.

Hanspeter Mössenböck: Adding Static Single Assignment Form and a Graph Coloring Register Allocator to the Java HotSpot™ Client Compiler. Technical Report 15, Institute for Practical Computer Science, Johannes Kepler University Linz, 2000.
Describes an early version of the work, so it does not reflect the current product version. The graph coloring register allocator was later replaced by the linear scan register allocator.

Client Compiler Visualizer

Visualization tool for the internal data structures of the Java HotSpot™ client compiler. The tool shows the high-level and the low-level intermediate representations as well as the lifetime intervals used for register allocation. Additionally, the bytecodes of the compiled methods can be shown. Both textual and graphical views are available. The tool uses information emitted by the debug version of the Java HotSpot™ VM, starting with Java 6.

Homepage of the Java HotSpot™ Client Compiler Visualizer: https://c1visualizer.dev.java.net/
The homepage contains the binary download, the source code repository, and the documentation (several bachelor theses and master's theses).

Ideal Graph Visualizer

The Java HotSpot™ server compiler uses a single intermediate representation in all compiler phases, called ideal graph. The tool saves snapshots of the graph during the compilation. It displays the graphs and provides filtering mechanisms based on customizable JavaScript code and regular expressions. High performance and sophisticated navigation possibilities enable the tool to handle large graphs with thousands of nodes. The tool will be part of the OpenJDK soon, but there is no release available yet.

Thomas Würthinger: Visualization of Program Dependence Graphs. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2007.
This thesis describes the architecture and implementation of the tool and also contains a user guide. It does not describe the final version, but is still up-to-date.

Thomas Würthinger, Christian Wimmer, Hanspeter Mössenböck: Visualization of Program Dependence Graphs. In Proceedings of the International Conference on Compiler Construction, LNCS 4959, pages 193-196. Springer-Verlag, 2008. doi:10.1007/978-3-540-78791-4_13
A short tool demonstration paper that briefly sketches the purpose and structure of the application.

Array Bounds Check Elimination for the Client Compiler

We added a fast algorithm for array bounds check elimination to the client compiler that optimizes frequently used patterns of array accesses and uses the deoptimization facilities of the Java HotSpot™ VM. This is a research project, but the algorithm could be part of a future product version.

Thomas Würthinger, Christian Wimmer, Hanspeter Mössenböck: Array Bounds Check Elimination in the Context of Deoptimization. In Science of Computer Programming, volume 74, issue 5-6, pages 279-295. Elsevier, 2009. doi:10.1016/j.scico.2009.01.002
Journal paper that describes the optimization process. It shows how deoptimization is used, compares the algorithm with the server compiler, and contains an extensive evaluation.

Thomas Würthinger, Christian Wimmer, Hanspeter Mössenböck: Array Bounds Check Elimination for the Java HotSpot™ Client Compiler. In Proceedings of the International Conference on Principles and Practice of Programming in Java, pages 125-133. ACM Press, 2007. doi:10.1145/1294325.1294343
Conference paper that describes the optimization process, with a focus on the usage of deoptimization to avoid code duplication.

Optimization of Strings

We implemented an optimization that fuses the string object with its character array that holds the actual content. New bytecodes access the characters and allocate the strings. Nevertheless, the optimization is implemented completely inside the VM. The necessary bytecode rewriting is performed by the class loader. Optimized string objects are significantly smaller than the old string and its character array. This eliminates field loads, reduces the memory pressure, and the time necessary for garbage collection. It is a research project, but could show up in a future product version because it has a high impact on the performance.

Christian Häubl, Christian Wimmer, Hanspeter Mössenböck: Optimized Strings for the Java HotSpot™ Virtual Machine. In Proceedings of the International Conference on Principles and Practice of Programming in Java, pages 105-114. ACM Press, 2008. doi:10.1145/1411732.1411747

Christian Häubl: Optimized Strings for the Java HotSpot™ Virtual Machine. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2008.
This thesis describes the complete architecture and implementation of the optimization, together with a comprehensive evaluation.

Tail Calls

Tail calls are necessary when compiling functional languages, like Scheme, to Java bytecodes. It guarantees that no stack frame is created for recursive calls and thus no stack overflow occurs. Tail calls are supported in the interpreter, the client compiler, and the server compiler. The source code is available from the Da Vinci Machine project.

Arnold Schwaighofer: Tail Call Optimization in the Java HotSpot™ VM. Master's thesis, Institute for System Software, Johannes Kepler University Linz, 2009.
This thesis describes the complete architecture and implementation of the virtual machine changes, together with a comprehensive evaluation.

Coroutines

Coroutines are non-preemptive light-weight processes. Their advantage over threads is that they do not have to be synchronized because they pass control to each other explicitly and deterministically. Coroutines are therefore an elegant and efficient implementation construct for numerous algorithmic problems.

Lukas Stadler: Serializable Coroutines for the HotSpot™ Java Virtual Machine. Master's thesis, Johannes Kepler University Linz, February 2011.
Detailed description of the coroutine API and implementation.

Lukas Stadler, Thomas Würthinger, Christian Wimmer: Efficient Coroutines for the Java Platform. In Proceedings of the International Conference on Principles and Practice of Programming in Java, pages 20-28. ACM Press, 2010. doi:10.1145/1852761.1852765
Conference paper describing the basics of our approach.

Continuations

Continuations, or 'the rest of the computation', are a concept that is most often used in the context of functional and dynamic programming languages. Our implementation of continuations in the Java virtual machine uses a lazy or on-demand approach. Our system imposes zero run-time overhead as long as no activations need to be saved and restored and performs well when continuations are used.

Lukas Stadler, Christian Wimmer, Thomas Würthinger, Hanspeter Mössenböck, John Rose: Lazy Continuations for Java Virtual Machines. In Proceedings of the International Conference on Principles and Practice of Programming in Java, pages 143-152. ACM Press, 2009. doi:10.1145/1596655.1596679
Conference paper describing the basics of our approach.

Escape Analysis for the Client Compiler

In this research project, we implemented a fast algorithm for escape analysis. It detects objects that are accessible only by a single method or thread. Its results are used to replace fields of objects with scalar variables, to allocate objects on the stack instead of the heap, and to remove synchronization. The produced machine code is smaller and executes faster because fewer objects are allocated on the heap and the garbage collector runs less frequently. Deoptimization is used to handle dynamic class loading. There are currently no plans to integrate this work into the product version, however it influenced the implementation of escape analysis for the server compiler.

Thomas Kotzmann: Escape Analysis in the Context of Dynamic Compilation and Deoptimization. PhD thesis, Institute for System Software, Johannes Kepler University Linz, 2005.
Describes the details of all algorithms and contains an extensive evaluation as well as a survey of related work.

Thomas Kotzmann, Hanspeter Mössenböck: Run-Time Support for Optimizations Based on Escape Analysis. In Proceedings of the International Symposium on Code Generation and Optimization, pages 49-60. IEEE Computer Society, 2007. doi:10.1109/CGO.2007.34
Conference paper that describes the support necessary in the runtime system, the garbage collector, and the deoptimization framework.

Thomas Kotzmann, Hanspeter Mössenböck: Escape Analysis in the Context of Dynamic Compilation and Deoptimization. In Proceedings of the ACM/USENIX International Conference on Virtual Execution Environments, pages 111-120. ACM Press, 2005. doi:10.1145/1064979.1064996
Conference paper that describes the core algorithm for escape analysis that is integrated into the client compiler.

Automatic Object Inlining

We designed a feedback-directed optimization system for object inlining and array inlining that utilizes the just-in-time compiler and the garbage collector. Object inlining reduces the costs of field accesses by combining referenced objects with their referencing object. The order of objects on the heap is changed by the garbage collector so that they are placed next to each other. Then their offset is fixed, i.e. the objects are colocated. This allows field loads to be replaced by address arithmetic using the just-in-time compiler. Array inlining expands the concepts of object inlining to arrays, which are frequently used for the implementation of dynamic data structures. There are currently no plans to integrate this work into the product version.

Christian Wimmer, Hanspeter Mössenböck: Automatic Feedback-Directed Object Fusing. Accepted for publication in ACM Transactions on Architecture and Code Optimization. ACM Press, 2010.
Summary paper that covers all parts of the optimization, but does not describe the algorithmic details. Read this if you want to get familiar with the topic.

Christian Wimmer: Automatic Object Inlining in a Java Virtual Machine. PhD thesis, Institute for System Software, Johannes Kepler University Linz, 2008.
Describes the details of all algorithms and contains an extensive evaluation as well as a survey of related work. Read this if you are interested in the implementation details.

Christian Wimmer, Hanspeter Mössenböck: Automatic Array Inlining in Java Virtual Machines. In Proceedings of the International Symposium on Code Generation and Optimization, pages 14-23. ACM Press, 2008. doi:10.1145/1356058.1356061
Conference paper that covers only the array inlining part of the optimization system.

Christian Wimmer, Hanspeter Mössenböck: Automatic Feedback-Directed Object Inlining in the Java HotSpot™ Virtual Machine. In Proceedings of the ACM/USENIX International Conference on Virtual Execution Environments, pages 12-21. ACM Press, 2007. doi:10.1145/1254810.1254813
Conference paper that covers only the object inlining part of the optimization system.

Christian Wimmer, Hanspeter Mössenböck: Automatic Object Colocation Based on Read Barriers. In Proceedings of the Joint Modular Languages Conference, LNCS 4228, pages 326-345. Springer-Verlag, 2006. doi:10.1007/11860990_20
Conference paper that covers only the object colocation part. It describes an early version of the work, so it does not fully reflect the final version.

Child pages

Publications JKU