Performance techniques used in the HotSpot JVM
What code shapes does the JVM optimize best? Here is a list.
Knowing these optimizations may help language implementors generate bytecodes that run faster. Basic information about bytecodes is in Chapter 7 of the JVM Spec.
- The server compiler likes a loop with an int counter (int i = 0), a constant stride (i++), and a loop-invariant limit (i <= n).
- Loops over arrays work especially well when the compiler can relate the counter limit to the length of the array(s).
- For long loops over arrays, the majority of iterations are free of individual range checks.
- Loops are typically peeled by one iteration, to "shake out" tests which are loop invariant but execute only on a non-zero tripcount. Null checks are the key example.
- If a loop contains a call, it is best if that call is inlined, so that the loop can be optimized as a whole.
- A loop can have multiple exits. Any deoptimization point counts as a loop exit.
- If your loop has a rare exceptional condition, consider exiting to another (slower) loop when it happens.
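As a rough illustration of the loop shapes described in this list, here is a minimal Java sketch. The class name LoopShapes, the methods sum, sumClamped, and handleRare, and the choice of "negative element" as the rare condition are all hypothetical and exist only for this example. The sketch shows source shapes that match the bullets; whether a given JVM build actually removes the range checks or inlines the helper depends on the compiler and the profile, so treat it as a starting point rather than a guarantee.

```java
final class LoopShapes {
    // Counted loop: int counter, constant stride (i++), and a limit
    // (a.length) that is loop-invariant and tied to the array's length.
    static long sum(int[] a) {
        long total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // Fast loop for the common case, with an exit to a slower loop when a
    // rare condition shows up. ASSUMPTION for illustration only: negative
    // elements are the rare case and are counted as zero.
    static long sumClamped(int[] a) {
        long total = 0;
        int i = 0;
        for (; i < a.length; i++) {
            int v = a[i];
            if (v < 0) {
                break; // rare condition: leave the fast loop
            }
            total += v;
        }
        // Slow loop: resumes at the element that caused the exit and
        // handles every remaining element through the general path.
        for (; i < a.length; i++) {
            total += handleRare(a[i]);
        }
        return total;
    }

    // Hypothetical helper for the slow path; small enough that a JIT
    // compiler could plausibly inline it, though that is not guaranteed.
    static long handleRare(int v) {
        return v < 0 ? 0 : v;
    }
}
```

Calling LoopShapes.sumClamped on an array with no negative elements never leaves the first loop, so the common case keeps the simple counted-loop shape; the second loop only runs once the rare condition has been seen.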
Profiling is performed at the bytecode level in the interpreter and tier one compiler. The compiler leans heavily on profile data to motivate optimistic optimizations.