...

For Sumatra targets, we wanted to experiment with an infrastructure for supporting deoptimization, that is, transferring execution from the compiled code running on the GPU back to the equivalent bytecodes being run through the interpreter on the CPU.  (See also this definition on HotspotOverview.)  Some reasons for deoptimization on the GPU are:

  • When we compile for the GPU, similar to when we compile for the CPU, we can get more optimized code if the compiler can make compile-time assumptions about what code paths will be taken.  When these compile-time assumptions are violated, we can "trap" to the interpreter (this relies on the fact that the interpreter can handle anything).  In addition, deoptimization gives us a way of handing certain hopefully rare events, such as throwing exceptions, back to the CPU, since they might be difficult to implement completely in the GPU language.  If profiling shows that such events are not actually "rare", we can
    • decide that in the future this particular lambda is probably not a good candidate for offload,
    • or in some cases recompile and generate new code for the GPU.
  • Compiled code running on the GPU might get to a point where it needs the CPU to do something before the GPU can make further progress.  For example, if we are supporting heap allocation on the GPU, we could get to a point where we cannot allocate any new object until a GC happens.  If the target does not have an easy way to spin and wait for the CPU to do the GC, one way to support this is to deoptimize.  The interpreter will let the GC happen and then finish the allocation and continue executing bytecodes from that point.
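
The control flow described in the reasons above can be illustrated with a small toy model in plain Java. This is only a sketch under stated assumptions, not Sumatra or HotSpot code: the class `DeoptSketch`, the methods `compiledBody`, `interpretBody` and `run`, and the `rareThreshold` parameter are all hypothetical names introduced here. A loop stands in for the GPU work-items, a "compiled" fast path assumes a divisor is never zero, and any work-item that violates the assumption saves its state so that a general "interpreter" path can finish it on the CPU. The deoptimization count is then compared against a threshold to decide whether the lambda is still worth offloading.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of deoptimization. All names are hypothetical, not Sumatra or HotSpot APIs. */
final class DeoptSketch {

    /** State saved at the deoptimization point: the work-item id and its live value. */
    static final class DeoptState {
        final int workItem;
        final int divisor;
        DeoptState(int workItem, int divisor) {
            this.workItem = workItem;
            this.divisor = divisor;
        }
    }

    /** Outcome of one simulated offload. */
    static final class RunResult {
        final int[] out;
        final int deoptCount;
        final boolean keepOffloading;
        RunResult(int[] out, int deoptCount, boolean keepOffloading) {
            this.out = out;
            this.deoptCount = deoptCount;
            this.keepOffloading = keepOffloading;
        }
    }

    /** "Compiled" fast path: assumes in[i] != 0 and returns false to signal a trap. */
    static boolean compiledBody(int i, int[] in, int[] out) {
        if (in[i] == 0) {
            return false;               // assumption violated, leave it to the interpreter
        }
        out[i] = 100 / in[i];
        return true;
    }

    /** "Interpreter" slow path: handles every case, including the exception. */
    static void interpretBody(DeoptState s, int[] out) {
        try {
            out[s.workItem] = 100 / s.divisor;
        } catch (ArithmeticException e) {
            out[s.workItem] = -1;       // stand-in for whatever the lambda's handler would do
        }
    }

    /** Runs all work-items, deoptimizing where needed, and applies the "rare" policy. */
    static RunResult run(int[] in, double rareThreshold) {
        int[] out = new int[in.length];
        List<DeoptState> deopts = new ArrayList<>();
        for (int i = 0; i < in.length; i++) {           // stands in for the GPU work-items
            if (!compiledBody(i, in, out)) {
                deopts.add(new DeoptState(i, in[i]));   // save state at the deopt point
            }
        }
        for (DeoptState s : deopts) {
            interpretBody(s, out);                      // CPU finishes deoptimized work-items
        }
        // If deoptimizations are not actually rare, stop treating this lambda as offloadable.
        boolean keep = deopts.size() <= rareThreshold * in.length;
        return new RunResult(out, deopts.size(), keep);
    }

    public static void main(String[] args) {
        RunResult r = run(new int[] {1, 0, 4, 5}, 0.5);
        System.out.println("deopts=" + r.deoptCount + " keepOffloading=" + r.keepOffloading);
    }
}
```

In this sketch the saved state is just one integer per work-item; a real deoptimization would have to capture enough information to rebuild the interpreter frame at the corresponding bytecode, which the model deliberately leaves out.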

...

  • the point of the deoptimization, including finishing the allocation.