...
- for the workitems that finished normally, there is nothing to do
- if there are any deopted workitems, we want to run each deopting workitem through the interpreter starting from the bytecode index of the deoptimization point. Note that different workitems can have different deoptimization "PCs". We currently re-dispatch each of the deopting workitems sequentially, although other policies are clearly possible.
- Note: Getting to the interpreter from the saved hsail state first goes through some special compiled host trampoline code infrastructure designed by Gilles Duboscq of the Graal team. The trampoline host code takes the hsail deoptId and a pointer to the saved hsail frame as input and then immediately deoptimizes just as any host compiled code would deoptimize.
- for each never-ran workitem, we can just run it as a "javaCall" from the beginning of the kernel method, making sure we pass the arguments and the appropriate workitem id for each one. Again, we currently do this sequentially although other policies are possible. One policy that could be considered if the never-rans are contiguous is to resubmit the kernel with an offset applied to each workitemId. For instance, if out of an original range of 10000 we see that workitems 8000-9999 did not run, these could be run as a new range of 2000, as long as each workitem adds 8000 to its raw workitemId.
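The handling described in the list above can be sketched roughly as follows. This is only an illustrative model, not the actual implementation: the names `KernelFallback`, `Status`, `Outcome`, `Hooks`, `resumeInInterpreter` and `runFromStart` are all hypothetical stand-ins for the trampoline/interpreter re-entry and the "javaCall" path, and the choice to handle deopted workitems before never-ran ones is an assumption. The second method shows how the contiguous never-ran case could be detected so that the kernel might be resubmitted with an offset.

```java
import java.util.List;

public class KernelFallback {
    /** Hypothetical per-workitem outcome recorded when the kernel returns. */
    enum Status { FINISHED, DEOPTED, NEVER_RAN }

    /** Hypothetical record; deoptBci is only meaningful when status is DEOPTED. */
    static final class Outcome {
        final int workitemId;
        final Status status;
        final int deoptBci;

        Outcome(int workitemId, Status status, int deoptBci) {
            this.workitemId = workitemId;
            this.status = status;
            this.deoptBci = deoptBci;
        }
    }

    /** Hypothetical hooks standing in for the interpreter re-entry and the javaCall. */
    interface Hooks {
        void resumeInInterpreter(int workitemId, int bci);
        void runFromStart(int workitemId);
    }

    /** Sequential policy: finished workitems need nothing, the rest are re-run one by one. */
    static void handleOutcomes(List<Outcome> outcomes, Hooks hooks) {
        for (Outcome o : outcomes) {
            if (o.status == Status.DEOPTED) {
                // Each deopted workitem may have its own deoptimization point.
                hooks.resumeInInterpreter(o.workitemId, o.deoptBci);
            }
        }
        for (Outcome o : outcomes) {
            if (o.status == Status.NEVER_RAN) {
                hooks.runFromStart(o.workitemId);
            }
        }
    }

    /**
     * Returns {offset, length} if the never-ran workitem ids form a single
     * contiguous run (ids are assumed unique), otherwise null.
     */
    static int[] contiguousNeverRanRange(List<Outcome> outcomes) {
        int first = Integer.MAX_VALUE;
        int last = Integer.MIN_VALUE;
        int count = 0;
        for (Outcome o : outcomes) {
            if (o.status != Status.NEVER_RAN) {
                continue;
            }
            first = Math.min(first, o.workitemId);
            last = Math.max(last, o.workitemId);
            count++;
        }
        if (count == 0 || last - first + 1 != count) {
            return null;
        }
        return new int[] { first, count };
    }
}
```

For the example in the text (range of 10000, workitems 8000-9999 never ran), `contiguousNeverRanRange` would yield an offset of 8000 and a length of 2000, meaning a resubmitted kernel of range 2000 in which each workitem adds 8000 to its raw workitemId.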
GC Considerations
Currently, the normal kernel dispatch runs in "thread in VM" mode and thus does not need to worry about moving oops. However, each time a deopting workitem is run through the interpreter, or each time a never-ran workitem is run, we are back in Java mode, which can cause GCs. So for each saved hsail frame, we need to know which of the saved locations contain oops and make sure those locations are updated in the face of GC. The current hack strategy of copying oops into an Object array supplied by the Java side will be replaced later with an oops_do type of strategy.
...