Sumatra JDK build instructions

The Sumatra JDK now produces a build that can offload certain JDK 8 Stream API parallel streams terminating in forEach() to HSA APUs/GPUs, or to the HSAIL Simulator described in a nearby wiki page. Producing the offload-enabled JDK is a two-step process. First, build the Sumatra JDK, which is built the same way as JDK 8. Then use the Sumatra JDK you built as the JAVA_HOME when building a Graal server JVM, which yields a Graal JDK that includes our HSAIL support.
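For readers unfamiliar with the pattern, the following is a minimal sketch of a JDK 8 parallel stream that terminates in forEach(). It only illustrates the general shape of such code; this page does not state which pipelines the Sumatra JDK actually offloads, and the class and method names here (ParallelForEachSketch, addArrays) are hypothetical. On a stock JDK the lambda simply runs on CPU threads.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ParallelForEachSketch {

    // Element-wise sum of two equal-length arrays, written as a parallel
    // IntStream whose terminal operation is forEach(). Each index is
    // handled independently, so the iterations can run in any order.
    static int[] addArrays(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int[] out = new int[a.length];
        IntStream.range(0, out.length)
                 .parallel()
                 .forEach(i -> out[i] = a[i] + b[i]);
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        System.out.println(Arrays.toString(addArrays(a, b)));
    }
}
```

Because every iteration writes a distinct element of out and reads nothing written by another iteration, the result does not depend on how the work is split up.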

To start, clone the Sumatra repository http://hg.openjdk.java.net/sumatra/sumatra-dev/, then build it following the normal JDK 8 build instructions at http://hg.openjdk.java.net/jdk8/jdk8/raw-file/tip/README-builds.html.

Next, clone the Graal repository, which contains the HSAIL backend that generates the offload kernels from the lambda passed to the Stream API forEach call. See the Graal wiki for details.

Use the Sumatra JDK image you built as the JAVA_HOME for building Graal. Note that we have been using the server build of Graal: in this configuration, methods that run on the CPU are compiled by the -server compiler, and Graal is used only for HSAIL compilation. For example:

$ export JAVA_HOME=/path/to/sumatra-dev/build/linux-x86_64-normal-server-release/images/j2sdk-image/

$ ./mx.sh --vmbuild product --vm server build

This builds a Graal-enabled JDK that you can use to run HSAIL kernels either inside the Graal mx system or standalone. To see a simple example of an HSAIL kernel running in mx, run mx unittest as shown:

$ ./mx.sh --vm server unittest -XX:+TraceGPUInteraction -XX:+GPUOffload -G:Log=CodeGen hsail.test.IntAddTest
...
[HSAIL] library is libokra_x86_64.so
[HSAIL] using _OKRA_SIM_LIB_PATH_=/tmp/okraresource.dir_2488167353114811077/libokra_x86_64.so
[GPU] registered initialization of Okra (total initialized: 2)
[CUDA] Ptx::get_execute_kernel_from_vm_address
JUnit version 4.8
.[thread:1] scope:
[thread:1] scope: GraalCompiler
    [thread:1] scope: GraalCompiler.CodeGen
    Nothing to do here
    Nothing to do here
    Nothing to do here
    version 0:95: $full : $large;
// static method HotSpotMethod<IntAddTest.run(int[], int[], int[], int)>
kernel &run (
   align 8 kernarg_u64 %_arg0,
   align 8 kernarg_u64 %_arg1,
   align 8 kernarg_u64 %_arg2
   ) {
   ld_kernarg_u64 $d0, [%_arg0];
   ld_kernarg_u64 $d1, [%_arg1];
   ld_kernarg_u64 $d2, [%_arg2];
   workitemabsid_u32 $s0, 0;

@L0:
   cmp_eq_b1_u64 $c0, $d0, 0; // null test
   cbr $c0, @L1;
@L2:
   ld_global_s32 $s1, [$d0 + 12];
   cmp_ge_b1_u32 $c0, $s0, $s1;
   cbr $c0, @L12;
@L3:
   cmp_eq_b1_u64 $c0, $d2, 0; // null test
   cbr $c0, @L4;
@L5:
   ld_global_s32 $s1, [$d2 + 12];
   cmp_ge_b1_u32 $c0, $s0, $s1;
   cbr $c0, @L11;
@L6:
   cmp_eq_b1_u64 $c0, $d1, 0; // null test
   cbr $c0, @L7;
@L8:
   ld_global_s32 $s1, [$d1 + 12];
   cmp_ge_b1_u32 $c0, $s0, $s1;
   cbr $c0, @L10;
@L9:
   cvt_s64_s32 $d3, $s0;
   mul_s64 $d3, $d3, 4;
   add_u64 $d1, $d1, $d3;
   ld_global_s32 $s1, [$d1 + 16];
   cvt_s64_s32 $d1, $s0;
   mul_s64 $d1, $d1, 4;
   add_u64 $d2, $d2, $d1;
   ld_global_s32 $s2, [$d2 + 16];
   add_s32 $s2, $s2, $s1;
   cvt_s64_s32 $d1, $s0;
   mul_s64 $d1, $d1, 4;
   add_u64 $d0, $d0, $d1;
   st_global_s32 $s2, [$d0 + 16];
   ret;
@L1:
   mov_b32 $s0, -7691;
@L13:
   ret;
@L4:
   mov_b32 $s0, -6411;
   brn @L13;
@L10:
   mov_b32 $s0, -5403;
   brn @L13;
@L7:
   mov_b32 $s0, -4875;
   brn @L13;
@L12:
   mov_b32 $s0, -8219;
   brn @L13;
@L11:
   mov_b32 $s0, -6939;
   brn @L13;
};

[HSAIL] heap=0x00007f95b8019cc0
[HSAIL] base=0x05a00000, capacity=210763776
External method:com.oracle.graal.compiler.hsail.test.IntAddTest.run([I[I[II)V
installCode0: ExternalCompilationResult
[HSAIL] sig:([I[I[II)V args length=3, _parameter_count=4
[HSAIL] static method
[HSAIL] HSAILKernelArguments::do_array, _index=0, 0x82b21970, is a [I
[HSAIL] HSAILKernelArguments::do_array, _index=1, 0x82b477f0, is a [I
[HSAIL] HSAILKernelArguments::do_array, _index=2, 0x82b479e0, is a [I
[HSAIL] HSAILKernelArguments::not pushing trailing int

Time: 0.208

OK (1 test)

Note that you must pass the extra option -XX:+GPUOffload to enable offloading. Add -XX:+TraceGPUInteraction to see extra messages about GPU initialization and related activity.
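To help read the HSAIL listing above, here is a sketch of the Java method it appears to be compiled from. It is reconstructed from the signature in the log (IntAddTest.run(int[], int[], int[], int)) and from the generated code, not copied from the Graal test sources, so the class name, the parameter names, and the dispatch helper are assumptions. The kernel takes only the three arrays as arguments; the trailing int is not pushed and is instead supplied by workitemabsid, which is modeled here by a plain loop.

```java
import java.util.Arrays;

public class IntAddKernelSketch {

    // Per-work-item body. In the listing, the loads at [$d1 + 16] and
    // [$d2 + 16] feed add_s32, and the result is stored at [$d0 + 16].
    // The null tests and the length compares before @L9 correspond to the
    // implicit null and bounds checks Java performs on each array access.
    static void run(int[] out, int[] ina, int[] inb, int gid) {
        out[gid] = ina[gid] + inb[gid];
    }

    // Hypothetical CPU stand-in for the GPU dispatch: one call per
    // work-item id, which takes the place of the trailing int argument.
    static void dispatch(int[] out, int[] ina, int[] inb) {
        for (int gid = 0; gid < out.length; gid++) {
            run(out, ina, inb, gid);
        }
    }

    public static void main(String[] args) {
        int[] ina = {1, 2, 3};
        int[] inb = {100, 200, 300};
        int[] out = new int[3];
        dispatch(out, ina, inb);
        System.out.println(Arrays.toString(out));
    }
}
```

The blocks after the first ret in the listing (@L1, @L4, @L7, @L10, @L11, @L12) are the targets of the failed null and bounds checks; each one loads a distinct negative value into $s0 before returning.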