
Java

Java is a programming language developed by Oracle, which provides the Java runtime and software development kit for many computer platforms.

The CentOS Linux distribution ships with Java, which we recommend trying first. CentOS 6 comes with Java 1.7 and CentOS 7 with Java 1.8. Additionally, a possibly more recent Java development kit (JDK) from Oracle is installed as a jdk module. To see which versions are available, run module spider jdk.

Users can install and use GUI-based Java development tools such as NetBeans or Eclipse to compile and run Java programs on our clusters. For simplicity, however, we recommend developing Java code locally on the user's laptop or desktop and using the command line tools on CHPC systems. The command line tools are necessary for batch job submission.

The basic steps of deploying a Java program are compiling the .java source code into .class bytecode and then launching it in the Java virtual machine, as described at this Oracle page. Alternatively, a whole Java program may be supplied in a Java archive (jar) file.

Java source code compilation

To compile Java source, use the javac command. This command takes a number of options, several of which are especially useful. In particular, the -cp option defines a search path for additional Java libraries that the code may use (often packaged in a Java archive, or jar, file), and the -d option specifies the directory in which to put the generated class files. For example:

javac -cp /some_path_to_Java_library/lib/otherlib.jar -d bin src/MyJavaCode.java

will build a bin/MyJavaCode.class file from the MyJavaCode.java source file located in the src directory, with otherlib.jar available on the class path during compilation.
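For readers who want something concrete to compile, here is a minimal sketch of what src/MyJavaCode.java could contain. The class body is purely illustrative and hypothetical. It does not use anything from otherlib.jar, so the -cp option is not strictly needed for this particular file.

```java
// src/MyJavaCode.java (hypothetical example source file)
public class MyJavaCode {
    // Builds the message printed by main; kept separate so it is easy to test.
    static String greeting(String name) {
        return "Hello from " + name;
    }

    public static void main(String[] args) {
        // Use the first command line argument if given, otherwise a default name.
        String name = args.length > 0 ? args[0] : "MyJavaCode";
        System.out.println(greeting(name));
    }
}
```

Because the class is in the default package, the compiled MyJavaCode.class lands directly in the directory given to -d.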

Java code execution

Once the Java code is compiled into the class file(s), it is executed with the java command. As with javac, the -cp flag defines the search path for class files and libraries. Java programs sometimes call libraries written in other languages, such as C/C++ or Fortran. The search path for those native libraries is defined by the -Djava.library.path flag. For example:

java -cp /some_path_to_Java_library/lib/otherlib.jar:bin -Djava.library.path=/path_to_dynamic_libraries/x86-64_linux MyJavaCode
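When a class or native library is not found at run time, it can help to print the search paths that the Java virtual machine actually received. The sketch below reads the standard java.class.path and java.library.path system properties. The class name PathCheck and its helper method are hypothetical and only serve as an illustration.

```java
import java.io.File;
import java.util.regex.Pattern;

public class PathCheck {
    // Splits a path list on the platform path separator (":" on Linux).
    static String[] entries(String pathList) {
        if (pathList == null || pathList.isEmpty()) {
            return new String[0];
        }
        return pathList.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        // Search path for class files and jar files, as set by -cp.
        for (String entry : entries(System.getProperty("java.class.path"))) {
            System.out.println("class path entry:   " + entry);
        }
        // Search path for native libraries, as set by -Djava.library.path.
        for (String entry : entries(System.getProperty("java.library.path"))) {
            System.out.println("library path entry: " + entry);
        }
    }
}
```

Running it with the same -cp and -Djava.library.path flags as the real program shows one line per path entry, which makes typos in long paths easier to spot.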

Running Java in a batch script

Running Java in a batch script can be as simple as calling the java command above. However, note that Java internally supports only thread-based parallelization, so it is not efficient to run on more than one node (unless one uses distributed-parallel approaches such as JPPF).

As all our cluster nodes contain multiple CPU cores, it is important to assess the parallelization of the Java code to be run. There are a number of ways Java can control and limit thread creation. If you have not written the Java code, the easiest approach is to run the code for a short period of time on a cluster interactive node and monitor its level of thread parallelism with the top command. The %CPU column shows how much CPU is used by the java process. For multi-threaded programs we want that number to be larger than 100% (100% = one fully used CPU core) and approaching the node's core count times 100 (e.g. on a 16 core node, getting up to 1600).

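If the Java code uses thread pools and reads its pool size from a system property, we can also try to limit the number of threads with a runtime option such as -Dthread.pool.size, and assess the parallel scaling by changing the pool size. Note that a -D option only sets a system property, so it has an effect only if the program (or a library it uses) actually reads that property.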
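As an illustration of how a program might honor such an option, the sketch below sizes a fixed thread pool from the thread.pool.size system property and falls back to the number of available cores when the property is not set. This is an assumption about how an application could be written, not a description of any particular code. The class name PoolSizeDemo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSizeDemo {
    // Reads -Dthread.pool.size=N if present; otherwise uses the core count.
    static int poolSize() {
        int cores = Runtime.getRuntime().availableProcessors();
        Integer requested = Integer.getInteger("thread.pool.size");
        return (requested != null && requested > 0) ? requested : cores;
    }

    // Computes 1^2 + 2^2 + ... + nTasks^2, one task per term, on nThreads threads.
    static long sumOfSquares(int nThreads, int nTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= nTasks; i++) {
                final long n = i;
                Callable<Long> task = () -> n * n;
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // waits for the task to finish
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = poolSize();
        System.out.println("Using " + threads + " threads");
        System.out.println("Sum of squares: " + sumOfSquares(threads, 1000));
    }
}
```

Launching this with, for example, java -Dthread.pool.size=4 PoolSizeDemo and then with other values is one way to compare run times across pool sizes.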

Once we have verified that the Java code runs reasonably in parallel, we can launch it in a SLURM script, e.g.

#!/bin/tcsh
#SBATCH --time=1:00:00 # walltime, abbreviated by -t
#SBATCH --nodes=1 # number of cluster nodes, abbreviated by -N
#SBATCH -o slurm-%j.out-%N # name of the stdout, using the job number (%j) and the first node (%N)
#SBATCH -e slurm-%j.err-%N # name of the stderr, using job and first node values
#SBATCH --ntasks=1 # number of SLURM tasks, abbreviated by -n
# additional information for allocated clusters
#SBATCH --account=owner-guest
#SBATCH --partition=ember-guest

cd directory_with_Java_code

ml jdk/1.8.0_112

java -cp /some_path_to_Java_library/lib/otherlib.jar:bin -Djava.library.path=/path_to_dynamic_libraries/x86-64_linux MyJavaCode

Note that this example requests one task on one node. Without node sharing, the whole node is allocated to the job, and the Java program can then use all of the node's CPU cores internally.
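To confirm how many cores a job actually sees, a Java program can compare the core count reported by the Java virtual machine with the SLURM_CPUS_ON_NODE environment variable, which SLURM sets inside a job. The following is a minimal sketch under that assumption. The class name CoreCount and its helper method are hypothetical.

```java
public class CoreCount {
    // Parses a core count from an environment value; returns fallback if the
    // value is missing, not a number, or not positive.
    static int coresFrom(String envValue, int fallback) {
        if (envValue == null) {
            return fallback;
        }
        try {
            int cores = Integer.parseInt(envValue.trim());
            return cores > 0 ? cores : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Core count as seen by the Java virtual machine.
        int jvmCores = Runtime.getRuntime().availableProcessors();
        // Core count reported by SLURM; absent when run outside of a job.
        int slurmCores = coresFrom(System.getenv("SLURM_CPUS_ON_NODE"), jvmCores);
        System.out.println("JVM reports " + jvmCores + " cores");
        System.out.println("SLURM reports " + slurmCores + " cores on this node");
    }
}
```

If the two numbers differ, the smaller one is usually the safer choice for the thread count.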

Last Updated: 7/5/23