Table 7 General recommendation for the generic HPC system.
From: Accelerating next generation sequencing data analysis with system level optimizations
Parameter name | Recommended value | Expected performance improvement (%) | Remarks |
---|---|---|---|
Java heap size | Up to 50% of main memory | N/A | GATK execution may fail due to insufficient memory when the heap size is too small |
Parallel garbage collection | `ParallelGC` and `ParallelGCThreads` = total number of cores | Up to 28% | N/A |
CMS garbage collection | `CMSParallelRemarkEnabled` | Up to 28% | Useful on modern HPC architectures |
Java 1.8 | N/A | Up to 52% | Do not use `java.io.tmpdir` |
CPU frequency scaling | Performance mode | Up to 45% | By default, modern HPC architectures use on-demand mode |
Kernel shared memory | Up to 50% of main memory | Up to 48% | N/A |
PairHMM library with heap memory | `java.library.path` = path to the VectorPairHMM library | Up to 145% | Use architecture-specific libraries for GATK HaplotypeCaller |
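
The JVM-related rows of the table can be turned into command-line options. The following is a minimal sketch, not the authors' tooling: the helper name `jvm_options`, its parameters, and the choice to express the heap in mebibytes are assumptions made for illustration. It also assumes that the CMS row implies enabling the CMS collector itself, and it treats the parallel and CMS collectors as alternatives, since the JVM accepts only one collector at a time. The flags used are standard HotSpot options available in Java 1.8.

```python
def jvm_options(total_mem_bytes, cores, gc="parallel",
                pairhmm_lib_dir=None, heap_fraction=0.5):
    """Build JVM options from the table's recommendations (illustrative helper).

    total_mem_bytes : main memory of the node, in bytes
    cores           : total number of CPU cores
    gc              : "parallel" or "cms" (treated here as alternatives)
    pairhmm_lib_dir : directory holding the VectorPairHMM library, if any
    heap_fraction   : share of main memory given to the heap (table: up to 50%)
    """
    if not 0 < heap_fraction <= 0.5:
        raise ValueError("heap_fraction should be in (0, 0.5] per the table")

    # Java heap size: up to 50% of main memory, in whole mebibytes.
    heap_mib = int(total_mem_bytes * heap_fraction) // (1024 * 1024)
    opts = [f"-Xmx{heap_mib}m"]

    if gc == "parallel":
        # Parallel GC with one GC thread per core.
        opts += ["-XX:+UseParallelGC", f"-XX:ParallelGCThreads={cores}"]
    elif gc == "cms":
        # Assumption: the CMS row means enabling CMS plus parallel remark.
        opts += ["-XX:+UseConcMarkSweepGC", "-XX:+CMSParallelRemarkEnabled"]
    else:
        raise ValueError(f"unknown gc mode: {gc!r}")

    if pairhmm_lib_dir is not None:
        # Point the JVM at the architecture-specific PairHMM library.
        opts.append(f"-Djava.library.path={pairhmm_lib_dir}")

    return opts
```

For a node with 64 GiB of memory and 16 cores, `jvm_options(64 * 1024**3, 16)` returns `-Xmx32768m`, `-XX:+UseParallelGC`, and `-XX:ParallelGCThreads=16`. The CPU frequency and kernel shared memory rows are operating-system settings and are outside the scope of this sketch.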