Cruft
Background
Link | Code | Version | Machine | Date |
---|---|---|---|---|
LLNL website | git repo | Kyle Spafford fork | Keeneland | March 2012 |
These instructions can also be used for CoMD.
Building Cruft
For OpenCL:
export OPENCL_INCLUDE_DIR=<path to OpenCL include dir>
Modify CMakeLists.txt and add these lines:
set (CMAKE_CXX_COMPILER tau_cxx.sh)
set (CMAKE_C_COMPILER tau_cc.sh)
Then issue
cmake .
You can safely proceed when you encounter reversions.
Selective instrumentation of loops (save this as select.tau in the build directory):
BEGIN_INSTRUMENT_SECTION
loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"
END_INSTRUMENT_SECTION
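One way to create the selective instrumentation file is a shell here-document. This is a minimal sketch; the file name select.tau and its location in the current directory are taken from the TAU_OPTIONS setting used in these instructions.

```shell
# Write the loop-instrumentation rules to select.tau in the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside the block.
cat > select.tau <<'EOF'
BEGIN_INSTRUMENT_SECTION
loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"
END_INSTRUMENT_SECTION
EOF

# Sanity check: count the loop rules that were written (prints 2).
grep -c '^loops ' select.tau
```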
For the OpenCL binary, edit src-ocl/eam_kernels.c and move the following section above the line typedef CL_REAL_T real_t;
#if defined(cl_khr_fp64) // Khronos extension available?
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#elif defined(cl_amd_fp64) // AMD extension available?
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#endif
Then set:
export TAU_OPTIONS="-optShared -optVerbose -optTauSelectFile=`pwd`/select.tau"
export TAU_MAKEFILE=<path to TAU>/x86_64/lib/Makefile.tau-icpc-pdt
make
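Before running make, it can help to confirm that the settings above point at real files. The following is a sketch only: check_tau_env is a hypothetical helper name, not part of TAU, and it assumes select.tau lives in the current directory as in the TAU_OPTIONS line.

```shell
# check_tau_env is a hypothetical helper, not part of TAU.
# It reports problems with the build settings and returns non-zero if any are found.
check_tau_env() {
    rc=0
    if [ -z "${TAU_MAKEFILE:-}" ]; then
        echo "TAU_MAKEFILE is not set" >&2
        rc=1
    elif [ ! -f "$TAU_MAKEFILE" ]; then
        echo "TAU_MAKEFILE does not exist: $TAU_MAKEFILE" >&2
        rc=1
    fi
    # The selective instrumentation file is expected in the current directory.
    if [ ! -f select.tau ]; then
        echo "select.tau not found in $(pwd)" >&2
        rc=1
    fi
    # The compiler wrappers named in CMakeLists.txt must be on PATH.
    for tool in tau_cc.sh tau_cxx.sh; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

A typical use would be to run check_tau_env && make, so the build only starts when every check passes.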
Running Cruft
./cruft -p ag -e -f data/8k.inp.gz
or
./cruft -f data/8k.inp.gz
And for the OpenCL-accelerated version:
tau_exec -T serial -opencl ./cruftOCL -p ag -e -f data/8k.inp.gz
tau_exec -T serial -opencl ./cruftOCL -f data/8k.inp.gz
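After a run, TAU leaves profile.* files in the working directory; these can be bundled into a single packed file such as the .ppk files listed under Performance Data. The sketch below assumes paraprof is on PATH; pack_profiles is a hypothetical helper name, not part of TAU.

```shell
# pack_profiles is a hypothetical helper, not part of TAU.
# Usage: pack_profiles <name>   (writes <name>.ppk from profile.* in the current directory)
pack_profiles() {
    name=${1:-}
    if [ -z "$name" ]; then
        echo "usage: pack_profiles <name>" >&2
        return 2
    fi
    # TAU names its profile files profile.<node>.<context>.<thread>.
    if ! ls profile.* >/dev/null 2>&1; then
        echo "no profile.* files in $(pwd); run the application first" >&2
        return 1
    fi
    if ! command -v paraprof >/dev/null 2>&1; then
        echo "paraprof not found on PATH" >&2
        return 1
    fi
    # paraprof --pack bundles the profiles into one packed file.
    paraprof --pack "$name.ppk"
}
```

For example, pack_profiles CruftOCL-EAM after an EAM run would produce CruftOCL-EAM.ppk, which can then be opened with paraprof.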
Performance Data
EAM method:
First, the serial version of Cruft shows that two loops in eam.c consume most of the time.
In comparison, in the OpenCL-accelerated version two kernels dominate the runtime.
One thing you can check with an OpenCL application is the time spent in the command queue. Here is the table for each kernel:
Profile Data:
File:Cruft-EAM.ppk, File:CruftOCL-EAM.ppk
LJ method:
First, the serial version of Cruft shows that a single loop dominates the runtime.
In comparison, in the OpenCL-accelerated version the LJ_Force kernel dominates the runtime.
Once again, here is the time spent in the queue for this kernel.
Profile Data: