GPUs on the OSG
For a while, we have heard of the need for GPUs on the Open Science Grid. Luckily, HCC hosts numerous GPU resources that we are willing to share. With today's release of HTCondor 8.2, we wanted to integrate GPU resources transparently into the OSG.
Submission
The OSG Glidein Factories have a new entry point, CMS_T3_US_Omaha_tusker_gpu, which VO Frontends can use to submit GPU jobs. Email the glidein factory operators to enable GPU resources for your VO.
Job Structure
We tested submitting a binary compiled with CUDA 5.0 to Tusker. It required a wrapper script to configure the environment, and the CUDA library had to be transferred with the job. Details and example files are on the gist.
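To give a rough idea of what such a wrapper does, here is a minimal sketch. It is not the script from the gist. It assumes the CUDA runtime library was transferred into the job's working directory alongside the binary, and that the payload command is passed to the wrapper as its arguments. The function name `prepend_library_path` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper sketch; the names here are illustrative and are
# not taken from the original gist.

# Put a directory at the front of LD_LIBRARY_PATH without leaving a
# stray colon when the variable starts out empty.
prepend_library_path() {
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$1:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$1"
    fi
    export LD_LIBRARY_PATH
}

# Assumption: the CUDA library was transferred with the job, so it sits
# in the job's working directory next to the binary.
prepend_library_path "$PWD"

# Run the payload given on the command line, if there is one.
if [ "$#" -gt 0 ]; then
    "$@"
fi
```

Saved as, say, `wrapper.sh`, it would be invoked as `./wrapper.sh ./my_gpu_binary arg1 arg2`, where `my_gpu_binary` stands in for your own executable.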
Lessons Learned
- If a job matches more than one HTCondor-CE route, the router will round-robin between the routes. Therefore, if you want specific jobs to go to a specific route, you have to modify all routes so that the other routes no longer match those jobs.
- Grid jobs do not source the scripts in /etc/profile.d/ on the worker node. I had to manually source those files in the pbs_local_submit_attributes.sh file to use the module command and load the CUDA environment.
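For the second point, the sourcing step itself can be sketched as below. This is only an illustration: the function name `source_profile_dir` is made up, and how the step is wired into pbs_local_submit_attributes.sh depends on your site's setup. The module name in the final comment is also a placeholder.

```shell
# Source every readable *.sh file in a directory, roughly what a login
# shell does with /etc/profile.d/. Hypothetical helper for illustration.
source_profile_dir() {
    for profile_script in "$1"/*.sh; do
        # If the glob matches nothing it stays literal; the -r test
        # skips that case as well as unreadable files.
        if [ -r "$profile_script" ]; then
            . "$profile_script"
        fi
    done
}

# Typical use, before any `module` command (module name is a placeholder):
#   source_profile_dir /etc/profile.d
#   module load cuda
```

The helper takes the directory as an argument so it can be tried against a scratch directory before pointing it at /etc/profile.d on a worker node.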