Tensorflow

This guide is an adaptation of the README files provided by Bernie Kirby in the directories /usr/local/tensorflow and /usr/local/tensorflow/v100 on Artemis.

Important

GPU versions of Tensorflow only work on GPU nodes. You cannot run GPU versions of Tensorflow on any other Artemis node.

Tensorflow and Keras are available on Artemis HPC, but you need to install them in a Python virtual environment to use them. We cannot make these software packages available in a module because the different versions conflict with each other. Installing them in a Python virtual environment ensures that these packages will not conflict with your existing environment.

Note

Only the Tensorflow packages on Artemis will work. Tensorflow downloaded from the internet will not work directly on Artemis. If you would like to use a new version of Tensorflow, submit a High Performance Computing request for it to be installed.

To create your own Python virtual environment and install Tensorflow, enter these commands from a login node:

$ module load python/3.5.1
$ virtualenv --system-site-packages tf
$ source tf/bin/activate
(tf) $ module load cuda/9.1.85
(tf) $ pip install /usr/local/tensorflow/tensorflow-1.7.0rc1-cp35-cp35m-linux_x86_64.whl

The virtual environment here is called “tf”, but you can name it whatever you want.
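
After installing, it can help to confirm that the virtual environment is active and that Tensorflow imports. The following is a minimal sketch and not part of the original README: the function name check_tf_env is hypothetical, and it assumes that "python" on your PATH is the interpreter from the active virtual environment. Because GPU versions of Tensorflow only work on GPU nodes, the import step for a GPU build should be run on a GPU node (for example, inside a job) rather than on a login node.

```shell
# Hypothetical helper (not provided on Artemis): report whether a virtual
# environment is active and, if so, print the Tensorflow version it sees.
check_tf_env() {
    # virtualenv's activate script sets VIRTUAL_ENV; if it is empty,
    # no environment has been activated in this shell.
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "no virtual environment is active" >&2
        return 1
    fi
    # Assumption: "python" resolves to the interpreter inside the environment.
    python -c 'import tensorflow as tf; print(tf.__version__)'
}
```

With the "tf" environment activated, running check_tf_env should print the installed Tensorflow version, or an import error if the wheel did not install correctly.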

Alternative Tensorflow versions

Available GPU versions of Tensorflow are installed in:

/usr/local/tensorflow/v100

Non-GPU versions are installed in:

/usr/local/tensorflow/nogpu

To install a non-GPU version of Tensorflow, create another virtual environment using the nogpu Tensorflow package:

$ module load python/3.5.1
$ virtualenv --system-site-packages tf-nogpu
$ source tf-nogpu/bin/activate
(tf-nogpu) $ pip install /usr/local/tensorflow/nogpu/tensorflow-1.2.0rc2-cp35-cp35m-linux_x86_64.whl

The following table lists Tensorflow wheel files and their corresponding CUDA versions:

Tensorflow wheel file                           CUDA version
tensorflow-1.5.0-cp27-cp27m-linux_x86_64.whl    9.1.85
tensorflow-1.5.0-cp35-cp35m-linux_x86_64.whl    9.1.85
tensorflow-1.6.0-cp35-cp35m-linux_x86_64.whl    9.1.85
tensorflow-1.7.0-cp27-cp27m-linux_x86_64.whl    9.1.85
tensorflow-1.7.0-cp35-cp35m-linux_x86_64.whl    9.1.85
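
The cp27 and cp35 parts of each wheel file name are Python tags: a cp27 wheel is built for CPython 2.7 and a cp35 wheel for CPython 3.5, so the Python module you load needs to match the wheel you install. As an illustration only, the sketch below pulls the tag out of a file name and converts it to a version number. The function names wheel_python_tag and wheel_python_version are hypothetical, and the sketch assumes the file name has no optional build tag, which holds for the names listed above.

```shell
# Hypothetical helpers for reading the Python tag from a wheel file name.
# Assumed name layout: name-version-pythontag-abitag-platform.whl

wheel_python_tag() {
    # Drop any leading directory, then take the third hyphen-separated field.
    basename "$1" | cut -d- -f3
}

wheel_python_version() {
    tag=$(wheel_python_tag "$1")
    case "$tag" in
        cp[0-9][0-9]*)
            digits=${tag#cp}      # "cp35" -> "35"
            minor=${digits#?}     # "35"   -> "5"
            major=${digits%"$minor"}  # "35" -> "3"
            printf '%s.%s\n' "$major" "$minor"
            ;;
        *)
            echo "unrecognised Python tag: $tag" >&2
            return 1
            ;;
    esac
}
```

For example, wheel_python_version tensorflow-1.7.0-cp35-cp35m-linux_x86_64.whl prints 3.5, which matches the python/3.5.1 module used in this guide.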

Example job script

An example job script for Tensorflow is:

#!/bin/bash
#PBS -P Project
#PBS -l select=1:ncpus=4:mem=4gb:ngpus=1
#PBS -l walltime=2:00:00

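# Load Python and activate the "tf" virtual environment created earlier
module load python/3.5.1
source tf/bin/activate
# Load the CUDA and MPI modules
module load cuda/9.1.85
module load openmpi-gcc/3.0.0-cuda

# Run the script from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
python myscript.py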
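
The job script above activates the environment through the relative path tf/bin/activate, so it depends on the directory the job starts in (PBS jobs normally start in your home directory). If your environment lives elsewhere, one option is to activate it through a full path and stop the job early when it is missing. The following is a minimal sketch under that assumption; the function name activate_or_fail and the example location "$HOME/tf" are hypothetical.

```shell
# Hypothetical helper: activate a virtual environment only if it exists.
activate_or_fail() {
    env_dir="$1"
    if [ ! -f "$env_dir/bin/activate" ]; then
        echo "virtual environment not found: $env_dir" >&2
        return 1
    fi
    # "." is the portable spelling of "source".
    . "$env_dir/bin/activate"
}

# Possible use inside a job script (the path is an assumption):
#   activate_or_fail "$HOME/tf" || exit 1
```

Failing early in this way avoids spending queue time on a job that would otherwise stop at the first Tensorflow import.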