Tensorflow and Keras

This guide is an adaptation of the README file provided by Bernie Kirby in the directory /usr/local/tensorflow on Artemis.

Important

GPU versions of Tensorflow only work on GPU nodes. You cannot run GPU versions of Tensorflow on any other Artemis node.

Tensorflow and Keras are available on Artemis HPC, but you must install them in a Python virtual environment to use them. We cannot provide these packages as a module because their different versions conflict with each other. Installing them in a Python virtual environment keeps them from conflicting with your existing environment.

Note

Only the Tensorflow packages provided on Artemis will work. Tensorflow downloaded from the internet will not work directly on Artemis. If you would like to use a newer version of Tensorflow, submit a High Performance Computing request for it to be installed.

To create your own Python virtual environment and install Tensorflow and Keras, enter these commands from a login node:

$ module load python/3.5.1
$ virtualenv --system-site-packages myenv
$ source myenv/bin/activate
(myenv) $ module load cuda/8.0.44
(myenv) $ pip install /usr/local/tensorflow/tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl
(myenv) $ pip install keras
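
After the install commands above, it can help to confirm that both packages landed in the active environment. The following is a minimal sketch, not part of the original README: check_pkg is a hypothetical helper name, and it relies only on pip show, which exits with a non-zero status when a package is not installed. It does not import Tensorflow, so it should be safe to run on a login node.

```shell
# Hypothetical helper (not from the Artemis README): report whether a
# package is installed in the currently active Python environment.
# "pip show" exits non-zero when the package cannot be found.
check_pkg() {
    if pip show "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: missing"
    fi
}

# Run these with the virtual environment activated.
check_pkg tensorflow
check_pkg keras
```

Because the environment was created with --system-site-packages, a package installed system-wide would also be reported as installed.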

Alternative Tensorflow versions

Available Tensorflow versions are installed in:

/usr/local/tensorflow

Non-GPU versions are installed in:

/usr/local/tensorflow/nogpu

To install a non-GPU version of Tensorflow, create another virtual environment using the nogpu Tensorflow package:

$ module load python/3.5.1
$ virtualenv --system-site-packages tf-nogpu
$ source tf-nogpu/bin/activate
(tf-nogpu) $ pip install /usr/local/tensorflow/nogpu/tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl

The following table lists the available Tensorflow wheel files and the CUDA version each one requires:

Tensorflow wheel file                              CUDA version
tensorflow-1.0.0rc0-cp35-cp35m-linux_x86_64.whl    7.5.18
tensorflow-1.1.0rc1-cp35-cp35m-linux_x86_64.whl    7.5.18
tensorflow-1.2.0rc2-cp35-cp35m-linux_x86_64.whl    8.0.44
tensorflow-1.3.0rc1-cp35-cp35m-linux_x86_64.whl    8.0.44
tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl       8.0.44
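
The pairing in the table can also be written as a small lookup, which may be convenient when scripting an install. This is only a sketch: cuda_for_wheel is a hypothetical function name, the version numbers are copied from the table, and the module name format cuda/<version> is an assumption for 7.5.18, since only cuda/8.0.44 appears in this guide.

```shell
# Hypothetical helper: print the CUDA version listed in the table for a
# given Tensorflow wheel file. Accepts a bare file name or a full path.
cuda_for_wheel() {
    case "$(basename "$1")" in
        tensorflow-1.0.*|tensorflow-1.1.*)
            echo "7.5.18" ;;
        tensorflow-1.2.*|tensorflow-1.3.*|tensorflow-1.4.*)
            echo "8.0.44" ;;
        *)
            echo "unknown wheel: $1" >&2
            return 1 ;;
    esac
}

wheel=/usr/local/tensorflow/tensorflow-1.4.0-cp35-cp35m-linux_x86_64.whl
# Prints the module name to load (format assumed): cuda/8.0.44
echo "cuda/$(cuda_for_wheel "$wheel")"
```

Wheel files that are not in the table produce an error message and a non-zero exit status rather than a guess.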

Example job script

An example job script for Tensorflow or Keras is:

#!/bin/bash
#PBS -P Project
#PBS -l select=1:ncpus=12:mem=4gb:ngpus=1
#PBS -l walltime=2:00:00

module load python/3.5.1
source myenv/bin/activate
module load cuda/8.0.44

cd "$PBS_O_WORKDIR"
python myscript.py
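
A job script like the one above is normally handed to the scheduler with qsub. The sketch below wraps that step in a small guard so that a mistyped file name fails early. submit_job and the file name tensorflow_job.pbs are hypothetical and used only for illustration; qsub and qstat are the standard PBS commands.

```shell
# Hypothetical helper: submit a PBS job script only if the file exists.
submit_job() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "job script not found: $script" >&2
        return 1
    fi
    qsub "$script"
}

# Example usage (file name is an assumption):
#   submit_job tensorflow_job.pbs
#   qstat -u "$USER"    # list your queued and running jobs
```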