Working with GPU packages
The Anaconda Distribution includes several packages that use the GPU as an accelerator to increase performance, sometimes by a factor of five or more. These packages can dramatically improve machine learning and simulation use cases, especially deep learning.
While both AMD and NVIDIA are major vendors of GPUs, NVIDIA is currently the most common GPU vendor for machine learning and cloud computing. The information on this page applies only to NVIDIA GPUs. As of August 27th, 2018, experimental AMD GPU packages for Anaconda are in progress but not yet officially supported.
GPU acceleration requires the author of a project such as TensorFlow to implement GPU-specific code paths for algorithms that can be executed on the GPU. A GPU-accelerated project will call out to NVIDIA-specific libraries for standard algorithms or use the NVIDIA GPU compiler to compile custom GPU code. Only the algorithms specifically modified by the project author for GPU usage will be accelerated, and the rest of the project will still run on the CPU.
For most packages, GPU support is either a compile-time or run-time choice, allowing a variant of the package to be available for CPU-only usage. When GPU support is a compile-time choice, Anaconda will typically need to build two versions of the package, to allow the user to choose between the “regular” version of the project that runs on CPU only and the “GPU-enabled” version of the project that runs on GPU.
Due to the different ways that CUDA support is enabled by project authors, there is no universal way to detect GPU support in a package. Many GPU-enabled packages have a dependency on the cudatoolkit package. Other packages, such as Numba, do not have a cudatoolkit dependency because they can be used without the GPU.
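As a rough heuristic for the cudatoolkit case described above, the sketch below scans the JSON records that conda keeps in an environment's conda-meta directory and reports packages that declare a cudatoolkit dependency. The function names are hypothetical, and the approach assumes each record carries a "depends" list of match-spec strings. Packages such as Numba that use the GPU without declaring the dependency will not be found this way.

```python
import json
import sys
from pathlib import Path


def needs_cudatoolkit(record):
    """Return True if a conda package record lists cudatoolkit in "depends"."""
    # Each entry looks like "cudatoolkit >=9.0,<9.1"; the first token is the name.
    return any(dep.split()[:1] == ["cudatoolkit"] for dep in record.get("depends", []))


def cuda_packages(prefix=sys.prefix):
    """List installed packages in a conda environment that depend on cudatoolkit."""
    meta_dir = Path(prefix) / "conda-meta"
    names = []
    # A missing directory simply yields no files, so non-conda prefixes return [].
    for path in sorted(meta_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        if needs_cudatoolkit(record):
            names.append(record.get("name", path.stem))
    return names


if __name__ == "__main__":
    print(cuda_packages())
```

An empty result does not prove that no package in the environment can use the GPU; it only means that none declares the dependency.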
NVIDIA released the CUDA API for GPU programming in 2006, and all new NVIDIA GPUs released since that date have been CUDA-capable regardless of market. Although any NVIDIA GPU released in the last 5 years will technically work with Anaconda, these are the best choices for machine learning and specifically model training use cases:
- Tesla P100 or V100
- Titan Xp or V
- GeForce 1080 or 1080 Ti
- Various recent Quadro models
Deployed models do not always require a GPU. When a GPU is required for a deployed model, there are other Tesla GPU models that are more optimized for inference than training, such as the Tesla M4, M40, P4 and P40.
Cloud and on-premise data center deployments require Tesla cards, whereas the GeForce, Quadro, and Titan options are suitable for use in workstations.
Most users will have an Intel or AMD 64-bit CPU. We recommend having at least two to four times more CPU memory than GPU memory, and at least 4 CPU cores to support data preparation before model training. There are a limited number of Anaconda packages with GPU support for IBM POWER 8/9 systems as well.
The best performance and user experience for CUDA is on Linux systems, and Windows is also supported. No Apple computers have been released with an NVIDIA GPU since 2014, so they generally lack the memory for machine learning applications and only have support for Numba on the GPU.
Anaconda requires that the user has installed a recent NVIDIA driver that meets the version requirements in the table below. Anaconda does not require the installation of the CUDA SDK.
Ubuntu and some other Linux distributions ship with a third party open source driver for NVIDIA GPUs called Nouveau. CUDA requires replacing the Nouveau driver with the official closed source NVIDIA driver.
All other CUDA libraries are supplied as conda packages.
GPU-enabled packages are built against a specific version of CUDA. Currently supported versions include CUDA 8, 9.0 and 9.2. The NVIDIA drivers are designed to be backward compatible with older CUDA versions, so a system with NVIDIA driver version 384.81 can support CUDA 9.0 packages and earlier. As a result, a user who is not running the latest NVIDIA driver may need to manually pick a particular CUDA version by selecting the version of the cudatoolkit package in their environment. To select a cudatoolkit version, add a selector such as cudatoolkit=8.0 to the version specification.
Required NVIDIA driver versions, excerpted from the NVIDIA CUDA Toolkit Release Notes:
| CUDA Version | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
|---|---|---|
| CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
| CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
| CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
| CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |
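The driver table above can be turned into a small lookup. The following sketch copies the minimum versions from the table and returns the CUDA releases that a given driver version satisfies. The names MIN_DRIVER and supported_cuda are illustrative only, and the comparison assumes driver versions are dotted integers.

```python
# Minimum driver versions, copied from the table above.
MIN_DRIVER = {
    "CUDA 8.0 (8.0.61 GA2)": {"linux": "375.26", "windows": "376.51"},
    "CUDA 9.0 (9.0.76)": {"linux": "384.81", "windows": "385.54"},
    "CUDA 9.2 (9.2.88)": {"linux": "396.26", "windows": "397.44"},
    "CUDA 9.2 (9.2.148 Update 1)": {"linux": "396.37", "windows": "398.26"},
}


def parse_version(text):
    """Turn "384.81" into (384, 81) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def supported_cuda(driver_version, platform="linux"):
    """Return the CUDA releases whose minimum driver requirement is met."""
    installed = parse_version(driver_version)
    return [
        cuda
        for cuda, minimum in MIN_DRIVER.items()
        if installed >= parse_version(minimum[platform])
    ]


if __name__ == "__main__":
    print(supported_cuda("384.81", "linux"))
```

Once a compatible release is known, it can be pinned with a selector such as cudatoolkit=9.0 in the install command.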
Sometimes specific GPU hardware generations have a minimum CUDA version. As of August 27th, 2018, the only relevant constraint is that the Tesla V100 and Titan V (using the “Volta” GPU architecture) require CUDA 9 or later.
TensorFlow is a general machine learning library, but is most popular for deep learning applications. There are three supported variants of the tensorflow package in Anaconda, one of which is the NVIDIA GPU version. This is selected by installing the meta-package tensorflow-gpu:
conda install tensorflow-gpu
Other packages such as Keras depend on the generic tensorflow package name and will use whatever version of TensorFlow is installed. This makes it easy to switch between variants in an environment.
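Because the installed variant is not obvious from the generic package name, it can help to ask TensorFlow directly whether it sees a GPU. The sketch below is one way to do that, not an official recipe. The helper name is hypothetical. It tries tf.config.list_physical_devices, which is available in TensorFlow 2.1 and later, and falls back to tf.test.is_gpu_available on older releases.

```python
import importlib.util


def tensorflow_sees_gpu():
    """Return True or False for GPU visibility, or None if TensorFlow is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    try:
        # Available in TensorFlow 2.1 and later.
        return len(tf.config.list_physical_devices("GPU")) > 0
    except AttributeError:
        # Older releases, including the 1.x series, expose this helper instead.
        return bool(tf.test.is_gpu_available())


if __name__ == "__main__":
    print("TensorFlow GPU visible:", tensorflow_sees_gpu())
```

A result of False with tensorflow-gpu installed usually points to a driver or cudatoolkit mismatch rather than a packaging problem.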
PyTorch is another machine learning library with a deep learning focus. PyTorch detects GPU availability at run-time, so the user does not need to install a different package for GPU support.
conda install pytorch
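The run-time detection mentioned above is typically done with torch.cuda.is_available(). A minimal sketch, assuming a hypothetical helper name and returning None when PyTorch is not installed:

```python
import importlib.util


def pick_torch_device():
    """Return "cuda" or "cpu" for PyTorch, or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    # The same package serves both cases; the check happens at run time.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("PyTorch device:", pick_torch_device())
```

The returned string can be passed as the device argument when creating tensors or moving a model, so the same script runs with or without a GPU.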
Caffe was one of the first popular deep learning libraries.
conda install caffe-gpu
Chainer/CuPy (Linux only)
Chainer is a deep learning library that uses NumPy or CuPy for computations.
conda install chainer
Chainer’s companion project CuPy is a GPU-accelerated clone of the NumPy API that can be used as a drop-in replacement for NumPy with a few changes to user code. When CuPy is installed, Chainer is GPU-accelerated. CuPy can also be used on its own for general array computation.
conda install cupy
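One common way to use CuPy as a drop-in replacement is to write array code against a module variable and bind it to CuPy or NumPy at run time. The sketch below illustrates the idea. Both function names are hypothetical, and it assumes that a CuPy installation which imports successfully also has a working GPU behind it.

```python
import importlib


def load_array_module():
    """Return the cupy module if it imports, otherwise numpy."""
    for name in ("cupy", "numpy"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("neither CuPy nor NumPy is installed")


def scaled_sum(xp, n=6, factor=2.0):
    """Sum factor * [0, 1, ..., n-1] using whichever array module is passed in."""
    values = xp.arange(n, dtype=xp.float32)
    # float() copies the scalar result back to the host when xp is CuPy.
    return float((values * factor).sum())


if __name__ == "__main__":
    xp = load_array_module()
    print(xp.__name__, scaled_sum(xp))
```

The "few changes to user code" are mostly of this kind: replacing a hard-coded numpy import with a module chosen at run time, and converting results back to host values where needed.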
XGBoost is a machine learning library that implements gradient-boosted decision trees. Training several forms of trees is GPU-accelerated.
conda install py-xgboost-gpu
MXNet is a machine learning library supported by various industry partners, most notably Amazon. Like TensorFlow, it comes in three variants, with the GPU variant selected by the mxnet-gpu meta-package:
conda install mxnet-gpu
Numba is a general-purpose JIT compiler for Python functions. It provides a way to implement custom GPU algorithms in pure Python syntax when the cudatoolkit package is present.
conda install numba cudatoolkit