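How do I install Theano on Ubuntu 14.04 x64 and configure it to use the GPU?

I tried following the current "Easy Installation of an Optimized Theano on Ubuntu" instructions, but they don't work: whenever I run a Theano script that uses the GPU, I get this error message:

    CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)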


More specifically, following the instructions on the linked page, I performed these steps:

    # Install Theano
    sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
    sudo pip install Theano

    # Install Nvidia drivers and CUDA
    sudo apt-get install nvidia-current
    sudo apt-get install nvidia-cuda-toolkit

Then I rebooted and tried running:

 THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python gpu_test.py # gpu_test.py comes from http://deeplearning.net/software/theano/tutorial/using_gpu.html 

But I get:

    f@f-Aurora-R4:~$ THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,cuda.root=/usr/lib/nvidia-cuda-toolkit' python gpu_test.py
    WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)
    [Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
    Looping 1000 times took 2.199992 seconds
    Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
    Used the cpu

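(I tested the following on Ubuntu 14.04.4 LTS x64 and Kubuntu 14.04.4 LTS x64; I expect it to work on most Ubuntu versions.)

Installing Theano and configuring the GPU (CUDA)

The instructions on the official website are outdated. Instead, you can use the instructions below (assuming a fresh install of Kubuntu 14.04 LTS x64):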

    # Install Theano
    sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
    sudo pip install Theano

    # Install Nvidia drivers, CUDA and CUDA toolkit, following some instructions from http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
    wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb # Got the link at https://developer.nvidia.com/cuda-downloads
    sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
    sudo apt-get update
    sudo apt-get install cuda

    sudo reboot

At this point, running nvidia-smi should work, but running nvcc won't, because the CUDA bin directory is not on the PATH yet.

    # Execute in console, or (add in ~/.bash_profile then run "source ~/.bash_profile"):
    export PATH=/usr/local/cuda-7.5/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH

At that point, both nvidia-smi and nvcc should work.
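If you prefer to check this from a script, here is a minimal sketch that looks for both tools on the PATH. The helper name `find_on_path` is my own, not part of any library, and it only confirms that the executables are reachable, not that the driver actually sees a GPU.

```python
import os


def find_on_path(program, path=None):
    """Return the full path of `program` if it is an executable on `path`, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # After the two export lines, both tools should be reported with a location.
    for tool in ("nvidia-smi", "nvcc"):
        location = find_on_path(tool)
        print("%s: %s" % (tool, location or "not found on PATH"))
```

If nvcc is reported as missing, the export lines were probably not applied to the current shell.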

To test whether Theano is able to use the GPU:

Copy the following into gpu_test.py:

    # Start gpu_test.py
    # From http://deeplearning.net/software/theano/tutorial/using_gpu.html#using-gpu
    from theano import function, config, shared, sandbox
    import theano.tensor as T
    import numpy
    import time

    vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
    iters = 1000

    rng = numpy.random.RandomState(22)
    x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
    f = function([], T.exp(x))
    print(f.maker.fgraph.toposort())
    t0 = time.time()
    for i in xrange(iters):
        r = f()
    t1 = time.time()
    print("Looping %d times took %f seconds" % (iters, t1 - t0))
    print("Result is %s" % (r,))
    if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
        print('Used the cpu')
    else:
        print('Used the gpu')
    # End gpu_test.py

Then run it:

 THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32' python gpu_test.py 

This should return:

    f@f-Aurora-R4:~$ THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32' python gpu_test.py
    Using gpu device 0: GeForce GTX 690
    [GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
    Looping 1000 times took 0.658292 seconds
    Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
    Used the gpu
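For a rough sense of the gain, the timing line in this output can be compared with the one from the CPU-only run in the question. The sketch below just parses the two "Looping ..." lines quoted in this post; the helper name `loop_seconds` is my own, and the ratio is specific to this machine and this toy benchmark.

```python
import re

# Timing lines copied from the two runs shown in this post.
CPU_LOG = "Looping 1000 times took 2.199992 seconds"
GPU_LOG = "Looping 1000 times took 0.658292 seconds"


def loop_seconds(log_text):
    """Extract the elapsed seconds from gpu_test.py's 'Looping ...' line."""
    match = re.search(r"Looping \d+ times took ([0-9.]+) seconds", log_text)
    if match is None:
        raise ValueError("no timing line found")
    return float(match.group(1))


speedup = loop_seconds(CPU_LOG) / loop_seconds(GPU_LOG)
print("GPU speedup: %.2fx" % speedup)  # roughly 3.3x for these two runs
```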

To find out your CUDA version:

    nvcc -V

Example:

    username@server:~$ nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2015 NVIDIA Corporation
    Built on Tue_Aug_11_14:27:32_CDT_2015
    Cuda compilation tools, release 7.5, V7.5.17
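If a script needs the release number, it can be pulled out of that text with a regular expression. This is only a sketch: `cuda_release` is a name I made up, and it assumes the output keeps the "release X.Y" wording shown in the example.

```python
import re

# Sample output copied from the example in this post.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17"""


def cuda_release(nvcc_output):
    """Return (major, minor) parsed from `nvcc -V` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(cuda_release(SAMPLE))
# On a real machine the text could come from:
#   subprocess.check_output(["nvcc", "-V"]).decode()
```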

Adding cuDNN

To add cuDNN (instructions from http://deeplearning.net/software/theano/library/sandbox/cuda/dnn.html):

  1. Download cuDNN from https://developer.nvidia.com/rdp/cudnn-download (requires registration, which is free)
  2. tar -xvf cudnn-7.0-linux-x64-v3.0-prod.tgz
  3. Do one of the following

Option 1: Copy the *.h files to CUDA_ROOT/include and the *.so* files to CUDA_ROOT/lib64 (by default, CUDA_ROOT is /usr/local/cuda on Linux).

    sudo cp cuda/lib64/* /usr/local/cuda/lib64/
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include/

Option 2:

    export LD_LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH
    export CPATH=/home/user/path_to_CUDNN_folder/include:$CPATH
    export LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH

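By default, Theano detects whether it can use cuDNN. If it can, it uses it. Otherwise, Theano's optimizations will not introduce cuDNN ops, so Theano still works as long as the user did not introduce them manually.

To get an error when Theano cannot use cuDNN, use this Theano flag: optimizer_including=cudnn

Example: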

 THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,optimizer_including=cudnn' python gpu_test.py 
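The same flags can also be set from Python instead of the shell. Theano reads THEANO_FLAGS when it is first imported, so the variable has to be set before the import. The helper `build_theano_flags` below is a hypothetical convenience of my own, not a Theano function.

```python
import os


def build_theano_flags(pairs):
    """Join (name, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("%s=%s" % (name, value) for name, value in pairs)


flags = build_theano_flags([
    ("mode", "FAST_RUN"),
    ("device", "gpu"),
    ("floatX", "float32"),
    ("optimizer_including", "cudnn"),
])

# Must happen before `import theano`, otherwise the flags are ignored.
os.environ["THEANO_FLAGS"] = flags
print(flags)
# import theano  # uncomment on a machine where Theano is installed
```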

To find out your cuDNN version:

 cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2 
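The grep prints the CUDNN_MAJOR define and the two lines after it. The same information can be read in Python, as sketched below. The sample header text is illustrative rather than copied from a real cudnn.h, the function names are my own, and the combined number assumes the major*1000 + minor*100 + patch scheme, which is how I understand a value such as 5005 to be formed.

```python
import re

# Illustrative excerpt; a real cudnn.h contains much more than this.
SAMPLE_HEADER = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 5
"""


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a field is missing."""
    fields = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_%s\s+(\d+)" % name, header_text)
        if match is None:
            return None
        fields.append(int(match.group(1)))
    return tuple(fields)


def cudnn_version_number(version):
    """Combine (major, minor, patch) into one number (assumed scheme, see above)."""
    major, minor, patch = version
    return major * 1000 + minor * 100 + patch


version = cudnn_version(SAMPLE_HEADER)
print(version, cudnn_version_number(version))
# On a real machine, read the text with:
#   open("/usr/local/cuda/include/cudnn.h").read()
```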

Adding CNMeM

The CNMeM library is a "simple library to help the Deep Learning frameworks manage CUDA memory".

    # Build CNMeM without the unit tests
    git clone https://github.com/NVIDIA/cnmem.git cnmem
    cd cnmem
    mkdir build
    cd build
    sudo apt-get install -y cmake
    cmake ..
    make

    # Copy files to proper location
    sudo cp ../include/cnmem.h /usr/local/cuda/include
    sudo cp *.so /usr/local/cuda/lib64/
    cd ../..

To use it with Theano, you need to add the lib.cnmem flag. Example:

 THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=0.8,optimizer_including=cudnn' python gpu_test.py 

The first line of the script's output should be:

 Using gpu device 0: GeForce GTX TITAN X (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5005) 

lib.cnmem=0.8 means that CNMeM starts with a pool of 80% of the GPU's memory, which is the "initial size" reported in the output.
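As a worked example of what that fraction amounts to, here is the arithmetic for a hypothetical card with 12 GiB of memory. The function name and the 12 GiB figure are assumptions for illustration only; the real allocation is done by Theano and CNMeM and may be adjusted by them.

```python
def cnmem_pool_bytes(fraction, total_bytes):
    """Bytes covered by a fractional lib.cnmem value on a card with `total_bytes`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(fraction * total_bytes)


GIB = 1024 ** 3
total = 12 * GIB  # hypothetical 12 GiB card
pool = cnmem_pool_bytes(0.8, total)
print("%.1f GiB of %.1f GiB" % (pool / float(GIB), total / float(GIB)))
```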

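CNMeM has been reported to give some interesting speed improvements, and it is supported by Theano, Torch, and Caffe.

Theano - source 1:

The speedup depends on many factors, such as the shapes and the model itself. The speedup ranges from 0 to 2x.

Theano - source 2:

If you do not change the Theano flag allow_gc, you can expect a 20% speedup on the GPU. In some cases (small models), we saw a 50% speedup.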


Running Theano on multiple CPU cores

Note that you can run Theano on multiple CPU cores with the OMP_NUM_THREADS=[number_of_cpu_cores] flag. Example:

 OMP_NUM_THREADS=4 python gpu_test.py 
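When launching the script from Python rather than from the shell, the variable can be passed through the child's environment. A minimal sketch follows; `omp_environment` is a name of my own, and defaulting to `multiprocessing.cpu_count()` (which counts logical cores) is an assumption about what you want, not a recommendation from the Theano documentation.

```python
import multiprocessing
import os


def omp_environment(num_threads=None):
    """Return a copy of the current environment with OMP_NUM_THREADS set."""
    if num_threads is None:
        num_threads = multiprocessing.cpu_count()
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


env = omp_environment(4)
print(env["OMP_NUM_THREADS"])
# To launch the script with this environment:
#   import subprocess
#   subprocess.call(["python", "gpu_test.py"], env=env)
```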

The script theano/misc/check_blas.py prints information about which BLAS is being used:

    cd [theano_git_directory]
    OMP_NUM_THREADS=4 python theano/misc/check_blas.py

To run Theano's test suite:

 nosetests theano 

or

    sudo pip install nose-parameterized

    # Then, in a Python interpreter:
    import theano
    theano.test()

Common issues:

  • import theano: AttributeError: 'module' object has no attribute 'find_graphviz'