[Share Experiences] Deepin V23: Nvidia GPU driver + CUDA + cuDNN + PyTorch environment guide Resolved
吴罗平
deepin
2024-06-28 18:51
Author

Uninstall the bundled LibreOffice

sudo apt remove --purge 'libreoffice*' && sudo apt autoremove

sudo rm -rf /etc/libreoffice*

Uninstall the Sogou input method

sudo apt remove --purge com.sogou.ime.ng.fcitx5.deepin && sudo apt autoremove

Uninstall/reinstall the Deepin browser (the reinstalled browser comes with extension support)

ll-cli uninstall org.deepin.browser && sudo apt autoremove

sudo apt install org.deepin.browser

Required libraries

sudo apt install -y console-setup zstd

Install Git

sudo apt install -y git

Install Golang

sudo apt install -y golang

Install Rust

sudo apt install -y rustc cargo

Install Node.js

sudo apt install -y nodejs npm

Install ffmpeg

sudo apt install -y ffmpeg
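
Once the packages above are installed, a short loop can confirm that each tool is actually on the PATH. This is only a sketch: `missing_tools` is a helper name made up for this post, and the command names (for example `node` for the `nodejs` package and `go` for `golang`) are assumptions about what the packages install.

```shell
#!/bin/sh
# missing_tools CMD...: print every command that is not found on PATH.
# Hypothetical helper, not part of deepin or apt.
missing_tools() {
  for _cmd in "$@"; do
    command -v "$_cmd" >/dev/null 2>&1 || printf '%s\n' "$_cmd"
  done
}

# Assumed command names for the packages installed above.
_missing=$(missing_tools git go rustc cargo node npm ffmpeg zstd)
if [ -n "$_missing" ]; then
  echo "Not found on PATH:"
  echo "$_missing"
else
  echo "All tools found."
fi
```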

Install Docker

Standard installation

sudo apt update && sudo apt upgrade -y
sudo apt -y install curl wget gnupg dpkg apt-transport-https ca-certificates lsb-release software-properties-common
# Install Docker and its components
sudo apt -y install docker.io docker-compose
# Update docker-compose to the latest release (run with sudo because /usr/local/bin is owned by root)
sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
# sudo service docker start
# sudo systemctl enable docker
# sudo systemctl is-enabled docker
# Grant the current (non-root) user access to Docker; the group change applies after logging in again
sudo usermod -aG docker $USER
sudo chmod a+rw /var/run/docker.sock
docker info
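
Since `usermod -aG` only affects new login sessions, it can help to check whether the current shell already has the `docker` group before running `docker info` without sudo. A minimal sketch, where `has_group` is a hypothetical helper written for this post:

```shell
#!/bin/sh
# has_group NAME "GROUP LIST": succeed if NAME is one of the
# space-separated names in GROUP LIST (hypothetical helper).
has_group() {
  for _g in $2; do
    [ "$_g" = "$1" ] && return 0
  done
  return 1
}

# id -nG prints the group names of the current session.
if has_group docker "$(id -nG)"; then
  echo "docker group is active in this session"
else
  echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```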

Install Nvidia

Install the Nvidia GPU driver (from the repository) [skip this if the driver was already installed together with the system]

sudo apt purge 'nvidia-*' && sudo apt autopurge -y # or: sudo nvidia-uninstall
sudo apt install nvidia-driver nvidia-smi nvidia-settings nvidia-vulkan-icd nvidia-driver-libs:i386 libnvidia-ml1:i386 libxnvctrl0:i386 libvulkan1 libvulkan1:i386

Install nvidia-smi

sudo apt install nvidia-smi

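Install CUDA

apt search libcuda
sudo apt install libcuda1
  • Note: the same library may be named differently in different repositories, e.g. libcuda1-465 instead of libcuda1
  1. Download the CUDA version that matches your driver (run nvidia-smi to see the supported version) and install it
# See: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64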
wget https://developer.download.nvidia.com/compute/cuda/12.6.0/local_installers/cuda_12.6.0_560.28.03_linux.run
sudo sh cuda_12.6.0_560.28.03_linux.run
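
To pick the installer, the "CUDA Version" field in the nvidia-smi header is the number to look at. The sketch below pulls it out with sed; `cuda_version_from_smi` is a hypothetical helper, and it assumes the header keeps the usual `CUDA Version: X.Y` wording.

```shell
#!/bin/sh
# cuda_version_from_smi: read nvidia-smi output on stdin and print the
# "CUDA Version" field from its header (hypothetical helper).
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
  echo "Highest CUDA version supported by the driver: $(nvidia-smi | cuda_version_from_smi)"
else
  echo "nvidia-smi not found; install the driver first"
fi
```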

The Nvidia driver was already installed earlier, so deselect the first item (Driver) manually in the installer

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-12.6/

Please make sure that
 -   PATH includes /usr/local/cuda-12.6/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-12.6/lib64, or, add /usr/local/cuda-12.6/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.6/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 555.00 is required for CUDA 12.6 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
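
The warning in the summary says the driver must be at least 555.00. A rough way to check the installed driver against that number is sketched below; `version_ge` is a hypothetical helper that relies on GNU `sort -V`, and the driver version is read with `nvidia-smi --query-gpu=driver_version`.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B.
# Relies on GNU sort -V (version sort); hypothetical helper.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="555.00"   # minimum driver version quoted in the installer warning
if command -v nvidia-smi >/dev/null 2>&1; then
  installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if version_ge "$installed" "$required"; then
    echo "Driver $installed is new enough for CUDA 12.6"
  else
    echo "Driver $installed is older than $required"
  fi
else
  echo "nvidia-smi not found; cannot check the driver version"
fi
```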
  2. Add the environment variables: vi ~/.bashrc
# Use one version-independent CUDA path, which makes it easier to maintain several CUDA versions side by side
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
  3. Reload the environment variables: source ~/.bashrc
  4. Create the symlink: sudo ln -sfn /usr/local/cuda-12.6 /usr/local/cuda
  5. Verify: nvcc -V
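
If nvcc is still not found after these steps, it is usually the PATH or the symlink. The following sketch checks both; `path_contains` is a hypothetical helper, and the directories are the ones used in this post.

```shell
#!/bin/sh
# path_contains DIR LIST: succeed if DIR is an entry of the
# colon-separated LIST (hypothetical helper).
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if ! path_contains /usr/local/cuda/bin "$PATH"; then
  echo "PATH is missing /usr/local/cuda/bin"
fi
if ! path_contains /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}"; then
  echo "LD_LIBRARY_PATH is missing /usr/local/cuda/lib64"
fi
# The symlink should resolve to the versioned install directory.
if [ -L /usr/local/cuda ]; then
  echo "/usr/local/cuda -> $(readlink /usr/local/cuda)"
else
  echo "/usr/local/cuda is not a symlink"
fi
```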


  6. Download cuDNN

See: https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.7/local_installers/12.x/cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz/

  7. Extract the archive: tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz && cd cudnn-linux-x86_64-8.9.7.29_cuda12-archive
  8. Copy the files into the CUDA directory
sudo cp -p lib/* /usr/local/cuda/lib64/
sudo cp -p include/* /usr/local/cuda/include/
sudo chmod a+r /usr/local/cuda/include/* /usr/local/cuda/lib64/*
  9. Verify the installation:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
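
The grep above prints the raw `#define` lines. For a single version string, something like the sketch below could be used; `cudnn_version` is a hypothetical helper, and it assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as plain numbers.

```shell
#!/bin/sh
# cudnn_version FILE: print MAJOR.MINOR.PATCHLEVEL from a cudnn_version.h
# style header (hypothetical helper).
cudnn_version() {
  awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
       $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
       $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
       END { if (major != "") print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda/include/cudnn_version.h
if [ -r "$header" ]; then
  echo "cuDNN $(cudnn_version "$header")"
else
  echo "$header not found"
fi
```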

  10. Install [pytorch](https://pytorch.org/get-started/previous-versions/) and verify it (any build whose CUDA version is not higher than the one the GPU driver supports will do)

conda create -n pytorch python=3.12.5

conda activate pytorch

conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia

python3 -c "import torch; print('GPU is OK?', torch.cuda.is_available())"

ps: Do this inside a virtual environment. If you install directly into the system Python instead, use sudo apt install python3-<library or module name>.
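
The "not higher than the GPU's CUDA version" rule can be written down as a small selection step. In this sketch `pick_pytorch_cuda` is a hypothetical helper, the driver value 12.6 is just the example from this post, and the candidate list (11.8 and 12.1) is an assumption about which `pytorch-cuda` builds exist for PyTorch 2.3.1, so check the previous-versions page before relying on it.

```shell
#!/bin/sh
# pick_pytorch_cuda DRIVER_CUDA CANDIDATE...: print the highest candidate
# that does not exceed DRIVER_CUDA; fail if none fits.
# Relies on GNU sort -V; hypothetical helper.
pick_pytorch_cuda() {
  _driver="$1"; shift
  _best=""
  for _c in "$@"; do
    # _c fits when it sorts first (or equal) against the driver version.
    _low=$(printf '%s\n%s\n' "$_c" "$_driver" | sort -V | head -n 1)
    [ "$_low" = "$_c" ] || continue
    # Keep the larger of the current best and _c.
    _best=$(printf '%s\n%s\n' "$_best" "$_c" | sort -V | tail -n 1)
  done
  [ -n "$_best" ] && printf '%s\n' "$_best"
}

driver_cuda="12.6"   # example value; in practice read it from nvidia-smi
if choice=$(pick_pytorch_cuda "$driver_cuda" 11.8 12.1); then
  echo "Use pytorch-cuda=$choice"
else
  echo "No listed pytorch-cuda build fits CUDA $driver_cuda"
fi
```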

All Replies
神末shenmo
deepin
Spark-App
2024-06-28 19:57
#1

The packages in the repository should be enough to use CUDA, right?

吴罗平
deepin
2024-06-28 20:40
#2
神末shenmo

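The packages in the repository should be enough to use CUDA, right?

I'm not sure which version the repository has; I always use the latest version.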

新之助
deepin
2024-07-03 09:48
#3

The packages in the repository work fine as they are; just run sudo apt install nvidia*

新之助
deepin
2024-07-03 09:49
#4

The version in the repository:
[Screenshot: image.png]

somalily
deepin
2024-08-07 15:07
#5
神末shenmo

The packages in the repository should be enough to use CUDA, right?

Did you also just use apt install nvidia*?
