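Installing cuda-driver / cuda-toolkit with apt on Ubuntu 22.04 (A100)
To install the development libraries for an NVIDIA A100 GPU (CUDA, cuDNN, NCCL and so on) with apt on Ubuntu 22.04.3, the recommended route is NVIDIA's official APT repository.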
Environment
- Ubuntu 22.04.3 LTS
- CUDA 12.8, NVIDIA A100
1. Add the NVIDIA APT repository (the official source is recommended)
- https://developer.download.nvidia.com/compute/cuda/repos/
- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/
sudo apt update
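`apt update` only picks up the CUDA packages once the repository is registered. A minimal sketch of the usual way to do that, using the `cuda-keyring` package from the repository directory listed above. The file name `cuda-keyring_1.1-1_all.deb` is an assumption and should be checked against the directory listing. The script only prints the commands (a dry run) so they can be reviewed before running them by hand.

```shell
#!/bin/sh
# Dry run: print the commands that register NVIDIA's CUDA apt repository.
# ASSUMPTION: the keyring file name below; confirm it in the repository directory.
distro="ubuntu2204"
arch="x86_64"
keyring_deb="cuda-keyring_1.1-1_all.deb"
repo_url="https://developer.download.nvidia.com/compute/cuda/repos/${distro}/${arch}"

repo_setup_commands() {
    echo "wget ${repo_url}/${keyring_deb}"   # fetch the keyring package
    echo "sudo dpkg -i ${keyring_deb}"       # installs the signing key and the sources entry
    echo "sudo apt update"                   # refresh the package index
}

repo_setup_commands
```

Changing `distro` or `arch` adapts the URL to the other directories under the repository root.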
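2. Install the driver and the official NVIDIA CUDA Toolkit
2.1. Install the latest runtime + driver by default (a compatible pair)
sudo apt install -y cuda # installs the latest CUDA release by default
This installs:
- The NVIDIA driver (usually the latest version)
- cuda-toolkit (including the nvcc compiler, headers, runtime and so on)
- Various development libraries (such as libcudart-dev and libcublas-dev)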
2.2. Install a specific version
1. List the available CUDA versions
root@gpu-develop-dev:~# apt-cache search cuda | grep cuda-12 # loose match
cuda-12-0 - CUDA 12.0 meta-package
cuda-12-1 - CUDA 12.1 meta-package
libnvjpeg2k0-cuda-12 - Runtime libraries for libnvjpeg2k for CUDA 12
libnvjpeg2k0-dev-cuda-12 - Development headers and symlinks for libnvjpeg2k for CUDA 12
libnvjpeg2k0-static-cuda-12 - Static libraries for libnvjpeg2k for CUDA 12
nvjpeg2k-cuda-12 - NVIDIA nvJPEG 2000 for CUDA 12
cuquantum-cuda-12 - NVIDIA cuQuantum SDK for CUDA 12
libcuquantum0-cuda-12 - Runtime libraries for libcuquantum for CUDA 12
libcuquantum0-dev-cuda-12 - Development headers and symlinks for libcuquantum for CUDA 12
libcuquantum0-static-cuda-12 - Static libraries for libcuquantum for CUDA 12
cuda-12-2 - CUDA 12.2 meta-package
libnvtiff0-cuda-12 - Runtime libraries for libnvtiff for CUDA 12
libnvtiff0-dev-cuda-12 - Development headers and symlinks for libnvtiff for CUDA 12
libnvtiff0-static-cuda-12 - Static libraries for libnvtiff for CUDA 12
nvtiff-cuda-12 - NVIDIA nvTIFF for CUDA 12
cuda-12-3 - CUDA 12.3 meta-package
cudss-cuda-12 - NVIDIA cuDSS for CUDA 12
libcudss0-cuda-12 - Runtime libraries for libcudss for CUDA 12
libcudss0-dev-cuda-12 - Development headers and symlinks for libcudss for CUDA 12
libcudss0-static-cuda-12 - Static libraries for libcudss for CUDA 12
libcal0-cuda-12 - Runtime libraries for libcal for CUDA 12
libcal0-dev-cuda-12 - Development headers and symlinks for libcal for CUDA 12
libcal-cuda-12 - NVIDIA cal library for CUDA 12
libnvimgcodec-cuda-12 - Runtime libraries for nvImageCodec for CUDA
cusolvermp-cuda-12 - NVIDIA cuSOLVERMp for CUDA 12
libcusolvermp0-cuda-12 - Runtime libraries for libcusolvermp for CUDA 12
libcusolvermp0-dev-cuda-12 - Development headers and symlinks for libcusolvermp for CUDA 12
cudnn9-cuda-12-3 - NVIDIA cuDNN for CUDA 12.3
cudnn9-cuda-12 - NVIDIA cuDNN for CUDA 12
cuda-12-4 - CUDA 12.4 meta-package
cudnn9-cuda-12-4 - NVIDIA cuDNN for CUDA 12.4
cuda-12-5 - CUDA 12.5 meta-package
cudnn9-cuda-12-5 - NVIDIA cuDNN for CUDA 12.5
libnppplus0-cuda-12 - Runtime libraries for libnppplus for CUDA 12
libnppplus0-dev-cuda-12 - Development headers and symlinks for libnppplus for CUDA 12
libnppplus0-static-cuda-12 - Static libraries for libnppplus for CUDA 12
nppplus-cuda-12 - NVIDIA NPP Plus for CUDA 12
cuda-12-6 - CUDA 12.6 meta-package
cudnn9-cuda-12-6 - NVIDIA cuDNN for CUDA 12.6
libnvcomp4-cuda-12 - Runtime libraries for libnvcomp for CUDA 12
libnvcomp4-dev-cuda-12 - Development headers and symlinks for libnvcomp for CUDA 12
libnvcomp4-static-cuda-12 - Static libraries for libnvcomp for CUDA 12
nvcomp-cuda-12 - NVIDIA nvCOMP for CUDA 12
libnvshmem3-cuda-12 - Runtime libraries for libnvshmem for CUDA 12
libnvshmem3-dev-cuda-12 - Development headers and symlinks for libnvshmem for CUDA 12
libnvshmem3-static-cuda-12 - Static libraries for libnvshmem for CUDA 12
nvshmem-cuda-12 - NVIDIA nvshmem for CUDA 12
cudnn9-cuda-12-8 - NVIDIA cuDNN for CUDA 12.8
cuda-12-8 - CUDA 12.8 meta-package
cufftmp-cuda-12 - NVIDIA cuFFTMp CUDA 12 .
libcufftmp11-cuda-12 - Runtime libraries for libcufftmp for CUDA 12
libcufftmp11-dev-cuda-12 - Development headers and symlinks for libcufftmp for CUDA 12
cuda-12-9 - CUDA 12.9 meta-package
cudnn9-cuda-12-9 - NVIDIA cuDNN for CUDA 12.9
libcudnn9-cuda-12 - cuDNN runtime libraries for CUDA 12.9
cudnn9-jit-cuda-12 - NVIDIA cuDNN-jit for CUDA 12
cudnn9-jit-cuda-12-9 - NVIDIA cuDNN-jit for CUDA 12.9
libcudnn9-dev-cuda-12 - cuDNN development libraries for CUDA 12.9
libcudnn9-headers-cuda-12 - cuDNN header files for CUDA 12.9
libcudnn9-jit-cuda-12 - cuDNN-jit runtime libraries for CUDA 12.9
libcudnn9-jit-dev-cuda-12 - cuDNN-jit development libraries for CUDA 12.9
libcudnn9-static-cuda-12 - cuDNN static libraries for CUDA 12.9
cublasmp-cuda-12 - NVIDIA cuBLASMp library for multi-GPU, multi-node distributed basic dense linear algebra.
libcublasmp0-cuda-12 - NVIDIA cuBLASMp library for multi-GPU, multi-node distributed basic dense linear algebra.
libcublasmp0-dev-cuda-12 - NVIDIA cuBLASMp library for multi-GPU, multi-node distributed basic dense linear algebra.
root@gpu-develop-dev:~# apt-cache search cuda | egrep ^cuda-12 # exact match (anchored at line start)
cuda-12-0 - CUDA 12.0 meta-package
cuda-12-1 - CUDA 12.1 meta-package
cuda-12-2 - CUDA 12.2 meta-package
cuda-12-3 - CUDA 12.3 meta-package
cuda-12-4 - CUDA 12.4 meta-package
cuda-12-5 - CUDA 12.5 meta-package
cuda-12-6 - CUDA 12.6 meta-package
cuda-12-8 - CUDA 12.8 meta-package
cuda-12-9 - CUDA 12.9 meta-package
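The meta-package names replace the dot in the version with a hyphen (CUDA 12.8 becomes `cuda-12-8`). A small sketch of that mapping, plus a filter that extracts the plain version numbers from `apt-cache search` output. The function names `cuda_pkg_name` and `list_cuda_versions` are hypothetical helpers, not part of any NVIDIA tool.

```shell
#!/bin/sh
# Map a CUDA version such as "12.8" to its apt meta-package name "cuda-12-8".
cuda_pkg_name() {
    echo "cuda-$(echo "$1" | tr '.' '-')"
}

# Read `apt-cache search cuda` output on stdin, keep only the versioned
# meta-packages, and print their version numbers (for example "12.8").
list_cuda_versions() {
    sed -n 's/^cuda-\([0-9][0-9]*\)-\([0-9][0-9]*\) - .*meta-package$/\1.\2/p'
}

cuda_pkg_name 12.8   # prints cuda-12-8
```

On a machine with the repository configured, `apt-cache search cuda | list_cuda_versions` would print one version per line.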
2. Install a specific CUDA version
apt install -y cuda-12-8
3. CUDA paths
root@gpu-develop-dev:~# ll /usr/local/cuda
cuda/ cuda-12/ cuda-12.8/
root@gpu-develop-dev:~# ll /usr/local/cuda
lrwxrwxrwx 1 root root 22 Jul 25 12:13 /usr/local/cuda -> /etc/alternatives/cuda/
root@gpu-develop-dev:~# ll /usr/local/cuda/
total 128
drwxr-xr-x 3 root root 4096 Jul 25 12:13 bin/
drwxr-xr-x 5 root root 4096 Jul 25 12:12 compute-sanitizer/
drwxr-xr-x 3 root root 4096 Jul 25 12:12 doc/
-rw-r--r-- 1 root root 160 Feb 13 18:43 DOCS
-rw-r--r-- 1 root root 63021 Feb 13 18:43 EULA.txt
drwxr-xr-x 5 root root 4096 Jul 25 12:13 extras/
drwxr-xr-x 3 root root 4096 Jul 25 12:13 gds/
lrwxrwxrwx 1 root root 28 Feb 13 19:31 include -> targets/x86_64-linux/include/
lrwxrwxrwx 1 root root 24 Feb 13 19:29 lib64 -> targets/x86_64-linux/lib/
drwxr-xr-x 7 root root 4096 Jul 25 12:13 libnvvp/
drwxr-xr-x 2 root root 4096 Jul 25 12:13 nsightee_plugins/
drwxr-xr-x 3 root root 4096 Jul 25 12:13 nvml/
drwxr-xr-x 6 root root 4096 Jul 25 12:11 nvvm/
-rw-r--r-- 1 root root 524 Feb 13 18:43 README
drwxr-xr-x 3 root root 4096 Jul 25 12:12 share/
drwxr-xr-x 2 root root 4096 Jul 25 12:12 src/
drwxr-xr-x 3 root root 4096 Jul 25 12:10 targets/
drwxr-xr-x 2 root root 4096 Jul 25 12:13 tools/
-rw-r--r-- 1 root root 3306 Mar 4 13:02 version.
root@gpu-develop-dev:~# ll /usr/local/cuda/bin/
total 196496
-rwxr-xr-x 1 root root 88848 Feb 22 13:51 bin2c*
lrwxrwxrwx 1 root root 4 Feb 22 13:55 computeprof -> nvvp*
-rwxr-xr-x 1 root root 112 Feb 22 13:03 compute-sanitizer*
drwxr-xr-x 2 root root 4096 Jul 25 12:11 crt/
-rwxr-xr-x 1 root root 8718680 Feb 22 13:51 cudafe++*
-rwxr-xr-x 1 root root 1758 Feb 13 20:26 cuda-gdb*
-rwxr-xr-x 1 root root 14072248 Feb 13 20:26 cuda-gdb-minimal*
-rwxr-xr-x 1 root root 14898320 Feb 13 20:26 cuda-gdb-python3.10-tui*
-rwxr-xr-x 1 root root 14897952 Feb 13 20:26 cuda-gdb-python3.11-tui*
-rwxr-xr-x 1 root root 14906976 Feb 13 20:26 cuda-gdb-python3.12-tui*
-rwxr-xr-x 1 root root 14898456 Feb 13 20:26 cuda-gdb-python3.8-tui*
-rwxr-xr-x 1 root root 14898720 Feb 13 20:26 cuda-gdb-python3.9-tui*
-rwxr-xr-x 1 root root 765328 Feb 13 20:26 cuda-gdbserver*
-rwxr-xr-x 1 root root 75928 Feb 13 19:31 cu++filt*
-rwxr-xr-x 1 root root 568992 Feb 13 19:28 cuobjdump*
-rwxr-xr-x 1 root root 1249368 Feb 22 13:51 fatbinary*
-rwxr-xr-x 1 root root 3826 Mar 4 13:02 ncu*
-rwxr-xr-x 1 root root 3616 Mar 4 13:02 ncu-ui*
-rwxr-xr-x 1 root root 1580 Feb 13 20:16 nsight_ee_plugins_manage.sh*
-rwxr-xr-x 1 root root 197 Mar 4 13:02 nsight-sys*
-rwxr-xr-x 1 root root 743 Mar 4 13:02 nsys*
-rwxr-xr-x 1 root root 833 Mar 4 13:02 nsys-ui*
-rwxr-xr-x 1 root root 24828456 Feb 22 13:51 nvcc*
-rwxr-xr-x 1 root root 11032 Feb 22 13:51 __nvcc_device_query*
-rw-r--r-- 1 root root 425 Feb 22 13:51 nvcc.profile
-rwxr-xr-x 1 root root 5898888 Feb 13 19:26 nvdisasm*
-rwxr-xr-x 1 root root 32376952 Feb 22 13:51 nvlink*
-rwxr-xr-x 1 root root 5939552 Feb 14 04:37 nvprof*
-rwxr-xr-x 1 root root 117760 Feb 13 19:25 nvprune*
-rwxr-xr-x 1 root root 285 Feb 22 13:55 nvvp*
-rwxr-xr-x 1 root root 31912192 Feb 22 13:51 ptxas*
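The listing above shows `/usr/local/cuda` pointing at `/etc/alternatives/cuda`, so the path alone does not say which versioned directory is in use. A minimal sketch for resolving the whole link chain, assuming GNU `readlink` (the default on Ubuntu). `resolve_cuda_home` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the real directory behind a CUDA symlink such as /usr/local/cuda.
# `readlink -f` follows every link in the chain (GNU coreutils).
resolve_cuda_home() {
    readlink -f "${1:-/usr/local/cuda}"
}

# Only query the default location when it exists on this machine.
if [ -e /usr/local/cuda ]; then
    resolve_cuda_home /usr/local/cuda
fi
```

Because the link goes through `/etc/alternatives`, `update-alternatives --display cuda` should show the same target along with any other registered versions.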
4. Configure environment variables and verify
tee -a /etc/profile <<'EOF'
# cuda-runtime
export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
EOF
source /etc/profile
root@gpu-develop-dev:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0
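Appending export lines to `/etc/profile` adds the same directory again every time the step is repeated. A minimal sketch of an idempotent alternative that prepends a directory only when it is missing. `prepend_once` is a hypothetical helper, and the CUDA path is the one used in this article.

```shell
#!/bin/sh
# Prepend a directory to a colon-separated list unless it is already present.
# Usage: new_value="$(prepend_once DIR "$CURRENT_VALUE")"
prepend_once() {
    case ":$2:" in
        *":$1:"*) echo "$2" ;;        # already listed: leave the value unchanged
        *) echo "$1${2:+:$2}" ;;      # add in front, without a trailing colon
    esac
}

CUDA_HOME="/usr/local/cuda-12.8"   # versioned install directory from this article
PATH="$(prepend_once "$CUDA_HOME/bin" "$PATH")"
LD_LIBRARY_PATH="$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"
export PATH LD_LIBRARY_PATH
```

Sourcing this snippet several times leaves `PATH` and `LD_LIBRARY_PATH` with a single CUDA entry each.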
3. Install the CUDA driver + utils
This installs the driver only, without the Toolkit.
NVIDIA driver and CUDA version correspondence
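Each CUDA release requires a minimum driver version, so it helps to compare the installed driver against that minimum before installing the Toolkit. A minimal sketch of such a comparison using `sort -V` from GNU coreutils. The installed version is the one that appears in this article's command output. The minimum of 570.26 for CUDA 12.8 on Linux is an assumption and should be confirmed in NVIDIA's release notes.

```shell
#!/bin/sh
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils) for dotted-version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed="570.172.08"   # driver version shown in this article's output
required="570.26"        # ASSUMPTION: minimum Linux driver for CUDA 12.8; verify in the release notes

if version_ge "$installed" "$required"; then
    echo "driver $installed satisfies >= $required"
else
    echo "driver $installed is older than $required"
fi
```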

1. Install nvidia-driver for CUDA 12.8
1.1. Install the driver and utils
apt install -y nvidia-driver-570
apt install -y nvidia-utils-570-server # provides nvidia-smi
1.2. Check the installed version
root@gpu-develop-dev:~# dpkg -l | grep nvidia-driver
ii nvidia-driver-570 570.172.08-0ubuntu1 amd64 NVIDIA driver metapackage
1.3. Enable driver persistence (by default the driver is loaded on demand)
systemctl enable --now nvidia-persistenced
root@gpu-develop-dev:~# systemctl status nvidia-persistenced
● nvidia-persistenced.service - NVIDIA Persistence Daemon
Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2025-07-25 12:50:24 CST; 3s ago
Process: 274819 ExecStart=/usr/bin/nvidia-persistenced --verbose (code=exited, status=0/SUCCESS)
Main PID: 274820 (nvidia-persiste)
Tasks: 1 (limit: 76792)
Memory: 484.0K
CPU: 1.196s
CGroup: /system.slice/nvidia-persistenced.service
└─274820 /usr/bin/nvidia-persistenced --verbose
Jul 25 12:50:23 gpu-develop-dev systemd[1]: Starting NVIDIA Persistence Daemon...
Jul 25 12:50:23 gpu-develop-dev nvidia-persistenced[274820]: Verbose syslog connection opened
Jul 25 12:50:23 gpu-develop-dev nvidia-persistenced[274820]: Started (274820)
Jul 25 12:50:23 gpu-develop-dev nvidia-persistenced[274820]: device 0000:00:06.0 - registered
Jul 25 12:50:24 gpu-develop-dev nvidia-persistenced[274820]: device 0000:00:06.0 - persistence mode enabled.
Jul 25 12:50:24 gpu-develop-dev nvidia-persistenced[274820]: device 0000:00:06.0 - NUMA memory onlined.
Jul 25 12:50:24 gpu-develop-dev nvidia-persistenced[274820]: Local RPC services initialized
Jul 25 12:50:24 gpu-develop-dev systemd[1]: Started NVIDIA Persistence Daemon.
root@gpu-develop-dev:~#
1.4. Check the driver status
root@gpu-develop-dev:~# nvidia-smi
Fri Jul 25 14:06:07 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08 Driver Version: 570.172.08 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-PCIE-40GB On | 00000000:00:06.0 Off | Off |
| N/A 37C P0 35W / 250W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
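For scripted checks, the two version fields in the banner can be extracted with `sed`. A minimal sketch using the banner line from the output above as sample input. The helper names are hypothetical. Note that "CUDA Version" in `nvidia-smi` is the highest CUDA version the driver supports, not the version of an installed Toolkit.

```shell
#!/bin/sh
# Extract "Driver Version" and "CUDA Version" from the nvidia-smi banner on stdin.
smi_driver_version() {
    sed -n 's/.*Driver Version: *\([0-9.][0-9.]*\).*/\1/p' | head -n 1
}
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9.][0-9.]*\).*/\1/p' | head -n 1
}

# Sample banner line copied from the nvidia-smi output in this article.
banner='| NVIDIA-SMI 570.172.08 Driver Version: 570.172.08 CUDA Version: 12.8 |'
echo "$banner" | smi_driver_version   # prints 570.172.08
echo "$banner" | smi_cuda_version     # prints 12.8
```

On a live system the same functions can read from `nvidia-smi` directly, for example `nvidia-smi | smi_driver_version`.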
1.5. Check DKMS and kernel module status
root@gpu-develop-dev:~# dkms status
nvidia/570.172.08, 5.15.0-119-generic, x86_64: installed
root@gpu-develop-dev:~# lsmod | grep nvidia
nvidia_uvm 1744896 0
nvidia 90243072 1 nvidia_uvm
drm 622592 7 drm_kms_helper,nvidia,cirrus,drm_ttm_helper,ttm,nouveau
2. List the installable driver versions
root@gpu-develop-dev:~# apt-cache search nvidia-driver |egrep ^nvidia-driver ### list installable versions
nvidia-driver-390 - NVIDIA driver metapackage
nvidia-driver-418 - Transitional package for nvidia-driver-430
nvidia-driver-418-server - NVIDIA Server Driver metapackage
nvidia-driver-435 - Transitional package for nvidia-driver-455
nvidia-driver-440 - Transitional package for nvidia-driver-450
nvidia-driver-440-server - Transitional package for nvidia-driver-450-server
nvidia-driver-450 - Transitional package for nvidia-driver-460
nvidia-driver-450-server - NVIDIA Server Driver metapackage
nvidia-driver-455 - Transitional package for nvidia-driver-460
nvidia-driver-460 - Transitional package for nvidia-driver-470
nvidia-driver-460-server - Transitional package for nvidia-driver-470-server
nvidia-driver-465 - Transitional package for nvidia-driver-470
nvidia-driver-470 - NVIDIA driver metapackage
nvidia-driver-470-server - NVIDIA Server Driver metapackage
nvidia-driver-495 - Transitional package for nvidia-driver-510
nvidia-driver-510 - Transitional package for nvidia-driver-535
nvidia-driver-510-server - Transitional package for nvidia-driver-515-server
nvidia-driver-515-open - Transitional package for nvidia-driver-535
nvidia-driver-515-server - Transitional package for nvidia-driver-535-server
nvidia-driver-520-open - Transitional package for nvidia-driver-535
nvidia-driver-525-open - NVIDIA driver (open kernel) metapackage (transitional package)
nvidia-driver-525-server - NVIDIA Server Driver metapackage (transitional package)
nvidia-driver-535 - NVIDIA driver metapackage
nvidia-driver-535-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-535-server - NVIDIA Server Driver metapackage
nvidia-driver-535-server-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-545 - NVIDIA driver metapackage
nvidia-driver-545-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-550 - NVIDIA driver metapackage
nvidia-driver-550-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-550-server - NVIDIA Server Driver metapackage
nvidia-driver-550-server-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-565-server - NVIDIA Server Driver metapackage
nvidia-driver-565-server-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-570 - NVIDIA driver metapackage
nvidia-driver-570-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-570-server - NVIDIA Server Driver metapackage
nvidia-driver-570-server-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-575 - NVIDIA driver metapackage
nvidia-driver-575-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-575-server - NVIDIA Server Driver metapackage
nvidia-driver-575-server-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-515 - NVIDIA driver metapackage
nvidia-driver-520 - NVIDIA driver metapackage
nvidia-driver-525 - NVIDIA driver metapackage
nvidia-driver-430 - Transitional package for nvidia-driver-545
nvidia-driver-555 - NVIDIA driver metapackage
nvidia-driver-555-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-assistant - Detect and install the best NVIDIA driver packages for the system
nvidia-driver-530 - Transitional package for nvidia-driver-560
nvidia-driver-530-open - Transitional package for nvidia-driver-560-open
nvidia-driver-560 - NVIDIA driver metapackage
nvidia-driver-560-open - NVIDIA driver (open kernel) metapackage
nvidia-driver-565 - NVIDIA driver metapackage
nvidia-driver-565-open - NVIDIA driver (open kernel) metapackage
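Many entries in this list are transitional packages that only redirect to a newer branch. A minimal sketch that filters them out and prints the branch numbers of the plain `nvidia-driver-NNN` metapackages. `list_driver_branches` is a hypothetical helper, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Read `apt-cache search nvidia-driver` output on stdin and print the branch
# numbers of real (non-transitional) plain "nvidia-driver-NNN" metapackages.
list_driver_branches() {
    grep -v -i 'transitional' |
        sed -n 's/^nvidia-driver-\([0-9][0-9]*\) - .*/\1/p' |
        sort -n -u
}

# Sample lines taken from the listing in this article.
printf '%s\n' \
    'nvidia-driver-510 - Transitional package for nvidia-driver-535' \
    'nvidia-driver-570 - NVIDIA driver metapackage' \
    'nvidia-driver-570-server - NVIDIA Server Driver metapackage' \
    'nvidia-driver-535 - NVIDIA driver metapackage' | list_driver_branches
```

The `-server` and `-open` flavours are skipped on purpose because the pattern requires the description separator right after the branch number.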
4. nvidia-container-toolkit: GPU scheduling support for containers
- https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html
- https://mirrors.ustc.edu.cn/help/libnvidia-container.html
1. Download the mirror's GPG key
curl -fsSL https://mirrors.ustc.edu.cn/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
2. Configure the apt source
curl -s -L https://mirrors.ustc.edu.cn/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://nvidia.github.io#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://mirrors.ustc.edu.cn#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
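The `sed` stage in the pipeline above does two things at once: it swaps the upstream host for the USTC mirror and adds a `signed-by` option pointing at the keyring. A minimal sketch that applies the same substitution to one sample line so the result can be inspected. The sample input line is an assumption about what the downloaded list file contains.

```shell
#!/bin/sh
# Apply the same substitution as the pipeline above to lines read from stdin.
keyring="/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg"
rewrite_source_line() {
    sed "s#deb https://nvidia.github.io#deb [signed-by=${keyring}] https://mirrors.ustc.edu.cn#g"
}

# ASSUMPTION: sample input; the real file is fetched by curl in the step above.
echo 'deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /' | rewrite_source_line
```

The printed line should start with `deb [signed-by=...] https://mirrors.ustc.edu.cn/`, which is what ends up in `/etc/apt/sources.list.d/nvidia-container-toolkit.list`.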
3. Update and install
apt-get update
apt-get install -y nvidia-container-toolkit
4. Configure the Docker runtime for GPU support
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker # restart Docker so the new runtime takes effect
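`nvidia-ctk runtime configure` writes its changes to `/etc/docker/daemon.json`. A minimal sketch for checking that the file mentions the NVIDIA runtime binary. `has_nvidia_runtime` is a hypothetical helper, and the text match is a rough check rather than a JSON parse.

```shell
#!/bin/sh
# Succeed when the given Docker daemon config mentions the NVIDIA runtime binary.
has_nvidia_runtime() {
    [ -f "$1" ] && grep -q 'nvidia-container-runtime' "$1"
}

config="/etc/docker/daemon.json"   # file updated by nvidia-ctk for Docker
if has_nvidia_runtime "$config"; then
    echo "nvidia runtime is registered in $config"
else
    echo "nvidia runtime not found in $config"
fi
```

After restarting Docker, the `Runtimes` line of `docker info` should also list `nvidia`.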
5. Verify GPU access
docker run -it --gpus all python:3.13 bash
nvidia-smi # run inside the container; normal output means the driver works and the container can use it
dev@gpu-develop-dev:~$ nvidia-container-cli --version
cli-version: 1.17.8
lib-version: 1.17.8
build date: 2025-05-30T13:47+00:00
build revision: 6eda4d76c8c5f8fc174e4abca83e513fb4dd63b0
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
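5. Install cuDNN (optional; the manual download requires a registered account)
- Sign in at https://developer.nvidia.com/cudnn
- Download the .deb package for Ubuntu
- Example installation:
sudo dpkg -i libcudnn8*.deb
Alternatively, configure it through the official repository (if you have a CUDA account). The apt-cache output in section 2.2 lists cudnn9-cuda-12-8 in that repository.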
6. Verify the installation
nvcc --version # verify the CUDA compiler
7. Common optional development libraries (APT package names)
| Library | APT package name |
|---|---|
| cuDNN | libcudnn8, libcudnn8-dev |
| NCCL | libnccl2, libnccl-dev |
| TensorRT | libnvinfer8, libnvinfer-dev |
| Thrust | Included in cuda-toolkit |
| OpenCL | nvidia-opencl-dev, ocl-icd-opencl-dev |