WSL2 NVIDIA Docker Installation Tutorial
Setting up a Docker NVIDIA GPU environment on WSL
I. Environment
- WSL Ubuntu 22.04. If WSL is not installed yet, refer to the WSL installation guide.
- The NVIDIA driver is working. Run the following in a WSL terminal:
nvidia-smi
Sample output:
Sun Oct 6 20:37:01 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4080 ... On | 00000000:01:00.0 Off | N/A |
| N/A 40C P3 19W / 95W | 0MiB / 12282MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
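To double-check the prerequisites above, the commands below can be run inside the WSL distro. This is an optional sketch, assuming a WSL2 kernel (its version string contains "microsoft-standard-WSL2"):

```bash
# Ubuntu release and WSL2 kernel check
lsb_release -ds
uname -r            # should contain "microsoft-standard-WSL2"
# GPU name and driver version as seen from inside WSL
nvidia-smi --query-gpu=name,driver_version --format=csv
```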
II. Installing NVIDIA Docker on WSL
1. Install the CUDA Toolkit
- Open the download page: CUDA Toolkit Downloads
- Select [Linux] -> [x86_64] -> [WSL-Ubuntu] -> [2.0]
- Any Installer Type will do; run the installation commands generated on the CUDA download page in your terminal.
- After installation, add the CUDA install path to the environment variables in ~/.bashrc (CUDA installs under /usr/local by default; replace cuda-11.8 with the CUDA version you installed). A sketch of the resulting ~/.bashrc lines follows after the nvcc output below.
export PATH=/usr/local/cuda-11.8/bin:$PATH
- The CUDA Toolkit is installed successfully when running
nvcc --version
in the terminal produces output like the following:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
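To persist the CUDA environment across shells, the lines below can be appended to ~/.bashrc. This is a minimal sketch assuming the default install prefix /usr/local/cuda-11.8; the LD_LIBRARY_PATH line is an optional addition for programs that link against the CUDA runtime libraries.

```bash
# Append the CUDA paths to ~/.bashrc (adjust cuda-11.8 to your installed version)
echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
# Reload the shell configuration and confirm nvcc is on PATH
source ~/.bashrc
which nvcc && nvcc --version
```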
2. Install Docker
- Edit /etc/wsl.conf to enable systemd in WSL, so the Docker service can start automatically later (restart the distro afterwards, e.g. run wsl --shutdown from a Windows terminal and reopen it, for this to take effect):
[boot]
systemd=true
- Remove any old Docker packages:
sudo apt-get purge docker-ce docker-ce-cli containerd.io docker-compose-plugin
- Install Docker:
curl https://get.docker.com | sh
- Check the status of the Docker service:
sudo systemctl status docker
- If the Docker service is not running, restart it:
sudo systemctl restart docker
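As an optional sanity check before moving on to the GPU pieces, the commands below run a test container and let the current user call Docker without sudo. This is a sketch of common post-install steps, not something this guide strictly requires; the group change only applies in a new shell.

```bash
# Allow the current user to run docker without sudo (takes effect in a new shell)
sudo usermod -aG docker "$USER"
# Pull and run a minimal test image to confirm the daemon works
sudo docker run --rm hello-world
```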
3. Install NVIDIA Docker
- Configure the nvidia-docker package repositories:
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
- Install nvidia-docker2:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
If nvidia.github.io cannot be reached, change the DNS server. The failure looks like this:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey
curl: (7) Failed to connect to nvidia.github.io port 443 after 1 ms: Connection refused
- Edit /etc/resolv.conf and change the nameserver to 114.114.114.114 (as the header comment in the file explains, set generateResolvConf = false in /etc/wsl.conf if WSL should stop regenerating this file):
# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
# nameserver 10.255.255.254
nameserver 114.114.114.114
search lan
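Once the packages install cleanly, a quick way to check that the NVIDIA container tooling can see the GPU is to query it directly. This is an optional sketch; the exact package names can differ between nvidia-docker2 and newer nvidia-container-toolkit releases.

```bash
# List the installed NVIDIA container packages
dpkg -l | grep -E 'nvidia-(docker|container)'
# Ask the container runtime library to report the driver and GPUs it detects
sudo nvidia-container-cli info
```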
4. Configure Docker
- Configure the Docker registry mirrors and Docker runtimes by editing /etc/docker/daemon.json as follows:
{
  "registry-mirrors": [
    "https://dockerproxy.cn",
    "https://docker.rainbond.cc",
    "https://docker.udayun.com",
    "https://hub.uuuadc.top",
    "https://docker.anyhub.us.kg",
    "https://dockerhub.jobcher.com",
    "https://dockerhub.icu",
    "https://docker.ckyl.me",
    "https://docker.awsl9527.cn"
  ],
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  },
  "default-runtime": "nvidia"
}
| Field | Description |
|---|---|
| registry-mirrors | Docker registry mirrors, used when the current network cannot pull images from Docker Hub directly |
| runtimes | Registers the NVIDIA Container Runtime so Docker can use it. See container-toolkit/latest/install-guide.html |
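Since a malformed daemon.json prevents the Docker daemon from starting, it can be worth validating the file before restarting the service. A minimal check, assuming python3 is available in the distro:

```bash
# Print a parse error if /etc/docker/daemon.json is not valid JSON
python3 -m json.tool /etc/docker/daemon.json > /dev/null && echo "daemon.json is valid JSON"
```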
- In /etc/nvidia-container-runtime/config.toml, change the field no-cgroups = true to false:
sudo sed -i 's/no-cgroups = true/no-cgroups = false/' /etc/nvidia-container-runtime/config.toml
This fixes the missing-device error (Error: only 0 Devices available, 1 requested. Exiting.) seen when running docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark (reference: Docker container with CUDA does not see my GPU).
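To confirm that the sed command actually changed the setting, a quick grep works; this is just a sanity-check sketch:

```bash
# Should print: no-cgroups = false
grep 'no-cgroups' /etc/nvidia-container-runtime/config.toml
```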
5. Restart the Docker Service
sudo systemctl restart docker
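After the restart, the daemon should report nvidia as a registered (and default) runtime if the daemon.json above was applied. A quick check, sketched with plain output filtering:

```bash
# Check that the service is up and that the nvidia runtime is registered
sudo systemctl status docker --no-pager
sudo docker info | grep -i runtime
```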
III. Verifying NVIDIA Docker
- Use the N-body simulation container to verify that the WSL Docker GPU setup works:
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
On success, the output looks like the following:
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 8.9 is undefined. Default to use 128 Cores/SM
MapSMtoArchName for SM 8.9 is undefined. Default to use Ampere
GPU Device 0: "Ampere" with compute capability 8.9
> Compute 8.9 CUDA device: [NVIDIA GeForce RTX 4080 Laptop GPU]
59392 bodies, total time for 10 iterations: 52.937 ms
= 666.345 billion interactions per second
= 13326.896 single-precision GFLOP/s at 20 flops per interaction
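Another common check is to run nvidia-smi inside an official CUDA base image. This is a sketch only, and the image tag is just an example; pick a tag that exists on Docker Hub for nvidia/cuda and roughly matches your driver's CUDA version.

```bash
# Runs nvidia-smi inside a CUDA base container; the GPU table printed on the
# host earlier should appear here as well.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```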
IV. Closing Remarks
If the installation fails for you, feel free to leave a comment.