Commit f26877e

Merge pull request #2709 from v-albemi/linux-driver
Edit article for sensitive terms
2 parents 2b5cf4d + d4a837e commit f26877e

2 files changed: +25 -21 lines changed

articles/virtual-machines/includes/virtual-machines-n-series-linux-support.md

Lines changed: 5 additions & 5 deletions
@@ -37,9 +37,9 @@ For the latest CUDA drivers and supported operating systems, visit the [NVIDIA](
 >
 > [vGPU18](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run) is now available for the NVadsA10_v5-series in **public regions only**. vGPU18 for the NVadsA10_v5-series is **not** supported in the Mooncake and Fairfax regions yet. We'll provide an update once vGPU18 becomes supported for the NVadsA10_v5-series in the Mooncake and Fairfax regions.

-Microsoft redistributes NVIDIA GRID driver installers for NV and NVv3-series VMs used as virtual workstations or for virtual applications. Install only these GRID drivers on Azure NV VMs, only on the operating systems listed in the following table. These drivers include licensing for GRID Virtual GPU Software in Azure. You don't need to set up a NVIDIA vGPU software license server.
+Microsoft redistributes NVIDIA GRID driver installers for NV and NVv3-series VMs used as virtual workstations or for virtual applications. Install only these GRID drivers on Azure NV VMs, only on the operating systems listed in the following table. These drivers include licensing for GRID Virtual GPU Software in Azure. You don't need to set up an NVIDIA vGPU software license server.

-The GRID drivers redistributed by Azure don't work on most non-NV series VMs like NC, NCv2, NCv3, ND, and NDv2-series VMs but works on NCasT4v3 series.
+The GRID drivers redistributed by Azure don't work on most non-NV series VMs, like NC, NCv2, NCv3, ND, and NDv2-series VMs, but they work on NCasT4v3 series.

 For more information on the specific vGPU and driver branch versions, visit the [NVIDIA](https://docs.nvidia.com/grid/) website.

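(Not part of this diff.) Since the GRID drivers above carry their own Azure licensing, a quick way to confirm the license took effect after installation is to query the device state; a minimal sketch, noting that the exact `nvidia-smi -q` output layout varies by driver branch:

```bash
# Look for the licensing section in the full device query
nvidia-smi -q | grep -i -A 2 license
```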
@@ -48,11 +48,11 @@ For more information on the specific vGPU and driver branch versions, visit the
 |Ubuntu 20.04 LTS, 22.04 LTS, 24.04 LTS<br/><br/>Red Hat Enterprise Linux 8.6, 8.8, 8.9, 8.10, 9.0, 9.2, 9.3, 9.4, 9.5<br/><br/>SUSE Linux Enterprise Server 15 SP2, 12 SP2,12 SP5<br/><br/>Rocky Linux 8.4| NVIDIA vGPU 18.5, driver branch [R570](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run) <br/><br/> NVIDIA vGPU 18.5, driver branch [R570](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run)

 > [!NOTE]
->For Azure NVads A10 v5 VMs we recommend customers to always be on the latest driver version. The latest NVIDIA major driver branch(n) is only backward compatbile with the previous major branch(n-1). For eg, vGPU 17.x is backward compatible with vGPU 16.x only. Any VMs still runnig n-2 or lower may see driver failures when the latest drive branch is rolled out to Azure hosts.
+>For Azure NVads A10 v5 VMs, we recommend that you use the latest driver version. The latest NVIDIA major driver branch (n) is only backward compatible with the previous major branch (n-1). For example, vGPU 17.x is backward compatible with vGPU 16.x only. Driver failures might occur on any VMs still running n-2 or lower when the latest driver branch is rolled out to Azure hosts.
 >>
->NVs_v3 VMs only support **vGPU 16 or lower** driver version.
+>NVs_v3 VMs only support **vGPU 16 or lower** driver versions.
 >>
-> GRID Driver 17.3 currently supports only NCasT4_v3 series of VMs. To use this driver, [download and install GRID Driver 17.3 manually](https://download.microsoft.com/download/7/e/c/7ec792c9-3654-4f78-b1a0-41a48e10ca6d/NVIDIA-Linux-x86_64-550.127.05-grid-azure.run) .
+> GRID Driver 17.3 currently supports only the NCasT4_v3 series of VMs. To use this driver, [download and install GRID Driver 17.3 manually](https://download.microsoft.com/download/7/e/c/7ec792c9-3654-4f78-b1a0-41a48e10ca6d/NVIDIA-Linux-x86_64-550.127.05-grid-azure.run).
 >>
 > GRID drivers are having issues with installation on Azure kernel 6.11. To unblock, downgrade the kernel to version 6.8. For more information, see [Known Issues](/azure/virtual-machines/extensions/hpccompute-gpu-linux#known-issues).

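(Not part of this diff.) Both the driver-branch pinning and the kernel 6.11 issue called out above are easy to check for on a running VM; a sketch, assuming a recent driver where this standard query flag is available:

```bash
# Kernel version (the note above flags install issues on Azure kernel 6.11)
uname -r
# Installed NVIDIA driver branch, e.g. 570.xx corresponds to vGPU 18
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```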
articles/virtual-machines/linux/n-series-driver-setup.md

Lines changed: 20 additions & 16 deletions
@@ -8,7 +8,7 @@ ms.subservice: sizes
 ms.collection: linux
 ms.topic: how-to
 ms.custom: linux-related-content
-ms.date: 04/06/2023
+ms.date: 12/04/2025
 ms.author: vikancha
 ms.reviewer: padmalathas, mattmcinnes
 # Customer intent: "As a cloud administrator, I want to install and configure NVIDIA GPU drivers on Linux-based N-series VMs, so that I can fully utilize their GPU capabilities for high-performance computing applications."
@@ -21,6 +21,10 @@ ms.reviewer: padmalathas, mattmcinnes
 
 **Applies to:** :heavy_check_mark: Linux VMs

+> [!IMPORTANT]
+> To align with inclusive language practices, we've replaced the term "blacklist" with "blocklist" throughout this documentation. This change reflects our commitment to avoiding terminology that might carry unintended negative connotations or perceived racial bias.
+> However, in code snippets and technical references where "blacklist" is part of established syntax or tooling (for example, configuration files, command-line parameters), the original term is retained to preserve functional accuracy. This usage is strictly technical and doesn't imply any discriminatory intent.
+
 To take advantage of the GPU capabilities of Azure N-series VMs backed by NVIDIA GPUs, you must install NVIDIA GPU drivers. The [NVIDIA GPU Driver Extension](../extensions/hpccompute-gpu-linux.md) installs appropriate NVIDIA CUDA or GRID drivers on an N-series VM. Install or manage the extension using the Azure portal or tools such as the Azure CLI or Azure Resource Manager templates. See the [NVIDIA GPU Driver Extension documentation](../extensions/hpccompute-gpu-linux.md) for supported distributions and deployment steps.

 If you choose to install NVIDIA GPU drivers manually, this article provides supported distributions, drivers, and installation and verification steps. Manual driver setup information is also available for [Windows VMs](../windows/n-series-driver-setup.md).
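(Not part of this diff.) The extension path mentioned above can be scripted; a minimal Azure CLI sketch, assuming placeholder resource group and VM names:

```bash
# Install the NVIDIA GPU Driver Extension on an existing N-series Linux VM
az vm extension set \
  --resource-group myResourceGroup \
  --vm-name myGpuVm \
  --publisher Microsoft.HpcCompute \
  --name NvidiaGpuDriverLinux
```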
@@ -53,7 +57,7 @@ Then run installation commands specific for your distribution.
 
 ### Ubuntu

-Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVIDIA and are simply packaged by Ubuntu so that they can be automatically managed by the system. Downloading and installing drivers from another source can lead to a broken system. Moreover, installing third-party drivers requires extra-steps on VMs with TrustedLaunch and Secure Boot enabled. They require the user to add a new Machine Owner Key for the system to boot. Drivers from Ubuntu are signed by Canonical and will work with Secure Boot.
+Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVIDIA and are simply packaged by Ubuntu so that they can be automatically managed by the system. Downloading and installing drivers from another source can lead to a broken system. Moreover, installing third-party drivers requires extra steps on VMs with TrustedLaunch and Secure Boot enabled. They require the user to add a new Machine Owner Key for the system to boot. Drivers from Ubuntu are signed by Canonical and will work with Secure Boot.

 1. Install `ubuntu-drivers` utility:
 ```bash
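# (Sketch, not part of this diff: the hunk truncates here. The utility
# typically ships in the ubuntu-drivers-common package.)
sudo apt update && sudo apt install -y ubuntu-drivers-common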
@@ -69,9 +73,9 @@ Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVI
 ```
 4. Download and install the CUDA toolkit from NVIDIA:
 > [!NOTE]
-> The example shows the CUDA package path for Ubuntu 24.04 LTS. Replace the path specific to the version you plan to use.
+> The example shows the CUDA package path for Ubuntu 24.04 LTS. Use the path that's specific to the version you plan to use.
 >
-> Visit the [NVIDIA Download Center](https://developer.download.nvidia.com/compute/cuda/repos/) or the [NVIDIA CUDA Resources page](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network) for the full path specific to each version.
+> Visit the [NVIDIA Download Center](https://developer.download.nvidia.com/compute/cuda/repos/) or the [NVIDIA CUDA Resources page](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network) for the full path that's specific to each version.
 >
 ```bash
 wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
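# (Sketch, not part of this diff: the hunk truncates here. Assuming the
# Ubuntu 24.04 network repo added above, the install typically continues
# with the keyring and the cuda-toolkit meta-package:)
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit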
@@ -156,7 +160,7 @@ For example, CentOS 8 and RHEL 8 need the following steps.
 sudo yum install cuda
 ```
 > [!NOTE]
-> If you see an error message related to missing packages like vulkan-filesystem then you may need to edit /etc/yum.repos.d/rh-cloud , look for optional-rpms and set enabled to 1
+> If you see an error message related to missing packages like vulkan-filesystem, you may need to edit /etc/yum.repos.d/rh-cloud, look for optional-rpms, and set enabled to 1.
 >

 5. Reboot the VM and proceed to verify the installation.
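(Not part of this diff.) The repo edit in that note can be done by hand or scripted; a rough sketch, taking the file path and section name exactly as the note gives them (verify the actual filename on your image before running):

```bash
# Flip enabled=0 to enabled=1 inside the optional-rpms section
sudo sed -i '/optional-rpms/,/^\[/ s/enabled=0/enabled=1/' /etc/yum.repos.d/rh-cloud
```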
@@ -217,7 +221,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
 sudo apt-get install build-essential ubuntu-desktop -y
 sudo apt-get install linux-azure -y
 ```
-3. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NVv2 VMs.) To disable the driver, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following contents:
+3. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NVv2 VMs.) To disable the driver, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following content:

 ```
 blacklist nouveau
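# (Sketch, not part of this diff: the hunk truncates here. The article's file
# also blocks the load-by-module alias; verify against the full source.)
blacklist lbm-nouveau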
@@ -240,7 +244,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
 
 6. When you're asked whether you want to run the nvidia-xconfig utility to update your X configuration file, select **Yes**.

-7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/
+7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/.

 ```bash
 sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
@@ -253,7 +257,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
 EnableUI=FALSE
 ```

-9. Remove the following from `/etc/nvidia/gridd.conf` if it is present:
+9. Remove the following from `/etc/nvidia/gridd.conf` if it's present:

 ```
 FeatureType=0
@@ -278,7 +282,7 @@ The GRID driver installation process does not offer any options to skip kernel m
 sudo yum install hyperv-daemons
 ```

-2. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NV3 VMs.) To do this, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following contents:
+2. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NV3 VMs.) To do this, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following content:

 ```
 blacklist nouveau
@@ -312,7 +316,7 @@ The GRID driver installation process does not offer any options to skip kernel m
 
 6. When you're asked whether you want to run the nvidia-xconfig utility to update your X configuration file, select **Yes**.

-7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/
+7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/.

 ```bash
 sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
@@ -325,7 +329,7 @@ The GRID driver installation process does not offer any options to skip kernel m
 EnableUI=FALSE
 ```

-9. Remove one line from `/etc/nvidia/gridd.conf` if it is present:
+9. Remove one line from `/etc/nvidia/gridd.conf` if it's present:

 ```
 FeatureType=0
@@ -345,7 +349,7 @@ If the driver is installed, Nvidia SMI will list the **GPU-Util** as 0% until yo
 

 ### X11 server
-If you need an X11 server for remote connections to an NV or NVv2 VM, [x11vnc](https://wiki.archlinux.org/title/X11vnc) is recommended because it allows hardware acceleration of graphics. The BusID of the M60 device must be manually added to the X11 configuration file (usually, `etc/X11/xorg.conf`). Add a `"Device"` section similar to the following:
+If you need an X11 server for remote connections to an NV or NVv2 VM, [x11vnc](https://wiki.archlinux.org/title/X11vnc) is recommended because it allows hardware acceleration of graphics. The BusID of the M60 device must be manually added to the X11 configuration file (usually `/etc/X11/xorg.conf`). Add a `"Device"` section similar to the following:

 ```
 Section "Device"
@@ -359,7 +363,7 @@ EndSection
 
 Additionally, update your `"Screen"` section to use this device.

-The decimal BusID can be found by running
+You can find the decimal BusID by running:

 ```bash
 nvidia-xconfig --query-gpu-info | awk '/PCI BusID/{print $4}'
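# (Not part of this diff.) Hypothetical example output: PCI:0:5:0
# Use that value for BusID in the "Device" section above.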
@@ -381,16 +385,16 @@ else
 fi
 ```

-Then, create an entry for your update script in `/etc/rc.d/rc3.d` so the script is invoked as root on boot.
+Then create an entry for your update script in `/etc/rc.d/rc3.d` so the script is invoked as root on boot.

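(Not part of this diff.) Such an entry is conventionally an S-prefixed symlink; a minimal sketch, assuming the update script was saved as `/usr/local/bin/update-busid.sh` (both the script name and the S99 priority are hypothetical):

```bash
# Run the BusID update script late in runlevel 3 startup
sudo ln -s /usr/local/bin/update-busid.sh /etc/rc.d/rc3.d/S99update-busid
```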
 ## Troubleshooting

-* You can set persistence mode using `nvidia-smi` so the output of the command is faster when you need to query cards. To set persistence mode, execute `nvidia-smi -pm 1`. If the VM is restarted, the mode setting goes away. You can always script the mode setting to execute upon startup.
+* You can set persistence mode using `nvidia-smi` so the output of the command is faster when you need to query cards. To set persistence mode, run `nvidia-smi -pm 1`. If the VM is restarted, the mode setting goes away. You can always script the mode setting to run upon startup.

 * If you updated the NVIDIA CUDA drivers to the latest version and find RDMA connectivity is no longer working, [reinstall the RDMA drivers](#rdma-network-connectivity) to reestablish that connectivity.
 * During installation of LIS, if a certain CentOS/RHEL OS version (or kernel) is not supported for LIS, an error “Unsupported kernel version” is thrown. Report this error along with the OS and kernel versions.

-* If jobs are interrupted by ECC errors on the GPU (either correctable or uncorrectable), first check to see if the GPU meets any of Nvidia's [RMA criteria for ECC errors](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#faq-pre). If the GPU is eligible for RMA, contact support about getting it serviced; otherwise, reboot your VM to reattach the GPU as described [here](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#bl_reset_reboot). Less invasive methods such as `nvidia-smi -r` don't work with the virtualization solution deployed in Azure.
+* If jobs are interrupted by ECC errors on the GPU (either correctable or uncorrectable), first check to see if the GPU meets any of Nvidia's [RMA criteria for ECC errors](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#faq-pre). If the GPU is eligible for RMA, contact support about getting it serviced; otherwise, reboot your VM to reattach the GPU as described [here](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#bl_reset_reboot). Less invasive methods, such as `nvidia-smi -r`, don't work with the virtualization solution deployed in Azure.

 ## Next steps

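(Not part of this diff.) One way to script the persistence-mode setting from the first troubleshooting bullet so it survives restarts is a one-shot systemd unit; a minimal sketch with a hypothetical unit name:

```bash
# Create a one-shot unit that sets persistence mode at every boot
sudo tee /etc/systemd/system/nvidia-persistence-mode.service <<'EOF'
[Unit]
Description=Set NVIDIA GPU persistence mode

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi -pm 1

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable nvidia-persistence-mode.service
```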