articles/virtual-machines/includes/virtual-machines-n-series-linux-support.md (5 additions, 5 deletions)
@@ -37,9 +37,9 @@ For the latest CUDA drivers and supported operating systems, visit the [NVIDIA](
>
> [vGPU18](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run) is now available for the NVadsA10_v5-series in **public regions only**. vGPU18 for the NVadsA10_v5-series is **not** supported in the Mooncake and Fairfax regions yet. We'll provide an update once vGPU18 becomes supported for the NVadsA10_v5-series in the Mooncake and Fairfax regions.
- Microsoft redistributes NVIDIA GRID driver installers for NV and NVv3-series VMs used as virtual workstations or for virtual applications. Install only these GRID drivers on Azure NV VMs, only on the operating systems listed in the following table. These drivers include licensing for GRID Virtual GPU Software in Azure. You don't need to set up a NVIDIA vGPU software license server.
+ Microsoft redistributes NVIDIA GRID driver installers for NV and NVv3-series VMs used as virtual workstations or for virtual applications. Install only these GRID drivers on Azure NV VMs, and only on the operating systems listed in the following table. These drivers include licensing for GRID Virtual GPU Software in Azure. You don't need to set up an NVIDIA vGPU software license server.
- The GRID drivers redistributed by Azure don't work on most non-NV series VMs like NC, NCv2, NCv3, ND, and NDv2-series VMs but works on NCasT4v3 series.
+ The GRID drivers redistributed by Azure don't work on most non-NV series VMs, like NC, NCv2, NCv3, ND, and NDv2-series VMs, but they work on the NCasT4v3 series.
For more information on the specific vGPU and driver branch versions, visit the [NVIDIA](https://docs.nvidia.com/grid/) website.
@@ -48,11 +48,11 @@ For more information on the specific vGPU and driver branch versions, visit the
|Ubuntu 20.04 LTS, 22.04 LTS, 24.04 LTS<br/><br/>Red Hat Enterprise Linux 8.6, 8.8, 8.9, 8.10, 9.0, 9.2, 9.3, 9.4, 9.5<br/><br/>SUSE Linux Enterprise Server 15 SP2, 12 SP2, 12 SP5<br/><br/>Rocky Linux 8.4| NVIDIA vGPU 18.5, driver branch [R570](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run) <br/><br/> NVIDIA vGPU 18.5, driver branch [R570](https://download.microsoft.com/download/0541e1a5-dff2-4b8c-a79c-96a7664b1d49/NVIDIA-Linux-x86_64-570.195.03-grid-azure.run)
> [!NOTE]
- >For Azure NVads A10 v5 VMs we recommend customers to always be on the latest driver version. The latest NVIDIA major driver branch(n) is only backward compatbile with the previous major branch(n-1). For eg, vGPU 17.x is backward compatible with vGPU 16.x only. Any VMs still runnig n-2 or lower may see driver failures when the latest drive branch is rolled out to Azure hosts.
+ > For Azure NVads A10 v5 VMs, we recommend that you use the latest driver version. The latest NVIDIA major driver branch (n) is only backward compatible with the previous major branch (n-1). For example, vGPU 17.x is backward compatible with vGPU 16.x only. Driver failures might occur on any VMs still running n-2 or lower when the latest driver branch is rolled out to Azure hosts.
>
- >NVs_v3 VMs only support **vGPU 16 or lower** driver version.
+ > NVs_v3 VMs support only **vGPU 16 or lower** driver versions.
>
- > GRID Driver 17.3 currently supports only NCasT4_v3 series of VMs. To use this driver, [download and install GRID Driver 17.3 manually](https://download.microsoft.com/download/7/e/c/7ec792c9-3654-4f78-b1a0-41a48e10ca6d/NVIDIA-Linux-x86_64-550.127.05-grid-azure.run).
+ > GRID Driver 17.3 currently supports only the NCasT4_v3 series of VMs. To use this driver, [download and install GRID Driver 17.3 manually](https://download.microsoft.com/download/7/e/c/7ec792c9-3654-4f78-b1a0-41a48e10ca6d/NVIDIA-Linux-x86_64-550.127.05-grid-azure.run).
>
> GRID drivers are having issues with installation on Azure kernel 6.11. To unblock, downgrade the kernel to version 6.8. For more information, see [Known Issues](/azure/virtual-machines/extensions/hpccompute-gpu-linux#known-issues).
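For the manual GRID Driver 17.3 installation mentioned earlier in this note, a minimal sketch follows. It assumes an NCasT4_v3 VM with `gcc`, `make`, and the headers for the running kernel already installed, and it uses the standard NVIDIA `.run` installer flow; the file name is taken from the download URL above.

```bash
# Hedged sketch: fetch and run the GRID 17.3 installer linked in the note above.
# Prerequisites (kernel headers, build tools) are assumed to be in place already.
wget -O NVIDIA-Linux-x86_64-550.127.05-grid-azure.run \
  https://download.microsoft.com/download/7/e/c/7ec792c9-3654-4f78-b1a0-41a48e10ca6d/NVIDIA-Linux-x86_64-550.127.05-grid-azure.run
sudo bash NVIDIA-Linux-x86_64-550.127.05-grid-azure.run
```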
articles/virtual-machines/linux/n-series-driver-setup.md (20 additions, 16 deletions)
@@ -8,7 +8,7 @@ ms.subservice: sizes
ms.collection: linux
ms.topic: how-to
ms.custom: linux-related-content
- ms.date: 04/06/2023
+ ms.date: 12/04/2025
ms.author: vikancha
ms.reviewer: padmalathas, mattmcinnes
# Customer intent: "As a cloud administrator, I want to install and configure NVIDIA GPU drivers on Linux-based N-series VMs, so that I can fully utilize their GPU capabilities for high-performance computing applications."
+ > To align with inclusive language practices, we've replaced the term "blacklist" with "blocklist" throughout this documentation. This change reflects our commitment to avoiding terminology that might carry unintended negative connotations or perceived racial bias.
+ However, in code snippets and technical references where "blacklist" is part of established syntax or tooling (for example, configuration files, command-line parameters), the original term is retained to preserve functional accuracy. This usage is strictly technical and doesn't imply any discriminatory intent.
+
To take advantage of the GPU capabilities of Azure N-series VMs backed by NVIDIA GPUs, you must install NVIDIA GPU drivers. The [NVIDIA GPU Driver Extension](../extensions/hpccompute-gpu-linux.md) installs appropriate NVIDIA CUDA or GRID drivers on an N-series VM. Install or manage the extension using the Azure portal or tools such as the Azure CLI or Azure Resource Manager templates. See the [NVIDIA GPU Driver Extension documentation](../extensions/hpccompute-gpu-linux.md) for supported distributions and deployment steps.
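If you take the extension route described above, a minimal Azure CLI sketch looks like the following. The resource group and VM names are placeholders, and the extension name `NvidiaGpuDriverLinux` with publisher `Microsoft.HpcCompute` is assumed from the linked extension documentation; check that page for the current values.

```bash
# Hedged sketch: install the NVIDIA GPU Driver Extension on an existing N-series VM.
# myResourceGroup and myGpuVm are placeholders for your own resource names.
az vm extension set \
  --resource-group myResourceGroup \
  --vm-name myGpuVm \
  --name NvidiaGpuDriverLinux \
  --publisher Microsoft.HpcCompute
```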
If you choose to install NVIDIA GPU drivers manually, this article provides supported distributions, drivers, and installation and verification steps. Manual driver setup information is also available for [Windows VMs](../windows/n-series-driver-setup.md).
@@ -53,7 +57,7 @@ Then run installation commands specific for your distribution.
### Ubuntu
- Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVIDIA and are simply packaged by Ubuntu so that they can be automatically managed by the system. Downloading and installing drivers from another source can lead to a broken system. Moreover, installing third-party drivers requires extra-steps on VMs with TrustedLaunch and Secure Boot enabled. They require the user to add a new Machine Owner Key for the system to boot. Drivers from Ubuntu are signed by Canonical and will work with Secure Boot.
+ Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVIDIA and are simply packaged by Ubuntu so that they can be automatically managed by the system. Downloading and installing drivers from another source can lead to a broken system. Moreover, installing third-party drivers requires extra steps on VMs with Trusted Launch and Secure Boot enabled. They require the user to add a new Machine Owner Key for the system to boot. Drivers from Ubuntu are signed by Canonical and work with Secure Boot.
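Because the Machine Owner Key caveat above only matters when Secure Boot is active, it can help to check the VM's Secure Boot state before installing anything. This is a hedged sketch that assumes the `mokutil` package is available on the image.

```bash
# Hedged check: report whether Secure Boot is enabled on this VM.
sudo apt-get install -y mokutil   # usually preinstalled on Ubuntu images
mokutil --sb-state                # prints "SecureBoot enabled" or "SecureBoot disabled"
```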
1. Install `ubuntu-drivers` utility:
```bash
@@ -69,9 +73,9 @@ Ubuntu packages NVIDIA proprietary drivers. Those drivers come directly from NVI
```
4. Download and install the CUDA toolkit from NVIDIA:
> [!NOTE]
- > The example shows the CUDA package path for Ubuntu 24.04 LTS. Replace the path specific to the version you plan to use.
+ > The example shows the CUDA package path for Ubuntu 24.04 LTS. Use the path that's specific to the version you plan to use.
>
- > Visit the [NVIDIA Download Center](https://developer.download.nvidia.com/compute/cuda/repos/) or the [NVIDIA CUDA Resources page](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network) for the full path specific to each version.
+ > Visit the [NVIDIA Download Center](https://developer.download.nvidia.com/compute/cuda/repos/) or the [NVIDIA CUDA Resources page](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=24.04&target_type=deb_network) for the full path that's specific to each version.
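As a companion to the note above, here is a hedged sketch of the network-repo flow for Ubuntu 24.04. The keyring package name and repository path are assumptions based on NVIDIA's current layout, so confirm them on the NVIDIA pages linked above before running the commands.

```bash
# Hedged sketch: register NVIDIA's CUDA network repository for Ubuntu 24.04 and install the toolkit.
# Verify the keyring file name and repo path against the NVIDIA pages linked above.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y cuda-toolkit
```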
@@ -156,7 +160,7 @@ For example, CentOS 8 and RHEL 8 need the following steps.
sudo yum install cuda
```
> [!NOTE]
- > If you see an error message related to missing packages like vulkan-filesystem then you may need to edit /etc/yum.repos.d/rh-cloud, look for optional-rpms and set enabled to 1
+ > If you see an error message related to missing packages like `vulkan-filesystem`, you might need to edit `/etc/yum.repos.d/rh-cloud`, look for `optional-rpms`, and set `enabled` to `1`.
>
5. Reboot the VM and proceed to verify the installation.
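A minimal verification for step 5: if the driver loaded correctly after the reboot, `nvidia-smi` lists the GPU along with the driver and CUDA versions.

```bash
# Confirm that the NVIDIA driver is loaded and the GPU is visible after the reboot.
nvidia-smi
```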
@@ -217,7 +221,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
- 3. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NVv2 VMs.) To disable the driver, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following contents:
+ 3. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NVv2 VMs.) To disable the driver, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following content:
```
blacklist nouveau
@@ -240,7 +244,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
6. When you're asked whether you want to run the nvidia-xconfig utility to update your X configuration file, select **Yes**.
- 7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/
+ 7. After installation completes, copy `/etc/nvidia/gridd.conf.template` to a new file named `gridd.conf` in `/etc/nvidia/`.
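A one-line sketch of step 7, assuming the template path shipped by the GRID installer as described above:

```bash
# Step 7: create gridd.conf from the template installed by the GRID driver.
sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
```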
@@ -253,7 +257,7 @@ To install NVIDIA GRID drivers on NV or NVv3-series VMs, make an SSH connection
EnableUI=FALSE
```
- 9. Remove the following from `/etc/nvidia/gridd.conf` if it is present:
+ 9. Remove the following from `/etc/nvidia/gridd.conf` if it's present:
```
FeatureType=0
@@ -278,7 +282,7 @@ The GRID driver installation process does not offer any options to skip kernel m
sudo yum install hyperv-daemons
```
- 2. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NV3 VMs.) To do this, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following contents:
+ 2. Disable the Nouveau kernel driver, which is incompatible with the NVIDIA driver. (Only use the NVIDIA driver on NV or NV3 VMs.) To do this, create a file in `/etc/modprobe.d` named `nouveau.conf` with the following content:
```
blacklist nouveau
@@ -312,7 +316,7 @@ The GRID driver installation process does not offer any options to skip kernel m
6. When you're asked whether you want to run the nvidia-xconfig utility to update your X configuration file, select **Yes**.
- 7. After installation completes, copy /etc/nvidia/gridd.conf.template to a new file gridd.conf at location /etc/nvidia/
+ 7. After installation completes, copy `/etc/nvidia/gridd.conf.template` to a new file named `gridd.conf` in `/etc/nvidia/`.
@@ -325,7 +329,7 @@ The GRID driver installation process does not offer any options to skip kernel m
EnableUI=FALSE
```
- 9. Remove one line from `/etc/nvidia/gridd.conf` if it is present:
+ 9. Remove one line from `/etc/nvidia/gridd.conf` if it's present:
```
FeatureType=0
@@ -345,7 +349,7 @@ If the driver is installed, Nvidia SMI will list the **GPU-Util** as 0% until yo
### X11 server
- If you need an X11 server for remote connections to an NV or NVv2 VM, [x11vnc](https://wiki.archlinux.org/title/X11vnc) is recommended because it allows hardware acceleration of graphics. The BusID of the M60 device must be manually added to the X11 configuration file (usually,`etc/X11/xorg.conf`). Add a `"Device"` section similar to the following:
+ If you need an X11 server for remote connections to an NV or NVv2 VM, [x11vnc](https://wiki.archlinux.org/title/X11vnc) is recommended because it allows hardware acceleration of graphics. The BusID of the M60 device must be manually added to the X11 configuration file (usually `/etc/X11/xorg.conf`). Add a `"Device"` section similar to the following:
```
Section "Device"
@@ -359,7 +363,7 @@ EndSection
Additionally, update your `"Screen"` section to use this device.
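One way to find the BusID value for the `"Device"` section above is to ask the installed driver. This is a hedged sketch that assumes `nvidia-xconfig`, which ships with the GRID installer, is on the path.

```bash
# Print the GPU's PCI BusID to paste into the "Device" section of xorg.conf.
nvidia-xconfig --query-gpu-info | grep 'PCI BusID'
```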
- Then, create an entry for your update script in `/etc/rc.d/rc3.d` so the script is invoked as root on boot.
+ Then create an entry for your update script in `/etc/rc.d/rc3.d` so the script is invoked as root on boot.
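A hedged sketch of that boot-time entry. The script path `/usr/local/bin/update_busid.sh` is hypothetical and stands in for whatever update script you wrote; the SysV-style `S99` prefix simply runs it late in runlevel 3.

```bash
# Hypothetical example: /usr/local/bin/update_busid.sh stands in for your own update script.
sudo chmod +x /usr/local/bin/update_busid.sh
sudo ln -sf /usr/local/bin/update_busid.sh /etc/rc.d/rc3.d/S99update_busid
```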
## Troubleshooting
- * You can set persistence mode using `nvidia-smi` so the output of the command is faster when you need to query cards. To set persistence mode, execute`nvidia-smi -pm 1`. If the VM is restarted, the mode setting goes away. You can always script the mode setting to execute upon startup.
+ * You can set persistence mode using `nvidia-smi` so the output of the command is faster when you need to query cards. To set persistence mode, run `nvidia-smi -pm 1`. If the VM is restarted, the mode setting goes away. You can always script the mode setting to run upon startup; one possible approach is sketched after this list.
* If you updated the NVIDIA CUDA drivers to the latest version and find RDMA connectivity is no longer working, [reinstall the RDMA drivers](#rdma-network-connectivity) to reestablish that connectivity.
* During installation of LIS, if a certain CentOS/RHEL OS version (or kernel) is not supported for LIS, an error “Unsupported kernel version” is thrown. Report this error along with the OS and kernel versions.
- * If jobs are interrupted by ECC errors on the GPU (either correctable or uncorrectable), first check to see if the GPU meets any of Nvidia's [RMA criteria for ECC errors](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#faq-pre). If the GPU is eligible for RMA, contact support about getting it serviced; otherwise, reboot your VM to reattach the GPU as described [here](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#bl_reset_reboot). Less invasive methods such as `nvidia-smi -r` don't work with the virtualization solution deployed in Azure.
+ * If jobs are interrupted by ECC errors on the GPU (either correctable or uncorrectable), first check to see if the GPU meets any of Nvidia's [RMA criteria for ECC errors](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#faq-pre). If the GPU is eligible for RMA, contact support about getting it serviced; otherwise, reboot your VM to reattach the GPU as described [here](https://docs.nvidia.com/deploy/dynamic-page-retirement/index.html#bl_reset_reboot). Less invasive methods, such as `nvidia-smi -r`, don't work with the virtualization solution deployed in Azure.
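One hedged way to script the persistence-mode setting from the first troubleshooting item is a root crontab `@reboot` entry, as sketched below; a systemd unit works just as well, and the `/usr/bin/nvidia-smi` path is an assumption that can differ between images.

```bash
# Hypothetical example: re-apply persistence mode at every boot via root's crontab.
(sudo crontab -l 2>/dev/null; echo "@reboot /usr/bin/nvidia-smi -pm 1") | sudo crontab -
```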