I don't think we published nvidia driver sysexts for 3815.2.5 - are you sure that's what you're running? Where are you sourcing the
As for the device nodes: we do the device node creation from our
It would also be useful if you could provide a reproducer - how do you deploy the nvidia device plugin, how exactly is the node setup, etc.
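A minimal sketch of how the details asked for above could be collected on the node. It assumes a standard `/etc/os-release` with a `VERSION_ID` field and device nodes under `/dev`; the helper names `os_version` and `count_nvidia_nodes` are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Print the VERSION_ID recorded in an os-release file (default: /etc/os-release).
os_version() {
    file="${1:-/etc/os-release}"
    # VERSION_ID may or may not be quoted, so strip any double quotes.
    sed -n 's/^VERSION_ID=//p' "$file" 2>/dev/null | tr -d '"'
}

# Count entries named nvidia* directly inside a directory (default: /dev).
count_nvidia_nodes() {
    dir="${1:-/dev}"
    find "$dir" -maxdepth 1 -name 'nvidia*' 2>/dev/null | wc -l | tr -d ' '
}

echo "OS version: $(os_version)"
echo "NVIDIA device nodes: $(count_nvidia_nodes)"

# Show which system extensions are currently merged, if the tool is available.
if command -v systemd-sysext >/dev/null 2>&1; then
    systemd-sysext status
fi
```

Posting this output together with the deployment steps would make it easier to tell whether the running release and the merged extensions match what the configuration requests.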
@jepio I am getting the following errors. This is how I install, and the errors I'm getting:
variant: flatcar
version: 1.0.0
storage:
  files:
    # Enable built-in NVIDIA kernel drivers
    - path: /etc/flatcar/enabled-sysext.conf
      mode: 0644
      contents:
        inline: |
          nvidia-drivers-570-open
    # Download official sysext-bakery extensions
    - path: /etc/extensions/kubernetes.raw
      mode: 0644
      contents:
        source: https://extensions.flatcar.org/extensions/kubernetes-v1.33.3-x86-64.raw
    - path: /etc/extensions/nvidia-runtime.raw
      mode: 0644
      contents:
        source: https://extensions.flatcar.org/extensions/nvidia-runtime-v1.17.8-x86-64.raw
    - path: /etc/extensions/cilium.raw
      mode: 0644
      contents:
        source: https://extensions.flatcar.org/extensions/cilium-v0.18.5-x86-64.raw
    - path: /etc/extensions/ollama.raw
      mode: 0644
      contents:
        source: https://extensions.flatcar.org/extensions/ollama-v0.9.6-x86-64.raw
    # Static IP network configuration
    - path: /etc/systemd/network/00-enp5s0f0.network
      mode: 0644
      contents:
        inline: |
          [Match]
          Name=enp5s0f0
          [Network]
          DHCP=no
          Address=192.168.1.240/24
          Gateway=192.168.1.1
          DNS=8.8.8.8
          DNS=8.8.4.4
systemd:
  units:
    # Ensure systemd-networkd is enabled
    - name: systemd-networkd.service
      enabled: true
    # Disable automatic reboots
    - name: locksmithd.service
      enabled: false
      mask: true
    # Configure Ollama to listen on localhost only
    - name: ollama.service
      enabled: true
      dropins:
        - name: 10-ollama-env-override.conf
          contents: |
            [Service]
            Environment="OLLAMA_HOST=127.0.0.1:11434"
    # Initialize Kubernetes using Flatcar's recommended approach
    - name: kubeadm.service
      enabled: true
      contents: |
        [Unit]
        Description=Kubeadm service
        Requires=containerd.service
        After=containerd.service network-online.target
        ConditionPathExists=!/etc/kubernetes/kubelet.conf
        [Service]
        ExecStartPre=/usr/bin/kubeadm init --pod-network-cidr=10.244.0.0/16
        ExecStartPre=/usr/bin/mkdir -p /home/core/.kube
        ExecStartPre=/usr/bin/cp /etc/kubernetes/admin.conf /home/core/.kube/config
        ExecStart=/usr/bin/chown -R core:core /home/core/.kube
        [Install]
        WantedBy=multi-user.target
    # Configure Cilium CNI
    - name: cilium.service
      enabled: true
      dropins:
        - name: 10-cilium-env-override.conf
          contents: |
            [Service]
            Environment="CILIUM_INSTALL_ARGS=--set kubeProxyReplacement=true --namespace=kube-system"
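Since the question of where the extensions are sourced from came up, here is a small sketch that lists every `source:` URL in a Butane file like the one above, so the list can be compared with what `systemd-sysext status` reports on the node. The function name `list_sysext_sources` and the file name `config.yaml` are hypothetical.

```shell
#!/bin/sh
# Print the value of every "source:" key in a Butane YAML file, one per line.
# This is a plain text match, not a YAML parser, so it assumes each
# "source:" key sits on a line of its own, as in the config above.
list_sysext_sources() {
    sed -n 's/^[[:space:]]*source:[[:space:]]*//p' "$1"
}

# Example call (file name is an assumption):
#   list_sysext_sources config.yaml
```

Note that the driver extension named in /etc/flatcar/enabled-sysext.conf has no `source:` entry and therefore does not show up in this list.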
NVIDIA Device Plugin Fails on Flatcar Container Linux with systemd Extensions
Issue Summary
NVIDIA device plugin fails to detect GPUs on Flatcar Container Linux when NVIDIA drivers are installed via systemd-sysext, despite nvidia-smi working correctly.
Environment
Problem Description
nvidia-smi shows GPU)
Pending state
/dev/nvidia*
Root Cause Analysis
1. Missing nvidia-container-runtime Binaries
The nvidia-container-toolkit sysext only provides wrapper scripts, not actual binaries:
2. CDI (Container Device Interface) Issues
3. Device Node Creation Timing
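A minimal sketch of one way to rule timing in or out, assuming the concern is that something enumerates GPUs before any /dev/nvidia* node exists. The helper name `wait_for_nvidia_nodes` and its defaults are hypothetical; it only polls for the nodes and does not create them.

```shell
#!/bin/sh
# Poll a directory until at least one nvidia* entry exists.
# Arguments: directory (default /dev), timeout in seconds (default 30).
# Returns 0 as soon as an entry is found, 1 if the timeout expires.
wait_for_nvidia_nodes() {
    dir="${1:-/dev}"
    timeout="${2:-30}"
    elapsed=0
    while :; do
        # ls exits non-zero when the glob matches nothing.
        if ls "$dir"/nvidia* >/dev/null 2>&1; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example call:
#   wait_for_nvidia_nodes /dev 60 || echo "no NVIDIA device nodes after 60s"
```

If the nodes only appear after nvidia-smi has been run by hand, that would point at node creation rather than at the device plugin itself.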
Reproduction Steps
kubectl describe node | grep nvidia.com/gpu shows no GPU resources
Workaround Attempted (Partial Success)
Expected Behavior
NVIDIA device plugin should:
nvidia.com/gpu resources in Kubernetes
Actual Behavior
nvidia.com/gpu resources available
Potential Solutions
Related Issues
Impact
This blocks GPU workload deployment on Flatcar Container Linux, forcing users to switch to traditional distributions for NVIDIA GPU support.