GPU Insights

GPU discovery, allocation pressure, placement, quota, fragmentation, and expensive-capacity usage.

Permissions: viewing checks requires insights: view. Opening a linked object or logs requires the corresponding resource permission, and the BeaverDeck ServiceAccount must be allowed to read the Kubernetes resources used by the check. Suppressing a finding requires insights: edit and affects all users.

Data Evaluated

Node `nvidia.com/gpu` allocatable capacity, active pod GPU requests, scheduling state, selected namespaces, and ResourceQuotas.
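
To make these inputs concrete, the following is a minimal sketch of how allocatable GPUs and active pod GPU requests could be read from Kubernetes objects represented as plain dictionaries. It is not BeaverDeck's implementation. The helper names are hypothetical, and the sketch assumes that "active" means a pod whose phase is neither `Succeeded` nor `Failed`. It follows the standard Kubernetes rule that a pod's effective request is the larger of the summed application-container requests and the largest regular init-container request.

```python
from typing import Any, Dict

GPU_RESOURCE = "nvidia.com/gpu"


def node_allocatable_gpus(node: Dict[str, Any]) -> int:
    """Allocatable GPU count advertised by a Node object (0 if none)."""
    allocatable = (node.get("status") or {}).get("allocatable") or {}
    return int(allocatable.get(GPU_RESOURCE, 0))


def container_gpus(container: Dict[str, Any]) -> int:
    """GPU count for one container: its request, falling back to its limit."""
    resources = container.get("resources") or {}
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    return int(requests.get(GPU_RESOURCE) or limits.get(GPU_RESOURCE) or 0)


def is_active(pod: Dict[str, Any]) -> bool:
    """Assumption: a pod is active unless it has finished running."""
    phase = (pod.get("status") or {}).get("phase")
    return phase not in ("Succeeded", "Failed")


def pod_gpu_request(pod: Dict[str, Any]) -> int:
    """Effective GPU request: max(sum of app containers, largest init container)."""
    spec = pod.get("spec") or {}
    app_total = sum(container_gpus(c) for c in spec.get("containers") or [])
    init_peak = max(
        (container_gpus(c) for c in spec.get("initContainers") or []),
        default=0,
    )
    return max(app_total, init_peak)
```

Summing `pod_gpu_request` over the active pods on a node, and comparing the result with `node_allocatable_gpus`, gives the kind of per-node figures the checks below reason about.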

Checks

Check	When it reports	Alert severity
GPU Capacity Discovery `gpu-capacity-discovery`	Selected namespaces contain active GPU requests, but no Node advertises allocatable `nvidia.com/gpu` capacity.	Critical
GPU Allocation Pressure `gpu-allocation-pressure`	Active pod GPU requests reach at least 80% of allocatable GPUs on a GPU node or across the selected namespaces.	Warning at 80%; critical at 95%
GPU Node Scheduling `gpu-node-cordoned`	A Node advertises GPU capacity and is marked unschedulable.	Warning
GPU Idle Allocation `gpu-node-idle-allocation`	A Node advertises GPU capacity, but no active pod in the selected namespaces requests a GPU on that node.	Warning
GPU Node Workload Mix `non-gpu-pods-on-gpu-node`	An active, non-DaemonSet Pod without a GPU request is scheduled on a GPU node.	Warning
GPU Fragmentation `gpu-fragmentation`	A GPU Pod has been Pending for at least 5 minutes, total free GPU capacity across schedulable GPU nodes is sufficient, but no single node has enough free GPUs for that Pod.	Warning
Namespace GPU Usage `gpu-namespace-usage`	A selected namespace has one or more active GPU workload requests. BeaverDeck reports the total requested GPU count and whether a GPU quota exists.	Informational passing check
GPU Quota `gpu-quota`	A selected namespace has active GPU requests but no ResourceQuota hard limit for `requests.nvidia.com/gpu`, `limits.nvidia.com/gpu`, or `nvidia.com/gpu`.	Warning
GPU Pod Pending `gpu-pod-pending`	An active Pod requests GPUs and remains Pending for at least 5 minutes.	Warning
GPU Pod Readiness `gpu-pod-unready`	A GPU-requesting Pod is assigned to a node but remains not Ready for at least 10 minutes from its Ready-condition transition or creation time.	Warning
GPU Pod Requests `gpu-pod-requests`	An active GPU-requesting Pod has an init or application container without a CPU request or memory request.	Warning
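
As an illustration of the GPU Allocation Pressure thresholds in the table above, the sketch below maps a requested and allocatable GPU count to a severity. The function name and the `None` return for "no finding" are assumptions made for this example, not BeaverDeck's API. Integer arithmetic is used so that exact boundary values such as 19 of 20 GPUs are not affected by floating-point rounding.

```python
from typing import Optional

WARNING_PERCENT = 80   # threshold from the GPU Allocation Pressure row
CRITICAL_PERCENT = 95  # threshold from the GPU Allocation Pressure row


def allocation_pressure_severity(requested: int, allocatable: int) -> Optional[str]:
    """Return "critical", "warning", or None when below the warning threshold."""
    if allocatable <= 0:
        # No advertised capacity: pressure is undefined in this sketch.
        return None
    # Compare requested / allocatable against the thresholds without division.
    if requested * 100 >= CRITICAL_PERCENT * allocatable:
        return "critical"
    if requested * 100 >= WARNING_PERCENT * allocatable:
        return "warning"
    return None
```

For example, 7 of 8 GPUs requested (87.5%) yields a warning, while a fully requested 8-GPU node yields a critical finding.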

Open an individual check for risk context, recommended response, and limitations. Passing checks are visible when Show all checks is enabled in BeaverDeck.
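
The GPU Fragmentation condition listed in the table can be hard to picture, so here is a minimal sketch of it under stated assumptions. The function name and parameters are hypothetical. `free_per_node` is assumed to hold the free GPU count (allocatable minus requested) for each schedulable GPU node, and the 5-minute Pending threshold is taken from the table.

```python
from typing import Sequence

MIN_PENDING_SECONDS = 5 * 60  # Pending threshold from the GPU Fragmentation row


def is_fragmented(
    pod_gpus: int,
    free_per_node: Sequence[int],
    pending_seconds: float,
) -> bool:
    """True when total free GPUs would fit the pod but no single node can."""
    if pod_gpus <= 0 or pending_seconds < MIN_PENDING_SECONDS:
        return False
    # Treat overcommitted nodes (negative free count) as having zero free GPUs.
    free = [max(count, 0) for count in free_per_node]
    enough_in_total = sum(free) >= pod_gpus
    fits_on_one_node = any(count >= pod_gpus for count in free)
    return enough_in_total and not fits_on_one_node
```

With a Pod that requests 4 GPUs and has been Pending for 10 minutes, free counts of 2, 2, and 1 across three nodes are fragmented: 5 GPUs are free in total, but no node offers 4. If one node had 4 free GPUs, the Pod would fit and the condition would not hold.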