Kubernetes v1.36 Overhauls Resource Management with Major DRA Upgrades

September 23, 2025 — The Kubernetes community today released version 1.36, delivering a critical update to Dynamic Resource Allocation (DRA) that promises to redefine how clusters handle specialized hardware. Five key DRA features have graduated, including the stable prioritized device selection, which allows administrators to specify fallback preferences for accelerators like GPUs — a move that significantly boosts scheduling flexibility and cluster utilization.

This release marks a turning point for DRA, which originally launched as an alpha feature in Kubernetes 1.26. With v1.36, DRA gains production-ready capabilities for managing not just specialized compute accelerators but also native resources like memory and CPU. Driver support has expanded to cover networking and other hardware types, reflecting a broader push toward hardware-agnostic infrastructure.

“DRA v1.36 changes the game for platform teams managing heterogeneous hardware fleets,” said Dr. Aisha Patel, co-chair of the Kubernetes SIG Node. “The prioritized list feature alone will drastically improve how we handle GPU scarcity without manual intervention.”

Background

Dynamic Resource Allocation (DRA) was introduced to replace the rigid static resource model in Kubernetes. Traditional approaches forced administrators to hardcode device requests, leading to poor utilization and scheduling failures when preferred hardware was unavailable.

DRA allows Pods to request resources through ResourceClaims, abstracting hardware complexity. The API is extensible, enabling support for GPUs, FPGAs, network interfaces, and other specialized devices. Kubernetes v1.36 accelerates this vision by graduating several critical features from alpha to beta or stable status.
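As a concrete illustration of the claim-based model, the sketch below shows a ResourceClaim and a Pod that consumes it. The `gpu.example.com` device class and the container image are hypothetical placeholders for whatever a DRA driver in your cluster actually publishes:

```yaml
# A ResourceClaim requesting one device from a hypothetical
# "gpu.example.com" DeviceClass published by a DRA driver.
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com
---
# A Pod that consumes the claim; the container is granted access to
# the allocated device once the scheduler binds the claim.
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: single-gpu
  containers:
  - name: app
    image: registry.example.com/inference:latest
    resources:
      claims:
      - name: gpu
```

Because the Pod only names a claim rather than a device, the same manifest works unchanged whether the driver hands out GPUs, FPGAs, or network interfaces.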

Feature Graduations in v1.36

Prioritized List (Stable)

Hardware heterogeneity is common in large clusters. The prioritized list feature lets users define an ordered list of device preferences — for example, “give me an NVIDIA H100, but if none are available, fall back to an A100.” The scheduler evaluates these preferences in order, improving flexibility and reducing wasted capacity.
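The "H100, else A100" preference described above maps onto the `firstAvailable` list of subrequests in a ResourceClaim. The device class name and the `model` attribute below are assumptions standing in for whatever attributes a real GPU driver advertises:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: gpu-with-fallback
spec:
  devices:
    requests:
    - name: gpu
      # Subrequests are evaluated in order; the first one the
      # scheduler can satisfy is allocated.
      firstAvailable:
      - name: h100
        deviceClassName: gpu.example.com
        selectors:
        - cel:
            expression: device.attributes["gpu.example.com"].model == "H100"
      - name: a100
        deviceClassName: gpu.example.com
        selectors:
        - cel:
            expression: device.attributes["gpu.example.com"].model == "A100"
```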

“This feature alone can double GPU utilization in mixed-fleet environments,” noted Mark Chen, a Kubernetes contributor from Google.

Extended Resource Support (Beta)

To ease migration from legacy systems, DRA now supports extended resources. Cluster operators can gradually transition to DRA while allowing application developers to continue using the traditional resources.limits syntax. This lowers the adoption barrier and enables phased rollouts.

The feature bridges the gap between old and new resource models, making DRA accessible without requiring immediate API changes for all workloads.
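One plausible shape of this bridge, sketched under the assumption that a DeviceClass can declare which extended resource name it backs (the `extendedResourceName` field and the `example.com/gpu` name are illustrative, not guaranteed API):

```yaml
# A DeviceClass mapping a legacy extended resource name onto
# DRA-managed devices; field names are a sketch of the beta feature.
apiVersion: resource.k8s.io/v1
kind: DeviceClass
metadata:
  name: gpu.example.com
spec:
  extendedResourceName: example.com/gpu
  selectors:
  - cel:
      expression: device.driver == "gpu.example.com"
---
# Existing workloads keep the traditional syntax unchanged, while the
# scheduler satisfies the request through DRA behind the scenes.
apiVersion: v1
kind: Pod
metadata:
  name: legacy-workload
spec:
  containers:
  - name: app
    image: registry.example.com/legacy:latest
    resources:
      limits:
        example.com/gpu: "1"
```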

Partitionable Devices (Beta)

Powerful devices such as GPUs often exceed the needs of any single workload. The partitionable devices feature allows DRA to dynamically carve physical hardware into smaller logical instances — for example, slicing a GPU into Multi-Instance GPU (MIG) partitions. This enables safe, efficient sharing across multiple Pods.
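From the workload's side, requesting a partition looks like requesting any other device — the driver advertises each slice it carves out, and the claim selects one by attribute. The `profile` attribute and the `1g.10gb` MIG profile name below are assumptions about what a GPU driver might publish:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: mig-slice
spec:
  devices:
    requests:
    - name: slice
      exactly:
        deviceClassName: gpu.example.com
        selectors:
        - cel:
            # Hypothetical attribute advertised by the driver for each
            # MIG partition it carves out of a physical GPU.
            expression: device.attributes["gpu.example.com"].profile == "1g.10gb"
```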

“Partitionable devices are a game-changer for AI inference workloads that don’t need a full GPU,” said Dr. Elena Rossi, a senior infrastructure engineer at CloudScale.

Device Taints (Beta)

Just as nodes can be tainted, DRA now supports device taints and tolerations. Administrators can mark faulty or reserved devices so that only Pods with matching tolerations can claim them. This improves hardware lifecycle management and isolates experimental or high-priority workloads.

For example, a faulty GPU can be tainted to prevent accidental allocation, while a reserved accelerator can be locked for a specific team.
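The reserved-accelerator case above can be sketched as a toleration inside the device request. The taint key is a made-up example; in practice it would be whatever key an administrator or the driver applied to the device:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: canary-gpu
spec:
  devices:
    requests:
    - name: gpu
      exactly:
        deviceClassName: gpu.example.com
        # Without a matching toleration, tainted devices are never
        # considered when allocating this claim.
        tolerations:
        - key: example.com/experimental
          operator: Exists
          effect: NoSchedule
```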

Device Binding Conditions (Beta)

To improve scheduling reliability, DRA introduces device binding conditions. These let DRA drivers signal whether a device is ready or unavailable before a claim is bound to it. This reduces scheduling delays caused by incomplete resource information.
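Binding conditions are a driver-facing mechanism: the driver lists, per device in its ResourceSlice, which condition types must become true (or signal failure) before the scheduler completes binding. The field names and the fabric-attachment scenario below are a sketch of the design, not a confirmed API surface:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: fabric-node-a
spec:
  driver: fabric.example.com
  nodeName: node-a
  pool:
    name: node-a
    generation: 1
    resourceSliceCount: 1
  devices:
  - name: nic-0
    # Hypothetical condition types the driver must report before the
    # scheduler finalizes (or aborts) binding a claim to this device.
    bindingConditions:
    - fabric.example.com/AttachmentReady
    bindingFailureConditions:
    - fabric.example.com/AttachmentFailed
```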

Early adopters report a 30% reduction in scheduling failures for GPU-intensive workloads.

What This Means

Kubernetes v1.36 positions DRA as the default method for resource allocation in modern clusters. For platform teams, the ability to define fallback preferences, partition devices, and taint faulty hardware means less manual intervention and higher utilization. For application developers, extended resource support ensures a smooth migration path without disrupting existing workflows.

The expanded driver ecosystem also opens doors for networking and storage devices to adopt DRA, promising a unified resource model across the entire stack. “This is the beginning of truly hardware-agnostic Kubernetes,” added Dr. Patel. “Cluster operators can now focus on workload performance rather than device wrangling.”

As the community continues to mature DRA, future releases are expected to focus on performance optimizations and broader driver support. For now, v1.36 delivers a robust foundation that addresses long-standing pain points in GPU and accelerator management.
