Senior Systems Software Engineer, Kubernetes Node Lifecycle - DGX Cloud

Nvidia

Quick summary

Work type: On-site
Location: Santa Clara, CA; Seattle, WA
Salary: $184,000–$287,500 / yr
Posted: 6 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: salary scale from $147k to $303k, marking similar roles at $199k and this role at $236k.]

This role pays more than 80% of similar roles, most of which pay $162,000–$235,750. At the midpoint, this role pays about $236k versus about $199k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 980 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 966 roles with salary data.

At a glance

TL;DR · Senior Systems Software Engineer, Kubernetes Node Lifecycle - DGX Cloud

As a Senior Systems Software Engineer in NVIDIA's DGX Cloud division, you will join a team advancing AI computing on a global scale. Your primary responsibilities include directing the development of Cluster API (CAPI) providers for NVIDIA Kubernetes Engine (NKE), ensuring consistent node provisioning across diverse environments, and maintaining OS image build pipelines that meet enterprise security standards. You will also develop automated test suites, manage nodepool lifecycles at scale, resolve production issues, and collaborate with upstream communities on node provisioning and lifecycle standards. The role requires deep expertise in Kubernetes node engineering, Cluster API, and cloud infrastructure, along with proficiency in Golang or Python and experience with major public clouds. Your work will directly affect NVIDIA's ability to support internal researchers and enable NCPs.

What you'll do

  • Direct the building and refinement of CAPI providers for NVIDIA Kubernetes Engine.
  • Develop bring-your-own-node workflows for integrating diverse NVIDIA hardware into NKE clusters.
  • Coordinate OS image generation, packaging, deployment, and update processes for NKE nodes.
  • Develop automated test suites for node images to verify accuracy across Kubernetes versions.
  • Handle nodepool lifecycle at scale, including provisioning, upgrades, and seamless node replacement.
  • Examine and resolve underlying causes of node-layer faults in production NKE clusters.
  • Partner with upstream communities to establish node provisioning and lifecycle standards.

What we're looking for

  • 8+ years of experience in systems software, cloud infrastructure, or Kubernetes node engineering.
  • Deep expertise in Cluster API (CAPI) for full machine lifecycle management.
  • Extensive experience with OS image build pipelines and delivery systems for Kubernetes nodes.
  • Practical experience with bring-your-own-node models and large-scale nodepool lifecycle management.
  • Strong understanding of kubelet configuration and the Kubernetes node registration lifecycle.
  • Experience with node image security, including vulnerability scanning and compliance gating.

More like this

Similar roles

Senior Software Engineer - Cloud and Kubernetes

Nvidia

Remote (Santa Clara, CA) · 46 days ago · $184,000–$287,500
Kubernetes Go C++ CI/CD Jenkins GitLab GitHub Python Rust Docker Prometheus Grafana NVIDIA GPUs ConnectX BlueField NICs HPC AI Infrastructure Networking

Senior Kubernetes Software Engineer

Broadcom

Palo Alto, CA · 69 days ago · $120,000–$192,000
Kubernetes Go CNCF CI/CD vSphere Docker Terraform AWS GCP Azure PostgreSQL Prometheus GitLab GitHub Maven Jenkins Ansible Python Shell scripting

Senior System Software Engineer, Kubernetes and KubeVirt

Nvidia

Remote (Santa Clara, CA) · 131 days ago · $184,000–$287,500
Kubernetes KubeVirt Go CI/CD REST gRPC Docker APIs Cloud Infrastructure Virtualization Container Orchestration Load Balancing Security Multi-Tenant Cloud Platforms AI-Assisted Development Tools CNCF/Open Source Projects Device Plugins

Senior Kubernetes Platform Engineer

Oracle

Austin, TX · 5 days ago · $96,800–$223,400
Kubernetes Go gRPC protobuf Helm Cilium Linkerd Traefik Istio Gatekeeper OCI Terraform Docker CI/CD Linux TCP/IP TLS