Principal Software Engineer - DGX Cloud
Nvidia
At a glance
AI generated
NVIDIA’s DGX Cloud team seeks Principal Software Engineers to lead technical direction in Kubernetes-based operations, automation, and reliability for large-scale GPU clusters across internal and cloud partner environments. This senior role involves defining the architecture for cluster lifecycle management, validation, repair, upgrades, observability, and readiness, while establishing patterns for Kubernetes-based GPU cluster operations. Key responsibilities include reducing operational overhead through software and automation, setting technical standards for production readiness, mentoring engineers, and influencing cross-functional teams. Ideal candidates have over 15 years of experience in building and operating large-scale distributed systems or cloud infrastructure, with expertise in Kubernetes, Linux, Go, Python, and production operations. Experience with GPU clusters, AI/ML infrastructure, GitOps, and multi-cloud fleet operations is a plus.
Market check
This $272,000–$431,250 range sits above 100% of similar postings on FindRole.
Peer median band
$138,060–$226,000
Median floor and ceiling across peers.
Typical midpoint (25–75%)
$151,875–$215,850
Middle half of comparable postings.
Based on 240 comparable postings.
* 240 is the maximum number of comparable postings sampled.
Employer
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 801 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.