Director, Software Engineering (Site Reliability Engineering)
Affirm
At a glance
AI generated
As a Site Reliability and Software Engineering leader in NVIDIA's DGXC Cloud Reliability organization, you will manage a team of engineers responsible for the software, automation, and operations of multi-colo distributed GPU cloud clusters. The role involves contributing to product strategy, growing your team, and ensuring operational excellence through scalable SDLC practices and modern methodologies. You will work closely with project management teams to drive technical projects and provide leadership in an innovative environment, focusing on delivering reliable systems for both internal and external users. The ideal candidate has more than 12 years of engineering experience, including at least five years in leadership roles, with expertise in designing large-scale distributed systems and managing DevOps teams. Strong knowledge of Unix/Linux, containerization, virtualization, and cluster solutions is essential, along with the ability to influence cross-functional partners and mentor team members effectively.
Market check
This $320,000–$488,750 range sits above 99% of similar postings on FindRole.
Peer median band
$162,900–$257,250
Median floor and ceiling across peers.
Typical midpoint (25–75%)
$170,000–$244,000
Middle half of comparable postings.
Based on 240 comparable postings.
* 240 is the maximum number of comparable postings sampled.
Employer
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 801 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.