Senior Datacenter Technical Program Manager, At-Scale AI Clusters

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $168,000–$258,750 / yr
Posted: 7 days ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

This role pays more than 60% of similar roles. Most similar roles pay $152,753–$236,187. At the midpoint, this role pays about $213k versus about $194k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 994 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 977 roles with salary data.

At a glance

TL;DR · Senior Datacenter Technical Program Manager, At-Scale AI Clusters

NVIDIA seeks a Technical Program Manager to join its Applied Systems Engineering team and drive datacenter integration for next-generation AI supercomputing systems. The TPM will collaborate across hardware and software teams to build large-scale GPU computing systems, lead the integration of new AI clusters into datacenters with stringent power and cooling requirements, coordinate facility design and fit-out, produce detailed documentation, and work with engineering leadership to resolve critical issues. The ideal candidate has a BS in Applied Science or Engineering (or equivalent), 8+ years of experience, expertise in high-performance computing systems, and familiarity with datacenter design and with system monitoring tools and protocols such as Prometheus, Grafana, and BACnet. Strong teamwork skills are essential for coordinating multiple teams in this fast-paced environment.

What you'll do

  • Lead integration of new AI clusters into datacenter environments with stringent power and cooling requirements.
  • Coordinate design and construction of new datacenter facilities for GPU computing systems.
  • Produce comprehensive documentation for datacenter fit-out and system integration processes.
  • Collaborate with engineering leaders to prioritize and resolve critical issues for large-scale deployments.
  • Drive the development of reference architectures for AI supercomputing systems.

What we're looking for

  • 8+ years of experience in technical roles related to hardware/software systems.
  • BS in Applied Science or Engineering (or equivalent).
  • Extensive experience with GPU clusters and HPC systems in datacenters.
  • Strong problem-solving skills for complex technical challenges.
  • Proven ability to collaborate effectively across multiple engineering teams.
  • Deep understanding of datacenter design, including power and cooling technologies.
  • Expertise in system monitoring tools such as Prometheus, Grafana, and Splunk.

More like this

Similar roles

Datacenter AI Systems and Solutions Engineer, Sr Staff

Qualcomm

San Diego, CA · 5 days ago · $162,600–$244,000
Python, Docker, Kubernetes, MLOps, GitOps, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, Slurm, Apache Kafka, OpenAPI, Swagger, Terraform, Ansible, Jenkins, GitHub, GitLab, Bitbucket, Travis CI, CircleCI

Senior Manager, Data Center Facilities Development

Oracle

Abilene, TX · 59 days ago · $120,100–$251,600
Oracle Cloud Infrastructure Data Center Construction Project Management CI/CD Budget Management Risk Management Vendor Management Regulatory Compliance MEP Infrastructure High Density Liquid Cooling Base Building Data Center Construction Problem Solving Strategic Planning