Technical Program Manager, Cloud Infrastructure

Nvidia

Hybrid · Actively hiring
Santa Clara, CA · Seattle, WA · Posted 21 days ago · $168,000–$258,750 / year

At a glance

AI generated

TL;DR

NVIDIA’s DGX Cloud team is seeking a Technical Program Manager (TPM) to drive critical programs for AI capacity enablement and management. The TPM will work with internal engineering, infrastructure, and software teams, and externally with cloud service providers (CSPs) and NVIDIA Cloud Partners (NCPs), to build AI capacity and infrastructure globally. Key responsibilities include defining technical requirements for CSPs and NCPs, developing comprehensive roadmaps, managing ongoing capacity operations, and using Jira for program management. The ideal candidate has 10+ years of technical program management experience in large-scale cloud infrastructure programs, expertise in cloud technologies such as Kubernetes and Terraform, and a deep understanding of NVIDIA GPU products. Strong Jira proficiency and strategic thinking are essential, as are excellent communication skills for executive audiences.

Skills

Jira · Kubernetes · Terraform · API integration · CI/CD · AWS · Azure · GCP · PostgreSQL · Docker · Prometheus · Grafana · GitLab · Python · NVIDIA GPUs · Cloud-native environments · AI infrastructure · ML infrastructure

What you'll do

  • Define and communicate technical requirements to CSPs and NCPs for AI capacity.
  • Drive alignment with external partners on managed storage, network solutions, and NVIDIA’s roadmap.
  • Manage ongoing capacity operations and engineering engagement with cloud providers.
  • Identify opportunities for third-party and in-house cloud infrastructure solution adoption.
  • Establish KPIs and demonstrate the value delivered by programs quantitatively.

What we're looking for

  • 10+ years of technical program management experience in large-scale cloud infrastructure programs.
  • Extensive hands-on experience in cloud infrastructure, preferably from a major Cloud Service Provider (CSP).
  • Expert-level proficiency with Jira or similar program management tools and ability to guide engineering teams.
  • Domain knowledge in bring-up and end-to-end operations of compute, storage, and GPU systems.
  • Outstanding strategic and tactical thinking abilities with strong consensus-building skills.
  • BS or MS in Electrical Engineering or Computer Science, or equivalent experience.
  • In-depth knowledge of NVIDIA GPU products, including deployment and bring-up processes.

Market check

Salary context

This $168,000–$258,750 range sits above 77% of similar postings on FindRole.

Peer median band

$127,666–$225,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$143,012–$213,375

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.


More like this

Similar roles

Senior Technical Program Manager, Cloud Infrastructure

Nvidia

Santa Clara, CA, US · 28 days ago · $168,000–$258,750
Jira Kubernetes Terraform API integration CI/CD NVIDIA GPU products Cloud Service Providers PostgreSQL Python Docker AWS Azure Grafana Prometheus Scrum DevOps

Senior Technical Program Manager, Cloud Infrastructure

Nvidia

Santa Clara, CA, US · 21 days ago · $200,000–$322,000
Jira Kubernetes Terraform API integration Python CI/CD Prometheus Grafana NVIDIA GPU products AWS Azure Google Cloud Platform PostgreSQL Docker Git Scrum Agile methodologies

Senior Technical Program Manager, DGX Cloud Software Products and Services

Nvidia

Santa Clara, CA, US · 25 days ago · $168,000–$258,750
Jira Aha! Confluence Git Distributed version control systems Reliability engineering Resilience development Service performance metrics Goodput Efficiency Utilization Distributed training frameworks Checkpointing NCCL Slurm AI infrastructure Large-scale compute platforms CI/CD

Senior Technical Program Manager, Software Compute Platform

Nvidia

Santa Clara, CA, US · 49 days ago · $200,000–$322,000
Python Java C++ Git Jenkins Black_Duck Palamida Docker Kubernetes AWS CI/CD PostgreSQL Linux OSS_profiling Version_Control Release_Management Test_Plans Automation_Scripts

Sr Staff Engineer, Cloud Infrastructure

Gap Inc

Remote (SF - 2 Folsom, US) · 98 days ago
GitHub Terraform Git GitHub Actions Docker Kubernetes CI/CD Azure GCP GitHub Enterprise Cloud Prometheus Grafana Python Go