Technical Program Manager, Cloud Infrastructure

Nvidia

Hybrid

Quick summary

Work type: Hybrid
Location: Santa Clara, CA; Seattle, WA
Salary: $168,000–$258,750 / yr
Posted: 9 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Pay comparison chart: similar roles about $179k, this role about $213k, on a scale from $116k to $274k.

This role pays more than 77% of similar roles. Most similar roles pay $145,000–$213,375. At the midpoint, this role pays about $213k, versus about $179k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 980 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 966 roles with salary data.

At a glance

TL;DR · Technical Program Manager, Cloud Infrastructure

NVIDIA’s DGX Cloud team is looking for a Technical Program Manager (TPM) to drive critical programs related to AI capacity enablement and management. This TPM will collaborate with engineering, infrastructure, and software teams internally, as well as CSPs and NCPs externally, to build global AI infrastructure. Key responsibilities include defining requirements for cloud service providers, managing capacity operations, and ensuring adherence to product lifecycle processes using Jira. The ideal candidate has over 10 years of technical program management experience in large-scale cloud infrastructure projects, with expertise in GPU bring-up and end-to-end operations. Proficiency in tools like Jira and knowledge of Kubernetes, Terraform, and AI/ML infrastructure are essential, along with strong strategic thinking and communication skills to drive consensus and process improvements within a dynamic environment.

What you'll do

  • Define and communicate technical requirements to CSPs and NCPs for AI capacity.
  • Drive alignment with external partners on managed storage and network solutions.
  • Develop comprehensive roadmaps and establish clear milestones for engineering teams.
  • Manage ongoing capacity operations, focusing on availability and maintenance metrics.
  • Partner with teams within NVIDIA to understand workload needs and optimize infrastructure readiness.
  • Identify opportunities to onboard third-party cloud infrastructure solutions for DGX Cloud.
  • Establish KPIs and demonstrate the value delivered by programs quantitatively.

What we're looking for

  • 10+ years of technical program management experience driving large-scale cloud infrastructure programs.
  • Extensive hands-on experience in cloud infrastructure, preferably from a major Cloud Service Provider (CSP).
  • Expert-level proficiency with Jira and other program management tools to guide engineering teams.
  • Deep domain knowledge in bring-up and end-to-end operations of compute, storage, and GPU systems.
  • Outstanding strategic and tactical thinking abilities with strong consensus-building skills.
  • BS or MS in Electrical Engineering or Computer Science, or equivalent experience.
  • In-depth knowledge of NVIDIA GPU products, including deployment and bring-up processes.

More like this

Similar roles

Senior Technical Program Manager, Cloud Infrastructure

Nvidia

Santa Clara, CA +1 · 2 days ago · $168,000–$258,750
Jira, Kubernetes, Terraform, Docker, CI/CD, Prometheus, Grafana, Python, PostgreSQL, AWS, Azure, NVIDIA GPUs, HPC, Cloud-Native Architectures, Scrum, Agile, DevOps, Infrastructure Automation, Hardware Validation, Remote Fleet Bootstrapping

Senior Technical Program Manager, Cloud Infrastructure NPI

Nvidia

Santa Clara, CA +1 · 16 days ago · $168,000–$258,750
Kubernetes, CI/CD, JIRA, Confluence, GPU, AWS, Azure, Grafana, Prometheus, Terraform, Python, PostgreSQL, Docker, Ansible, GitLab, New Product Introduction (NPI), AI infrastructure, Process automation, Observability, Health check frameworks

Senior Technical Program Manager, DGX Cloud Software Products and Services

Nvidia

Santa Clara, CA · 44 days ago · $168,000–$258,750
Jira, Aha!, Confluence, Git, Distributed version control systems, Reliability engineering, Resilience development, Service performance metrics, Goodput, Efficiency, Utilization, Distributed training frameworks, Checkpointing, NCCL, Slurm, AI infrastructure, Large-scale compute platforms, CI/CD