Technical Product Manager - AI Infra Resilience

Nvidia

Quick summary

Work type
On-site
Location
Santa Clara, CA · New York, NY · Seattle, WA
Salary
$208,000–$327,750 / yr
Posted
3 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles: $206k
This role: $268k
[Chart: pay scale from $148k to $347k; the shaded band marks where most similar roles pay]

This role pays more than 90% of similar roles. Most pay $168,960–$244,025 — the shaded band above. At the midpoint, this role pays about $268k versus about $206k for comparable roles.

Based on 239 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 994 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 977 roles with salary data.

At a glance

TL;DR · Technical Product Manager - AI Infra Resilience

NVIDIA seeks a Technical Product Manager to lead the development of its AI factory resilience platform, focusing on creating robust underpinnings and integrations that ensure adaptability across various architectures and workloads. This role involves defining product roadmaps, designing features like telemetry interfaces and observability APIs, and fostering collaboration within the open-source community through GitHub interactions. The ideal candidate has over 12 years of experience in technical product management or software engineering, with deep expertise in data center operations, GPU infrastructure, Kubernetes, and developer platforms. They should excel at translating customer requirements into strategic initiatives while effectively communicating across diverse teams from developers to executives, and have a track record of delivering scalable APIs and contributing to open-source projects.

What you'll do

  • Define the resilience platform's product roadmap and deliver specific features.
  • Convert developer pain points into actionable features and integration partnerships.
  • Prioritize GitHub issues and gather feedback from the open-source community.
  • Design, prioritize, and execute feature development with engineering teams.
  • Align cross-functionally on requirements, roadmaps, messaging, and engagements.

What we're looking for

  • 12+ years of experience in product management, solutions architecture, or software engineering.
  • Bachelor's degree in Computer Science or equivalent technical experience.
  • Technical expertise in data center operations, GPU infrastructure, and container orchestration.
  • Ability to translate customer requirements into product strategy for senior technical stakeholders.
  • Experience building products for data center infrastructure operations and observability.
  • Practical experience contributing to open-source projects on GitHub.
  • Strong communication skills across developer and executive audiences.

More like this

Similar roles

Senior Product Manager, AI Factory Infra

Nvidia

Santa Clara, CA +2 · 19 days ago · $208,000–$327,750
AWS Kubernetes Terraform Docker CI/CD Prometheus Grafana Python PostgreSQL MLOps GPU Reliability Engineering SLO Chaos Engineering Agentic AI Workflow Orchestration RMA Vendor SLA Oversight
Hybrid

Product Manager, AI Infrastructure

Arm Holdings

San Jose, California · 51 days ago · $211,600–$286,200
AI ML Datacenter Cloud Product Management Systems Thinking Competitive Analysis GenAI ML Accelerators Hardware Software Infrastructure Performance Analysis System Layers End-to-End System Behavior Problem Solving Technical Analysis Customer Workloads Product Roadmap Engineering Priorities
Hybrid

AI Product Manager

Blackline

New York, NY · 4 days ago · $133,000
SAP NetSuite Sage Intacct ERP Python SQL PostgreSQL CI/CD AWS Kubernetes Docker Terraform Prometheus Grafana
Hybrid

Principal Technical Product Manager - Agentic AI

Intuit

San Diego, CA +1 · 59 days ago · $243,000–$328,500
LLM Agentic Systems Workflow Orchestration API Contracts State Management CI/CD Observability SDKs Docker Kubernetes Python PostgreSQL AWS GCP Azure Prometheus Grafana
Hybrid

Senior Applied AI Product Manager

Adobe

San Jose +1 · 14 days ago · $165,600–$239,725
SQL LLM Agentic architectures API design Figma FigJam Miro Jira Confluence Git CI/CD Python PostgreSQL AWS Kubernetes

Lead Product Manager, AI Platform

The Walt Disney Company

Remote (USA - FL - Kirkman Point 1, US) · 6 days ago · $148,300–$198,800
Python LLM-powered applications Prompt engineering Agent frameworks Evaluation pipelines Retrieval-augmented generation LiteLLM OpenRouter Bedrock n8n LangGraph Temporal Arize Phoenix Langfuse MCP OpenWebUI CI/CD AI observability tooling
Remote