Senior Technical Product Manager – DGX Enterprise Infrastructure and Cloud-Native Operations

Nvidia

Actively hiring
Santa Clara, US | Posted 18 days ago | $208,000–$327,750 / year

At a glance

AI generated

TL;DR

NVIDIA seeks a Senior Product Manager to architect the operational future of Enterprise AI by transforming raw DGX hardware into high-availability, self-healing AI Factories. The role sets the vision for deploying, managing, and scaling enterprise AI systems, from bare-metal provisioning to automated firmware rollouts. Key responsibilities include developing telemetry and diagnostic suites, integrating DGX systems with Kubernetes, standardizing deployments through APIs and services, and driving predictive operations that sustain peak performance without manual intervention. Ideal candidates have over 12 years of product management experience in on-premises infrastructure, expertise in cloud-native technologies such as Kubernetes, and a deep understanding of data center networking and storage architectures. They should also have leadership skills, familiarity with automation tools such as Ansible and Terraform, and a vision for using AI to manage AI infrastructure.

Skills

Kubernetes, Terraform, Python, Ansible, Prometheus, Grafana, InfiniBand, Linux, APIs, CI/CD, MIG, Docker, PostgreSQL, AIOps, Infrastructure-as-Code, Pulumi, GitOps

What you'll do

  • Define the "Day 0 through Day 2" experience for DGX SuperPODs, including provisioning and firmware rollouts.
  • Develop a telemetry and diagnostic suite to instantly isolate issues in private data centers.
  • Lead integration of DGX systems into cloud-native ecosystems like Kubernetes.
  • Standardize enterprise DGX deployments with APIs and services to eliminate management snowflakes.
  • Drive predictive operations features for automated health checks and self-healing infrastructure.

What we're looking for

  • Over 12 years of product management experience in on-premises infrastructure or private cloud environments.
  • Bachelor’s degree in Computer Science or equivalent technical field.
  • Proven ability to transform complex hardware operations into software-defined workflows.
  • Expertise in Kubernetes operators and container orchestration for large-scale systems.
  • Extensive experience managing Linux fleets in air-gapped enterprise data centers.
  • Deep knowledge of data center networking, storage architectures, and firmware-to-OS integration.
  • Vision for using AI (AIOps) to predict and prevent infrastructure failures.

Market check

Salary context

This $208,000–$327,750 range sits above 87% of similar postings on FindRole.

Peer median band

$168,000–$255,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$173,718–$244,875

Middle half of comparable postings.

Based on 238 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Technical Program Manager, DGX Cloud Software Products and Services

Nvidia

Santa Clara, CA, US | 25 days ago | $168,000–$258,750
Jira, Aha!, Confluence, Git, Distributed version control systems, Reliability engineering, Resilience development, Service performance metrics, Goodput, Efficiency, Utilization, Distributed training frameworks, Checkpointing, NCCL, Slurm, AI infrastructure, Large-scale compute platforms, CI/CD

Senior Product Manager - Hardware Infrastructure

Nvidia

Santa Clara, CA, US | 32 days ago | $168,000–$258,750
Git, Perforce, Kubernetes, Terraform, AWS, CI/CD, Python, PostgreSQL, Docker, Prometheus, Grafana, AI, ML, LLM, Scalability, Reliability, Code Integrity, Versioning Standards, Cross-functional Alignment, Large-scale Infra Deployments

Senior Manager, DGX Cloud Technical Program Management

Nvidia

Santa Clara, CA, US | 25 days ago | $240,000–$379,500
Grafana, Prometheus, Kubernetes, AWS, Azure, CI/CD, Docker, Python, PostgreSQL, Terraform, GitLab, Jenkins, Ansible, NVIDIA GPU, AI/ML platforms, observability, telemetry, cloud infrastructure, distributed systems, security, compliance

Senior Technical Product Manager

Booz Allen Hamilton

McLean, Virginia, US | 18 days ago | $99,000–$225,000
Agile, SaaS, CI/CD, MITRE ATT&CK, SIEM, EDR, Python, Go, Ruby, Java, Kubernetes, Docker, AWS, Azure, GCP, PostgreSQL, MongoDB, Redis, Git, Jira, Confluence, Terraform, Ansible

Senior Technical Product Manager

GE Aerospace

Overland Park, US | 17 days ago
SQL, Python, R, ETL, Cloud-based applications, Relational database modeling, Data Integration Tools, IDMC, data lakes, data warehouses, analytics platforms, AWS, Azure, GCP