Engineering Manager, DGX Cloud Production Engineering

Nvidia

Remote · Actively hiring · Posted this week
CA · TX · WA · Posted 5 days ago · $224,000–$356,500 / year

At a glance

AI generated

TL;DR

As an Engineering Manager at NVIDIA DGX Cloud, you will lead a team of software and production engineers responsible for building and operating GPU infrastructure across various environments. Your day-to-day responsibilities include driving execution in areas such as Kubernetes operability, automation, observability, and incident response while partnering with other teams to enhance production readiness. You will define priorities, roadmaps, and staffing needs, coach engineers, and foster a culture of learning and ownership. Ideal candidates have 8+ years of industry experience, including 2+ years in leadership roles, with expertise in reliability engineering, Kubernetes environments, and distributed systems. Strong communication skills and the ability to work effectively across teams are essential, as is hands-on experience with GPU infrastructure and multi-cloud environments.

Skills

Kubernetes · GitOps · CI/CD · Terraform · AWS · Azure · GCP · Docker · Prometheus · Grafana · Python · Go · Bash · PostgreSQL · Redis · Ansible · GitHub · Jenkins · Slack · Confluence · Zoom

What you'll do

  • Lead a team building and operating DGX Cloud infrastructure in various environments.
  • Drive execution for cluster operations, Kubernetes operability, automation, and observability.
  • Define team priorities, roadmap, staffing, and operational ownership.
  • Partner with cross-functional teams to enhance production readiness and reliability.
  • Build an on-call culture focused on learning, ownership, and durable fixes.

What we're looking for

  • 8+ years of industry experience including 2+ years in engineering leadership roles.
  • Proven track record leading teams focused on production infrastructure, Kubernetes operations, or distributed systems.
  • Deep understanding of reliability engineering, automation, observability, and incident response practices.
  • Strong ability to collaborate across multiple teams and influence without direct authority.
  • Clear communication skills with expertise in prioritization and decision-making under pressure.
  • BS/MS in Computer Science or equivalent practical experience required.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 825 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 813 roles with salary data.
