Capacity Operations and Analytics Manager

Nvidia

Remote Actively hiring Posted this week
Santa Clara, CA Posted 4 days ago $168,000–$270,250 / year

At a glance

AI generated

TL;DR

Join NVIDIA as a Capacity Operations Manager and lead large-scale computing operations and planning by managing GPU capacity and other compute resources across multiple cloud service providers. You will build data models, reporting systems, and dashboards to support infrastructure governance programs, analyze technical needs for GPU capacity, identify performance bottlenecks, and drive resource efficiency initiatives with engineering, finance, and product teams. You will also develop tooling for the cloud infrastructure and analytics platform, using AI techniques to optimize resource usage, and collaborate closely with Finance, Product, Service Owners, and Infrastructure Engineering teams to align cloud capacity management with company goals. The role requires a Bachelor’s or Master’s degree in Computer Science or a related field, 10+ years of experience in cloud computing, and proficiency in cloud architecture, statistical modeling, machine learning methodologies, and data analytics tools such as Kibana, Grafana, Splunk, Prometheus, Tableau, and Plotly. Experience with AWS, Azure, GCP, and OCI is essential.

Skills

AWS, Azure, GCP, OCI, Kubernetes, Terraform, Python, SQL, AI, Machine Learning, IaaS, PaaS, SaaS, Kibana, Grafana, Splunk, Prometheus, Tableau, Plotly, CI/CD

What you'll do

  • Manage and optimize GPU capacity across cloud service providers to meet demand.
  • Develop data models and performance metrics supporting NVIDIA’s infrastructure governance.
  • Analyze technical needs for GPU capacity from internal and external teams.
  • Identify and resolve bottlenecks in daily usage of compute resources.
  • Drive resource efficiency initiatives with engineering, finance, and product teams.
  • Enhance tooling for cloud infrastructure to optimize resource usage and performance.

What we're looking for

  • Bachelor’s or Master’s degree in Computer Science or a related field, with 10+ years of cloud computing experience.
  • Proven track record in managing GPU capacity and large-scale computing operations.
  • Expertise in cloud architecture, development, deployment, and data management across AWS, Azure, GCP, and OCI.
  • Experience using AI tools to extract insights from data for resource optimization.
  • Strong statistical modeling and machine learning skills for operational efficiency.
  • Proficiency with data analytics and visualization tools such as Kibana, Grafana, and Tableau.
  • Ability to operate effectively in uncertain and rapidly changing business conditions.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 825 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 813 roles with salary data.
