Senior AI/ML Capacity and Performance Engineer

General Motors (GM)

Hybrid

Quick summary

Work type: Hybrid
Location: Sunnyvale, CA · Seattle, WA
Salary: $144,700–$261,300 / yr
Posted: 62 days ago

Market check

Salary context

Below market

How this pay compares to similar roles

[Chart: pay scale from $131k to $277k with a shaded band marking where most similar roles pay. Similar roles: $216k. This role: $203k.]

This role pays less than 67% of similar roles do. Most similar roles pay $186,200–$246,150. At the midpoint, this role pays about $203k, versus about $216k for comparable roles.

Based on 240 similar postings.

Employer

About General Motors (GM)

General Motors (GM) is a leading American multinational automotive corporation founded in 1908 and headquartered in Detroit, Michigan.

General Motors (GM) currently has 126 open roles on FindRole.

Listed pay typically runs $170,000–$258,500 across 75 roles with salary data.

At a glance

TL;DR · Senior AI/ML Capacity and Performance Engineer

GM is seeking a Senior Performance Engineer to join the AV Capacity and Performance Engineering team within the AV Infrastructure organization. The role covers long-term GPU system strategy, performance optimization through deep analysis of production workloads, cross-functional collaboration with AI/ML Research and cloud vendors, and proactive system scaling to improve engineering velocity and cost-efficiency. The ideal candidate has expert-level Python coding skills, proficiency in PyTorch, hands-on Kubernetes experience, and technical expertise with Nvidia DCGM, nvidia-smi, and Grafana for GPU monitoring. The role also requires extensive knowledge of the AWS, GCP, or Azure cloud platforms, advanced experience deploying open-source models via Hugging Face, and proficiency in BigQuery for data analytics. Candidates should have a deep understanding of distributed systems and high-performance computing (HPC) architectures, including Nvidia's latest GPU technologies, such as the H100, B200, and GB200.

What you'll do

  • Conduct deep-dive analyses to identify and resolve bottlenecks in production workloads.
  • Develop strategic GPU system plans for long-term infrastructure scalability.
  • Collaborate with cross-functional teams to enhance engineering velocity and cost-efficiency.
  • Architect improvements to ensure the reliability of large-scale ML training environments.
  • Utilize Nvidia DCGM, nvidia-smi, and Grafana for real-time monitoring and observability.
  • Provide capacity planning expertise to support GM’s autonomous vehicle development efforts.

What we're looking for

  • 5+ years of professional experience in high-scale infrastructure or ML systems.
  • Bachelor’s degree in Computer Science or a related technical field.
  • Expert-level coding skills in Python and the PyTorch ecosystem.
  • Proven track record resolving performance issues in large-scale distributed environments.
  • Deep understanding of modern ML system design and HPC, plus hands-on Kubernetes experience.
  • Technical proficiency with Nvidia DCGM, nvidia-smi, and Grafana for GPU monitoring.
  • Extensive experience working within major cloud ecosystems (AWS, GCP, or Azure).

More like this

Similar roles

Senior AI/ML Capacity Engineer

General Motors (GM)

Remote (GM Automation - Sunnyvale, US) 73 days ago $144,700–$261,300
Python Pandas NumPy BigQuery Looker Git Linux GCP CI/CD Docker Kubernetes Prometheus SQL Forecasting BI_platforms Modern_ML_system_architecture Capacity_planning Data_modeling
Remote Hybrid

Senior AI/ML Engineer

General Motors (GM)

Remote (Mountain View, CA) 6 days ago $170,600–$261,300
Python Transformers Generative_AI Multimodal_Systems AutoML Quantization Model_Distillation Architecture_Search CVPR ICML NeurIPS IJCAI KDD Robotics_Conference_Papers AV_ADAS_Experience
Remote Hybrid

Senior AI/ML Engineer

Uber

Seattle, WA 11 days ago $202,000
Python PyTorch TensorFlow XGBoost LightGBM Kafka Pinot Hive Cassandra Spark Flink CI/CD AWS Kubernetes

Senior ML/AI Engineer

Genworth Financial

Richmond, VA 38 days ago $114,900
Python Databricks MLflow Spark Delta_Lake Feature_Store CI/CD MLOps A/B_Testing Kubernetes AWS Azure SQL LLM RAG Prometheus Grafana
Hybrid

Senior AI Performance and Efficiency Engineer

Nvidia

Remote (Santa Clara, CA) 81 days ago $152,000–$241,500
Python Go Bash AWS GCP Azure CUDA NCCL MLPerf PyTorch TensorFlow NSight_Systems NSight_Compute InfiniBand IBOP RDMA Lustre GPFS Kubernetes Docker
Remote

Senior High-Performance AI Training Engineer

Nvidia

Santa Clara, CA 116 days ago $184,000–$287,500
Python C++ CUDA MLPerf NVIDIA_Deep_Learning_Platform GPU Computer_Architecture Performance_Modeling CI/CD Docker Kubernetes Terraform AWS Prometheus Grafana