Engineering Manager, Inference Benchmarking — AI Perf

Nvidia

Remote · Actively hiring · Posted this week

Santa Clara, CA · Austin, TX · Posted 4 days ago · $224,000–$356,500 / year

At a glance

AI generated

TL;DR

As a Technical Lead Manager within NVIDIA’s Dynamo organization, you will lead the engineering team responsible for advancing AIPerf, a benchmarking platform used by hyperscalers and enterprises to assess large language model (LLM) serving performance. Day to day, you will drive the technical roadmap for core infrastructure components such as load generation and GPU telemetry, and ensure the accuracy of the benchmark results that engineers across the industry rely on. You will also advise on upstream engine integrations with vLLM, TRT-LLM, and SGLang so that AIPerf stays relevant as new hardware and workload categories emerge. The role requires expertise in systems engineering, inference infrastructure, and open-source communities, along with experience in Kubernetes-native deployment and GPU observability tools such as DCGM and PyNVML. Ideal candidates have a background in competitive benchmarking frameworks and a proven track record of leading high-velocity, high-visibility projects.

Skills

Kubernetes, vLLM, TRT-LLM, SGLang, DCGM, PyNVML, Prometheus, ZMQ, Helm, CI/CD, Python, Linux, GPU, TensorRT, MLPerf, Open Source, Docker, Git, GitHub, MLOps

What you'll do

Drive the technical roadmap for AIPerf's core infrastructure including load generation and microservices.
Ensure accuracy and statistical soundness of benchmark results relied upon by engineering groups worldwide.
Advise on upstream engine integrations involving vLLM, TRT-LLM, and SGLang to maintain platform relevance.
Hire and mentor senior engineers in a high-velocity open-source environment with global contributors.
Build Kubernetes-native infrastructure including operators, Helm charts, and GPU observability tooling.

What we're looking for

8+ years of software engineering experience in performance-critical infrastructure and distributed systems.
3+ years of leadership experience as a tech lead, TLM, or engineering manager.
Deep understanding of LLM inference mechanics and measurement correctness.
Proven track record of collaborating across multi-functional groups in high-velocity environments.
Extensive experience with the internals of vLLM, TRT-LLM, and SGLang, along with contributions to their upstream projects.
Experience building Kubernetes-native infrastructure including operators and GPU observability tooling.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 825 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 813 roles with salary data.
