Engineering Manager, Inference Benchmarking — AI Perf
At a glance
TL;DR (AI generated)
As Technical Lead Manager within NVIDIA’s Dynamo organization, you will lead the engineering team responsible for advancing AIPerf, a critical benchmarking platform that hyperscalers and enterprises use to assess large language model (LLM) serving performance. Day to day, you will drive the technical roadmap for core infrastructure components such as load generation and GPU telemetry, and ensure the accuracy of the benchmark results that engineers across the industry rely on. You will also advise on upstream engine integrations with vLLM, TRT-LLM, and SGLang so that AIPerf stays relevant across emerging hardware and workload categories. The role requires expertise in systems engineering, inference infrastructure, and open-source communities, along with experience in Kubernetes-native deployment and GPU observability tools such as DCGM and PyNVML. Ideal candidates have a background in competitive benchmarking frameworks and a proven track record of leading high-velocity, high-visibility projects.
What you'll do
- Drive the technical roadmap for AIPerf's core infrastructure including load generation and microservices.
- Ensure accuracy and statistical soundness of benchmark results relied upon by engineering groups worldwide.
- Advise on upstream engine integrations involving vLLM, TRT-LLM, and SGLang to maintain platform relevance.
- Hire and mentor senior engineers in a high-velocity open-source environment with global contributors.
- Build Kubernetes-native infrastructure including operators, Helm charts, and GPU observability tooling.
What we're looking for
- 8+ years of software engineering experience in performance-critical infrastructure and distributed systems.
- 3+ years of leadership experience as a tech lead, TLM, or engineering manager.
- Deep understanding of LLM inference mechanics and measurement correctness.
- Proven track record of collaborating across multi-functional groups in high-velocity environments.
- Extensive experience with the internals of vLLM, TRT-LLM, and SGLang, plus contributions to their upstream projects.
- Experience building Kubernetes-native infrastructure including operators and GPU observability tooling.
Employer
About Nvidia
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 825 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 813 roles with salary data.
Most-posted roles
- Senior Solutions Architect, AI Infrastructure (4)
- Senior System Software Engineer - AV Platform (4)
- Senior Circuit Design Engineer (3)
- Senior Circuit Methodology Engineer (3)
- Senior Deep Learning Performance Architect (3)