Manager, Large Language Model Inference
Nvidia
Quick summary
Market check
How this pay compares to similar roles
This role pays more than 94% of similar roles. Most similar roles pay $169,780–$244,070. At the midpoint, this role pays about $290k, versus about $207k for comparable roles.
Based on 239 similar postings.
Employer
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 942 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 931 roles with salary data.
At a glance
Join NVIDIA as an Engineering Manager leading a high-impact team that accelerates large language model (LLM) and vision language model (VLM) inference across open-source frameworks such as TensorRT LLM, vLLM, and SGLang. You will architect and guide the development of performance-critical features for current and future NVIDIA datacenter products, collaborating closely with researchers and GPU architects to deliver cutting-edge software that sets global standards in AI performance. The role requires a strong background in C++ or Python, expertise in LLM inference, and deep knowledge of GPU architecture and CUDA programming. Ideal candidates have 7+ years of software engineering experience, including 3+ years in technical leadership roles, and a proven record of managing distributed teams and delivering production-quality software libraries.