Manager, Large Language Model Inference

Nvidia

Actively hiring
Remote (US, CA, Santa Clara, US) Posted 67 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA as an Engineering Manager leading the development of next-generation large language model (LLM) and vision language model (VLM) inference software technologies on TensorRT. You will manage and grow a team focused on specialized kernel development, runtime optimizations, and frameworks for LLM inference, collaborating closely with researchers and GPU architects to deliver high-performance software for NVIDIA’s enterprise and edge hardware platforms. This role requires expertise in C++ or Python, deep knowledge of GPU architecture and CUDA programming, and experience leading distributed engineering teams. Ideal candidates have a background in LLM/VLM inference and familiarity with frameworks like TensorRT-LLM, vLLM, or SGLang, aiming to build scalable, user-friendly APIs that empower the AI ecosystem.

Skills

C++, Python, TensorRT-LLM, vLLM, SGLang, CUDA, GPU, NVIDIA, CI/CD, Docker, Kubernetes, Terraform, PostgreSQL, Git, Jenkins, Prometheus, Grafana

What you'll do

  • Lead specialized kernel development and runtime optimizations for LLM inference.
  • Drive the design and delivery of production inference software for next-gen hardware.
  • Integrate cutting-edge technologies to offer an intuitive developer experience for LLMs.
  • Execute software development with responsibility for project planning and milestone delivery.
  • Coordinate cross-functionally across teams including GPU Architects and NVIDIA Researchers.

What we're looking for

  • MS, PhD, or equivalent experience in Computer Science, AI, or a related field.
  • 7+ years of software engineering experience and 3+ years of technical leadership.
  • Proven ability to lead and scale high-performing distributed engineering teams.
  • Expertise in C++ or Python with a strong background in GPU architecture and CUDA programming.
  • Demonstrated expertise in large language models (LLM) and/or vision language models (VLM).
  • Experience integrating cutting-edge technologies to deliver an intuitive developer experience.

Market check

Salary context

This $184,000–$287,500 range sits above 62% of similar postings on FindRole.

Peer median band

$184,000–$262,800

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$196,750–$249,821

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Manager, Deep Learning Algorithms

Nvidia

US, CA, Santa Clara, US 121 days ago $184,000–$287,500
Python C++ TensorFlow PyTorch Large Language Models (LLMs) Large Visual-Language Models (VLMs) Inference platforms TensorRT-LLM vLLM SGLang JIRA Microsoft Project CI/CD Git GitHub Docker Kubernetes AWS GCP Azure

Manager, Deep Learning Algorithms

Nvidia

US, CA, Santa Clara, US 139 days ago $224,000–$356,500
Python TensorFlow PyTorch Large Language Models (LLMs) Large Visual-Language Models (VLMs) TensorRT-LLM vLLM SGLang JIRA Microsoft Project Git GitHub CI/CD Docker Kubernetes AWS NVIDIA GPU Deep Learning Performance Tuning Inference Optimization

Senior Research Scientist, Multi-Modal Language Models

Nvidia

US, CA, Santa Clara, US 113 days ago $192,000–$304,750
Python PyTorch Distributed Systems Deep Learning OpenSource CI/CD Computer Vision MultiModal LLMs NVIDIA Nemotron Multi-modal Technology Algorithms Data Structures Parallel Computing Systems Programming

Machine Learning Engineering Manager, Model Delivery

Autodesk

Amer - United States - California - San Francisco - Pier 9, US 87 days ago $148,500–$266,200
AWS Azure GCP CI/CD Kubernetes Terraform Python PostgreSQL Prometheus Grafana Git Trusted AI 3D data CAD BIM generative AI systems

Senior Manager, Machine Learning

Adobe

San Jose, US 8 days ago $219,500–$317,775
Python MLOps AWS GCP Azure Claude Code Cursor AI LLMs MLOps pipelines Generative AI Agentic AI systems