Manager, Large Language Model Inference

Nvidia

Actively hiring

Remote (Santa Clara, CA, US) Posted 67 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA as an Engineering Manager leading the development of next-generation large language model (LLM) and vision language model (VLM) inference software technologies on TensorRT. You will manage and grow a team focused on specialized kernel development, runtime optimizations, and frameworks for LLM inference, collaborating closely with researchers and GPU architects to deliver high-performance software for NVIDIA’s enterprise and edge hardware platforms. This role requires expertise in C++ or Python, deep knowledge of GPU architecture and CUDA programming, and experience leading distributed engineering teams. Ideal candidates have a background in LLM/VLM inference and familiarity with frameworks like TensorRT-LLM, vLLM, or SGLang, aiming to build scalable, user-friendly APIs that empower the AI ecosystem.

Skills

C++, Python, TensorRT-LLM, vLLM, SGLang, CUDA, GPU, NVIDIA, CI/CD, Docker, Kubernetes, Terraform, PostgreSQL, Git, Jenkins, Prometheus, Grafana

What you'll do

Lead specialized kernel development and runtime optimizations for LLM inference.
Drive the design and delivery of production inference software for next-gen hardware.
Integrate cutting-edge technologies to offer an intuitive developer experience for LLMs.
Execute software development with responsibility for project planning and milestone delivery.
Coordinate cross-functionally across teams including GPU Architects and NVIDIA Researchers.

What we're looking for

MS, PhD, or equivalent experience in Computer Science, AI, or related field.
7+ years of software engineering experience and 3+ years of technical leadership.
Proven ability to lead and scale high-performing distributed engineering teams.
Expertise in C++ or Python with strong background in GPU architecture and CUDA programming.
Demonstrated expertise in large language models (LLM) and/or vision language models (VLM).
Experience integrating cutting-edge technologies for intuitive developer experience.

Market check

Salary context

This $184,000–$287,500 range sits above 62% of similar postings on FindRole.

Peer median band

$184,000–$262,800

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$196,750–$249,821

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

Similar roles

Manager, Deep Learning Algorithms

Nvidia

Santa Clara, CA, US 121 days ago $184,000–$287,500

Python, C++, TensorFlow, PyTorch, Large Language Models (LLMs), Large Visual-Language Models (VLMs), Inference platforms, TensorRT-LLM, vLLM, SGLang, JIRA, Microsoft Project, CI/CD, Git, GitHub, Docker, Kubernetes, AWS, GCP, Azure

Manager, Deep Learning Algorithms

Nvidia

Santa Clara, CA, US 139 days ago $224,000–$356,500

Python, TensorFlow, PyTorch, Large Language Models (LLMs), Large Visual-Language Models (VLMs), TensorRT-LLM, vLLM, SGLang, JIRA, Microsoft Project, Git, GitHub, CI/CD, Docker, Kubernetes, AWS, NVIDIA, GPU, Deep Learning, Performance Tuning, Inference Optimization

Senior Research Scientist, Multi-Modal Language Models

Nvidia

Santa Clara, CA, US 113 days ago $192,000–$304,750

Python PyTorch Distributed Systems Deep Learning OpenSource CI/CD Computer Vision MultiModal LLMs NVIDIA Nemotron Multi-modal Technology Algorithms Data Structures Parallel Computing Systems Programming

Machine Learning Engineering Manager, Model Delivery

Autodesk

Amer - United States - California - San Francisco - Pier 9, US 87 days ago $148,500–$266,200

AWS, Azure, GCP, CI/CD, Kubernetes, Terraform, Python, PostgreSQL, Prometheus, Grafana, Git, Trusted AI, 3D data, CAD, BIM, generative AI systems

Machine Learning Scientist (L4) - Content & Conversation Modeling

Netflix

Remote (Seattle, US) 126 days ago $300,000–$537,000

Python, scikit-learn, Keras, PyTorch, TensorFlow, MetaFlow, JAX, ML Ops, CI/CD, MLOps, AWS, Google Cloud, Azure, Docker, Kubernetes, Prometheus, Grafana, PostgreSQL, Redis, Git, GitHub, Slack

Senior Manager, Machine Learning

Adobe

San Jose, US 8 days ago $219,500–$317,775

Python MLOps AWS GCP Azure Claude Code Cursor AI LLMs MLOps pipelines Generative AI Agentic AI systems