Engineering Manager, LLM Performance

Nvidia

Hybrid

Quick summary

Work type: Hybrid
Location: Santa Clara, CA
Salary: $224,000–$356,500 / yr
Posted: 5 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $143k to $379k, with a shaded band where most similar roles pay. Similar roles are marked at $207k and this role at $290k.]

This role pays more than 94% of similar roles. Most pay $169,780–$244,070 — the shaded band above. At the midpoint, this role pays about $290k versus about $207k for comparable roles.

Based on 239 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 942 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 931 roles with salary data.

At a glance

TL;DR · Engineering Manager, LLM Performance

Join NVIDIA as an Engineering Manager leading a high-impact team focused on accelerating large language model (LLM) and vision language model (VLM) inference across open-source frameworks such as TensorRT-LLM, vLLM, and SGLang. You will architect and guide the development of performance-critical features for current and future NVIDIA datacenter products, collaborating closely with researchers and GPU architects to deliver software that sets global standards in AI performance. The role requires a strong background in C++ or Python, expertise in LLM inference, and deep knowledge of GPU architecture and CUDA programming. Ideal candidates have 7+ years of software engineering experience, including 3+ years in technical leadership roles, with proven success in managing distributed teams and delivering production-quality software libraries.

What you'll do

  • Lead and grow a team to enhance LLM inference performance across multiple frameworks.
  • Drive design and optimization of critical features for LLM inference performance.
  • Improve LLM inference performance on current and future NVIDIA datacenter GPUs.
  • Collaborate with benchmark teams to optimize key workloads' performance.
  • Integrate cutting-edge technologies to give developers an intuitive LLM deployment experience.

What we're looking for

  • MS, PhD, or equivalent experience in Computer Science, AI, or related technical field
  • 7+ years of software engineering experience with 3+ years in technical leadership roles
  • Proven ability to lead and scale high-performing distributed engineering teams
  • Expertise in C++ or Python for software design and production-quality libraries
  • Deep understanding of GPU architecture, CUDA programming, and system-level performance tuning
  • Background in LLM inference and experience with frameworks like TensorRT-LLM, vLLM, SGLang

More like this

Similar roles

Manager, Large Language Model Inference

Nvidia

Remote (Canada) +1 · 98 days ago · $184,000–$287,500
C++ Python TensorRT-LLM vLLM SGLang CUDA GPU Architecture CI/CD Docker Kubernetes GitHub NVIDIA GPUs PostgreSQL MongoDB Git JIRA Confluence Slack Zoom
Remote

Manager, Deep Learning Algorithms

Nvidia

Santa Clara, CA · 170 days ago · $224,000–$356,500
Python TensorFlow PyTorch Large Language Models (LLMs) Large Visual-Language Models (VLMs) TensorRT-LLM vLLM SGLang JIRA Microsoft Project Git GitHub CI/CD Docker Kubernetes AWS NVIDIA GPU CUDA C++ Linux

Engineering Manager, Inference Benchmarking

Nvidia

Remote (Santa Clara, CA) +4 · 31 days ago · $224,000–$356,500
Kubernetes vLLM TRT-LLM SGLang DCGM PyNVML Prometheus ZMQ Helm CI/CD Python Linux GPU TensorRT MLPerf OpenSource Docker Git GitHub MLOps
Remote

Engineering Manager

Shopify

58 days ago
Python JavaScript Kubernetes Docker CI/CD AWS Google Cloud Azure PostgreSQL MongoDB Redis Git Jenkins Terraform Prometheus Grafana

Engineering Manager

Chime

San Francisco, CA · 3 days ago
AWS DynamoDB Kinesis Postgres Terraform CircleCI ArgoCD AWS CloudFormation Docker Kubernetes Datadog GitHub Actions PagerDuty AWS CloudWatch Event-driven software architecture
Hybrid

Manager, Engineering

Navan

60 days ago
Java Spring Boot Hibernate React TypeScript SQL NoSQL Agile CI/CD Git Kubernetes Docker AWS GitHub Copilot