Senior Software Engineer, AI Inference Systems

Nvidia

Hybrid Actively hiring
Santa Clara, US Posted 31 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA as a senior software engineer on our cutting-edge AI inference team, where you will architect high-performance inference stacks for large-scale models, optimize GPU kernels, and drive industry benchmarks. Your daily tasks include contributing features to vLLM that leverage the latest NVIDIA hardware, developing optimized GPU kernels using techniques like fusion and autotuning, and building benchmarking methodologies. You’ll also design scheduling systems for containerized deployments on multi-GPU clusters across clouds. Ideal candidates have a strong background in computer science with extensive experience in Python and C/C++, along with knowledge of CUDA, Kubernetes, and ML frameworks such as PyTorch and vLLM. Experience with ML compilers like Triton and GPU libraries like CUTLASS is highly valued. This role offers the opportunity to work on groundbreaking AI technologies that push the boundaries of performance engineering and system optimization at scale.

Skills

Python C/C++ CUDA Kubernetes Docker Triton PyTorch vLLM SGLang MLIR Linux Go Rust CI/CD AWS GCP Azure Prometheus Grafana GitHub MLOps

What you'll do

  • Contribute features to vLLM for the latest NVIDIA GPU hardware.
  • Optimize GPU kernels using fusion, autotuning, and memory/layout optimization.
  • Develop benchmarking methodologies and contribute to the MLPerf Inference suite.
  • Architect scheduling and orchestration of large-scale inference deployments on GPU clusters.
  • Conduct original research and integrate new ideas into NVIDIA’s software products.

What we're looking for

  • Bachelor’s degree (or equivalent experience) in CS/CE/SE with 7+ years of relevant industry experience, or a Master’s degree with 5+ years.
  • Strong programming skills in Python and C/C++; proficiency in algorithms, data structures, operating systems, and parallel/distributed computing.
  • Expertise in performance engineering for ML frameworks like PyTorch and inference engines such as vLLM/SGLang.
  • Proficiency in GPU programming with CUDA, memory hierarchy optimization, and use of profiling tools (Nsight Systems/Compute).
  • Experience building and optimizing large language model (LLM) inference engines and contributing to containerization/virtualization technologies.
  • Hands-on experience with ML compilers, DSLs, GPU libraries, and cloud platforms; contributions to open-source projects/publications preferred.

Market check

Salary context

This $184,000–$287,500 range sits above 72% of similar postings on FindRole.

Peer median band

$170,700–$247,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$168,250–$246,150

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Software Engineer - AI Inference

Nvidia

Remote (US, CA, Santa Clara, US) 45 days ago $152,000–$241,500
Python C++ CUDA vLLM SGLang PyTorch Triton NCCL Dynamo CI/CD GPU InfiniBand Profiling Flamegraphs Microbenchmarks Concurrency Multi-threading Multi-process Kubernetes Docker PostgreSQL

AI Software Engineer, Senior

Booz Allen Hamilton

Laurel, Maryland, US 42 days ago $86,800–$198,000
Python Java C++ JavaScript TypeScript LLM-powered developer tools CI/CD DevOps VS Code Kubernetes Docker GitHub GitLab Jenkins Agentic AI frameworks Orchestration systems Cloud services PostgreSQL MongoDB

AI Software Engineer, Senior

Booz Allen Hamilton

US 42 days ago $86,800–$198,000
Python Rust Go Scala Java GitLab CI Jenkins Git Linux Docker Podman AWS LocalStack ESXi Ansible Kubernetes SIEM Security+ Linux+

Senior Software Engineer (AI Platform)

Smartly

US 42 days ago
Python TypeScript PostgreSQL Node.js Docker Kubernetes React AWS GCP CI/CD MLOps PyTorch TensorFlow MLflow Kubeflow

Senior Software Engineer - AI Applications

Plaid

San Francisco HQ, US 42 days ago $209,880–$289,080
HTML CSS JavaScript LLM GenAI SSE Vector_Databases Embeddings Agent_Orchestration_Frameworks Prompt_Engineering RAG Semantic_Search CI/CD Python Node.js React Docker Kubernetes AWS PostgreSQL

Senior Software Engineer, Machine Learning Inference

Nvidia

US, CA, Santa Clara, US 49 days ago $152,000–$241,500
C++ Python CUDA Rust TensorRT TensorRT-LLM vLLM SGLang PyTorch JAX Deep Learning Frameworks GPU Programming Performance Analysis Optimization Techniques CI/CD