Senior Software Engineer, Machine Learning Inference

Nvidia

Actively hiring
Santa Clara, CA, US | Posted 48 days ago | $152,000–$241,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA’s TensorRT team as a Senior Software Engineer, where you will design and implement optimizations for deep learning inference software on NVIDIA GPUs. You’ll work with C++, Python, and CUDA to deploy state-of-the-art LLMs and Generative AI models efficiently, collaborating closely with deep learning experts and GPU architects to influence both hardware and software designs. Ideal candidates have 4+ years of experience in large-scale software development, strong proficiency in C++ (and Rust or Python), and a background in deep learning frameworks like TensorRT, PyTorch, and JAX. Experience with GPU programming using CUDA, inference backends such as TensorRT-LLM, and performance optimization techniques is highly valued. This role offers the opportunity to contribute to cutting-edge AI solutions that drive real-time computing advancements at scale.

Skills

C++, Python, CUDA, Rust, TensorRT, TensorRT-LLM, vLLM, SGLang, PyTorch, JAX, Deep Learning Frameworks, GPU Programming, Performance Analysis, Optimization Techniques, CI/CD

What you'll do

  • Design and implement optimizations for NVIDIA TensorRT to enhance AI inference on GPUs.
  • Develop software using C++, Python, and CUDA to deploy advanced LLMs and Generative AI models efficiently.
  • Work with deep learning experts and GPU architects to influence the design of hardware and software for inference.
  • Optimize inference backends and compilers for GPUs to improve performance and efficiency.
  • Analyze and optimize close-to-metal performance in AI applications using TensorRT and other frameworks.

What we're looking for

  • BS, MS, or PhD in Computer Science or a related field, or equivalent experience.
  • 4+ years of software development on large codebases with C++ proficiency.
  • Experience developing Deep Learning Frameworks, Compilers, or System Software.
  • Knowledge of Machine Learning techniques and GPU programming with CUDA/OpenCL.
  • Background working with LLM inference frameworks like TensorRT-LLM, vLLM, SGLang.

Market check

Salary context

This $152,000–$241,500 range sits above 54% of similar postings on FindRole.

Peer median band

$152,000–$234,150

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$160,187–$230,818

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) | 23 days ago | $184,000–$287,500
C++, Python, CUDA, NCCL, NVSHMEM, OAI_TRITON, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration

Senior Systems Software Engineer, Machine Learning

Nvidia

Santa Clara, CA, US | 23 days ago | $152,000–$241,500
Python, C/C++, Linux, Unix, CI/CD, Docker, Kubernetes, AWS, TensorFlow, PyTorch, PostgreSQL, MongoDB, 3D Computer Vision, Generative AI, LLMs, VLMs, Multi-Agent Systems, Computer Vision, Deep Learning

Senior Software Engineer, AI Inference Systems

Nvidia

Santa Clara, CA, US | 30 days ago | $184,000–$287,500
Python, C/C++, CUDA, Kubernetes, Docker, Triton, PyTorch, vLLM, SGLang, MLIR, Linux, Go, Rust, CI/CD, AWS, GCP, Azure, Prometheus, Grafana, GitHub, MLOps

Senior Software Engineer - AI Inference

Nvidia

Remote (Santa Clara, CA, US) | 44 days ago | $152,000–$241,500
Python, C++, CUDA, vLLM, SGLang, PyTorch, Triton, NCCL, Dynamo, CI/CD, GPU, InfiniBand, Profiling, Flamegraphs, Microbenchmarks, Concurrency, Multi-threading, Multi-process, Kubernetes, Docker, PostgreSQL

Senior Machine Learning Engineer

Adobe

San Francisco, US | 42 days ago | $211,800–$306,625
Python, PyTorch, TensorFlow, Machine Learning, Deep Learning, Data Science, CI/CD, Mentorship, Collaboration, Research and Development, Product Integration, Cross Functional Teams

Senior Machine Learning Engineer

Adobe

San Jose, US | 51 days ago | $183,300–$265,350
Python, PyTorch, TensorFlow, Docker, AWS, Azure, MLOps, CI/CD, PostgreSQL, Adobe Experience Platform, Marketo Engage, Journey Optimizer, LLMs, RAG, semantic embeddings, agentic AI workflows, ML inference systems, Prometheus, Grafana