Senior Software Engineer, Deep Learning Inference - TensorRT

Nvidia

Actively hiring
Santa Clara, CA, US Posted 72 days ago $152,000–$241,500 / year

At a glance

AI generated

TL;DR

Join us as a Senior Software Engineer on the Deep Learning Inference TensorRT software team to develop state-of-the-art inference frameworks for accelerating large language models on NVIDIA GPUs. Your daily tasks will include crafting robust inferencing software, developing components of TensorRT using C++ and Python, and staying updated with AI advancements to enhance TensorRT features. You’ll collaborate closely with deep learning experts, GPU architects, and DevOps engineers across various teams. Ideal candidates have a strong background in computer science or related fields, 3+ years of software development experience, expertise in the latest C++ standards, machine learning concepts, and proficiency in Python and CUDA or OpenCL for GPU kernel programming. Experience with TensorRT, PyTorch, TensorFlow, ONNX Runtime, and compiler development is highly valued as you contribute to a real-time computing platform that drives our success in this rapidly evolving field.

Skills

C++, Python, CUDA, TensorRT, PyTorch, TensorFlow, ONNX Runtime, NVIDIA GPUs, Machine Learning, Performance Benchmarking, Profiling, Optimizations, Compiler Development, Graph Parsers, Optimizers

What you'll do

  • Develop robust inferencing software for multiple platforms using NVIDIA GPUs.
  • Build components of TensorRT, an SDK for high-performance deep learning inference.
  • Stay updated with AI academic developments to enhance TensorRT features.
  • Use C++ and Python to create graph parsers, optimizers, and deployment tools.
  • Optimize performance and scalability of deep learning models on NVIDIA GPUs.

What we're looking for

  • Bachelor's degree or equivalent experience in Computer Science or a related field.
  • 3+ years of software development experience with C++11/C++14/C++17/C++20.
  • Strong understanding of machine learning concepts, computer architecture, and algorithms.
  • Experience developing system software and optimizing performance using profiling tools.
  • Proficiency in Python and GPU kernel programming with CUDA or OpenCL.

Market check

Salary context

This $152,000–$241,500 range sits above 44% of similar postings on FindRole.

Peer median band

$161,300–$241,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$179,832–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) 23 days ago $184,000–$287,500
C++, Python, CUDA, NCCL, NVSHMEM, OAI Triton, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration

Senior Deep Learning Software Engineer

Nvidia

US 84 days ago $224,000–$356,500
Python, PyTorch, JAX, CUDA, TensorRT, NVIDIA TensorRT LLM, GPU optimization, CUTLASS, Triton, Deep learning frameworks, Performance analysis, GPU architecture, High performance computing, Model inference, Inference optimization

Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA, US 48 days ago $152,000–$241,500
C++, Python, CUDA, Rust, TensorRT, TensorRT-LLM, vLLM, SGLang, PyTorch, JAX, Deep Learning Frameworks, GPU Programming, Performance Analysis, Optimization Techniques, CI/CD

Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026

Nvidia

Santa Clara, CA, US 55 days ago $124,000–$195,500
C++, Python, TensorRT, PyTorch, CUDA, ONNX, JAX, TensorFlow, performance analysis, GPU architecture, Transformers, Recommenders, ASR, TTS, Visual Understanding, graph compilers, Jetson systems, deep learning inference, low-latency systems, resource-constrained systems

Senior Deep Learning Software Engineer, LLM Performance

Nvidia

Santa Clara, CA, US 42 days ago $184,000–$287,500
Python, C++, CUDA, TensorRT, Triton, PyTorch, JAX, TensorFlow, vLLM, SGLang, DL compiler, Performance modeling, Profiling, Debugging, Code optimization, GPU programming, Deep learning framework, CI/CD