Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026

Nvidia

Actively hiring | Verified listing
Santa Clara, CA, US | Posted 55 days ago | $124,000–$195,500 / year

At a glance

AI generated

TL;DR

NVIDIA seeks an experienced Deep Learning Software Engineer to join its dynamic research and development team focused on optimizing the performance of deep learning inference solutions across various NVIDIA accelerators. This role involves establishing benchmarking methodologies, identifying performance issues, and contributing features to open-source frameworks like TensorRT and PyTorch. The engineer will develop optimized model pipelines for areas such as quantization and memory management, collaborating with diverse teams to enhance generative AI, automotive, robotics, and speech understanding applications. Essential skills include strong C++ and Python programming, experience with DL frameworks and inference libraries, performance optimization knowledge, and proficiency in GPU architecture and deep learning models like Transformers. Prior contributions to major LLM inference frameworks or graph compilers are highly valued, as is expertise in CUDA or related domain-specific languages.

Skills

C++, Python, TensorRT, PyTorch, CUDA, ONNX, JAX, TensorFlow, performance analysis, GPU architecture, Transformers, Recommenders, ASR, TTS, Visual Understanding, graph compilers, Jetson systems, deep learning inference, low-latency systems, resource-constrained systems

What you'll do

  • Establish performance benchmarking methodologies for NVIDIA’s inference ecosystem.
  • Contribute features and code to open-source inference frameworks like TensorRT.
  • Develop optimized model pipelines for areas such as quantization and scheduling.
  • Collaborate with cross-functional teams on innovative inference solutions.
  • Scale deep learning models’ performance across various NVIDIA accelerator types.

What we're looking for

  • Bachelor's, Master's, or PhD degree, or equivalent experience, in Computer Science, EECS, or AI.
  • 2+ years of software development experience with strong C++ and Python skills.
  • Experience with deep learning frameworks (PyTorch, TensorFlow) and inference libraries (TensorRT).
  • Deep understanding of modern deep learning models and workloads like Transformers and Recommenders.
  • Proficiency in GPU architecture and performance optimization for low-latency systems.
  • Prior contributions to major LLM inference frameworks or experience with graph compilers.

Market check

Salary context

This $124,000–$195,500 range sits above 21% of similar postings on FindRole.

Peer median band

$154,637–$241,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$166,100–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Deep Learning Software Engineer, LLM Performance

Nvidia

Santa Clara, CA, US | 42 days ago | $184,000–$287,500
Python, C++, CUDA, TensorRT, Triton, PyTorch, JAX, TensorFlow, vLLM, SGLang, DL compiler, Performance modeling, Profiling, Debugging, Code optimization, GPU programming, Deep learning framework, CI/CD

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) | 23 days ago | $184,000–$287,500
C++, Python, CUDA, NCCL, NVSHMEM, OAI Triton, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration

Senior Deep Learning Software Engineer

Nvidia

US | 84 days ago | $224,000–$356,500
Python, PyTorch, JAX, CUDA, TensorRT, NVIDIA TensorRT-LLM, GPU optimization, CUTLASS, Triton, Deep learning frameworks, Performance analysis, GPU architecture, High performance computing, Model inference, Inference optimization

Senior Software Engineer, Deep Learning Inference - TensorRT

Nvidia

Santa Clara, CA, US | 72 days ago | $152,000–$241,500
C++, Python, CUDA, TensorRT, PyTorch, TensorFlow, ONNX Runtime, NVIDIA GPUs, Machine Learning, Performance Benchmarking, Profiling, Optimizations, Compiler Development, Graph Parsers, Optimizers