Senior Deep Learning Software Engineer, TensorRT Performance

Nvidia

Remote Actively hiring
Santa Clara, CA Posted 73 days ago $152,000–$241,500 / year

At a glance

AI generated

TL;DR

As a Senior Deep Learning Software Engineer at NVIDIA, you will join the DL Architecture team to enhance the performance of NVIDIA’s inference ecosystem, focusing on frameworks like TensorRT and PyTorch. Your daily tasks include establishing benchmarking methodologies, identifying performance bottlenecks, and optimizing state-of-the-art models across various NVIDIA accelerators. You will contribute to open-source projects, develop new model pipelines for optimized performance, and collaborate with cross-functional teams to innovate inference solutions in areas such as generative AI, automotive, and robotics. The ideal candidate has at least 3 years of experience in software development, expertise in C++ and Python, and a deep understanding of GPU architecture and modern deep learning models. Proficiency in CUDA or related domain-specific languages is essential, along with contributions to major LLM inference frameworks or graph compilers. This role demands strong skills in performance analysis and optimization for both high-performance data centers and resource-constrained edge devices.

Skills

C++ Python TensorRT PyTorch JAX TensorFlow ONNX CUDA GPU Transformers Recommenders ASR TTS Visual_Understanding TorchDynamo TorchInductor CI/CD

What you'll do

  • Establish performance benchmarking methodologies for NVIDIA’s inference ecosystem.
  • Contribute features to OSS frameworks like TensorRT and Torch-TensorRT.
  • Develop optimized model pipelines for areas such as quantization and memory management.
  • Work with cross-functional teams on innovative inference solutions for various AI domains.
  • Scale deep learning model performance across different types of NVIDIA accelerators.

What we're looking for

  • At least 3 years of relevant software development experience.
  • Strong C++ and Python programming skills with deep learning framework expertise.
  • Experience with performance analysis and optimization for GPU-accelerated systems.
  • Proficiency in one domain-specific language for deep learning (e.g., CUDA).
  • Deep understanding of modern deep learning models and workloads across various domains.

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $141k to $252k, with a shaded band marking where most similar roles pay. This role: $197k. Similar roles: $209k.]

About 58% of similar roles pay more than this one. Most pay $182,125–$235,750, the shaded band above. At the midpoint, this role pays about $197k versus about $209k for comparable roles.

Based on 240 similar postings.
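
The midpoint figures quoted above can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes that "midpoint" means the average of the low and high ends of a range, and that the $209k figure for similar roles is the midpoint of the $182,125–$235,750 band. The listing does not state how either figure was computed, and the function and variable names are hypothetical.

```python
def midpoint(low: float, high: float) -> float:
    """Return the average of the low and high ends of a salary range."""
    return (low + high) / 2


# Listed range for this role (from the posting header).
this_role = midpoint(152_000, 241_500)

# ASSUMPTION: the "similar roles" figure is the midpoint of the band
# where most similar roles pay; the listing does not confirm this.
similar_band = midpoint(182_125, 235_750)

# Difference between the two midpoints, in dollars.
gap = similar_band - this_role

print(f"This role midpoint:     ${this_role:,.0f}")
print(f"Similar roles midpoint: ${similar_band:,.0f}")
print(f"Gap:                    ${gap:,.0f}")
```

Under these assumptions the two midpoints round to the $197k and $209k shown in the listing, a gap of roughly $12k.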

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 824 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 812 roles with salary data.

More like this

Similar roles

Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026

Nvidia

Remote (Santa Clara, CA) 61 days ago $124,000–$195,500
C++ Python TensorRT PyTorch CUDA ONNX JAX TensorFlow performance analysis GPU architecture Transformers Recommenders ASR TTS Visual Understanding graph compilers Jetson systems deep learning inference low-latency systems resource-constrained systems
Remote

Senior Deep Learning Software Engineer, LLM Performance

Nvidia

Santa Clara, CA 48 days ago $184,000–$287,500
Python C++ CUDA TensorRT Triton PyTorch JAX TensorFlow VLLM SGLang DL compiler Performance modeling Profiling Debugging Code optimization GPU programming Deep learning framework CI/CD
Hybrid

Senior Deep Learning Software Engineer

Nvidia

Santa Clara, CA 40 days ago $224,000–$356,500
Python PyTorch JAX CUDA TensorRT NVIDIA_TensorRT_LLM GPU_optimization CUTLASS Triton Deep_learning_frameworks Performance_analysis GPU_architecture High_performance_computing Model_inference Inference_optimization
Hybrid

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA) 29 days ago $184,000–$287,500
C++ Python CUDA NCCL NVSHMEM OAI_TRITON CUTLASS PyTorch vLLM SGLang FlashInfer Multi-GPU_Communications Deep_Learning_Frameworks Performance_Optimization GPU_Acceleration
Remote