Senior Deep Learning Software Engineer, TensorRT Performance

Nvidia

Remote Actively hiring

Santa Clara, CA Posted 73 days ago $152,000–$241,500 / year

At a glance

AI generated

TL;DR

As a Senior Deep Learning Software Engineer at NVIDIA, you will join the DL Architecture team to enhance the performance of NVIDIA’s inference ecosystem, focusing on frameworks like TensorRT and PyTorch. Your daily tasks include establishing benchmarking methodologies, identifying performance bottlenecks, and optimizing state-of-the-art models across various NVIDIA accelerators. You will contribute to open-source projects, develop new model pipelines for optimized performance, and collaborate with cross-functional teams to innovate inference solutions in areas such as generative AI, automotive, and robotics. The ideal candidate has at least 3 years of experience in software development, expertise in C++ and Python, and a deep understanding of GPU architecture and modern deep learning models. Proficiency in CUDA or related domain-specific languages is essential, along with contributions to major LLM inference frameworks or graph compilers. This role demands strong skills in performance analysis and optimization for both high-performance data centers and resource-constrained edge devices.

Skills

C++, Python, TensorRT, PyTorch, JAX, TensorFlow, ONNX, CUDA, GPU, Transformers, Recommenders, ASR, TTS, Visual Understanding, TorchDynamo, TorchInductor, CI/CD

What you'll do

Establish performance benchmarking methodologies for NVIDIA’s inference ecosystem.
Contribute features to OSS frameworks like TensorRT and Torch-TensorRT.
Develop optimized model pipelines for areas such as quantization and memory management.
Work with cross-functional teams on innovative inference solutions for various AI domains.
Scale deep learning model performance across different types of NVIDIA accelerators.

What we're looking for

At least 3 years of relevant software development experience.
Strong C++ and Python programming skills with deep learning framework expertise.
Experience with performance analysis and optimization for GPU-accelerated systems.
Proficiency in one domain-specific language for deep learning (e.g., CUDA).
Deep understanding of modern deep learning models and workloads across various domains.

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Chart: pay range of similar roles, from $141k to $252k, with a shaded band where most similar roles pay. Markers: similar roles $209k, this role $197k.

This role pays less than 58% of similar roles do. Most similar roles pay $182,125–$235,750, the shaded band above. At the midpoint of its range, this role pays about $197k, versus about $209k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia

Santa Clara, CA 91 days ago $152,000–$241,500

Python, C++, PyTorch, JAX, CUDA, cuBLAS, cuDNN, cuSOLVER, GPU, MLPerf, OpenAI Triton, Pallas, CI/CD
