Senior DL Algorithms Engineer - Inference Performance

Nvidia

Actively hiring

Remote (Santa Clara, CA, US) Posted 99 days ago $184,000–$287,500 / year


At a glance

AI generated

TL;DR

Join NVIDIA as a Senior DL Algorithms Engineer and optimize deep learning workloads by implementing inference for language and multimodal models within NVIDIA Inference Microservices (NIMs). You will develop features and fix bugs in TRT-LLM, an open-source inference serving library, and profile bottlenecks across the full stack to maximize performance. You will also work with cross-functional teams to benchmark inference on state-of-the-art DL models and optimize the NVIDIA software/hardware stack for AI services. Ideal candidates hold a PhD or have equivalent experience, have deep expertise in deep learning inference, and are proficient in C++, PyTorch, and GPU programming (CUDA/OpenCL). Strong knowledge of computer architecture and modern LLM architectures is essential.

Skills

C++, PyTorch, CUDA, OpenCL, Python, Deep Learning Frameworks, GPU Architecture, Performance Profiling, Algorithm Design, CI/CD

What you'll do

Implement language and multimodal model inference as part of NVIDIA Inference Microservices.
Contribute new features and fix bugs in TRT-LLM, an open-source inference serving library.
Profile and analyze bottlenecks to optimize performance across the full inference stack.
Benchmark state-of-the-art DL models and perform competitive analysis for NVIDIA’s SW/HW stack.
Collaborate with co-design teams on creating next-generation AI-powered services.

What we're looking for

PhD in CS, EE, or CSEE, or equivalent experience, plus 5+ years of relevant industry experience.
Strong background in deep learning and neural network inference.
Expertise in performance profiling, analysis, and optimization for GPU-based applications.
Proficiency in C++ and PyTorch or similar frameworks.
Deep understanding of computer architecture, including GPU fundamentals.
Experience with processor and system-level performance optimization.
Familiarity with modern large language model architectures.

Market check

Salary context

This $184,000–$287,500 range sits above 74% of similar postings on FindRole.

Peer median band

$165,000–$245,600

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$182,125–$238,250

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.


Similar roles

Senior DL Algorithms Engineer - Inference Performance

Nvidia

Remote (Santa Clara, CA, US) 23 days ago $152,000–$241,500

PyTorch, NVIDIA TRT-LLM, vLLM, SGLang, FlashInfer, GPU Architecture, CUDA, OpenCL, Deep Learning, Neural Networks, Performance Profiling, HPC, Computer Architecture, Python, C++


Senior Compiler Engineer, AI Inference Performance

Nvidia

Remote (Santa Clara, CA, US) 93 days ago $152,000–$241,500

MLIR, LLVM, XLA, Triton, PyTorch, JAX, Nsight Compute, CUDA, C++, Python, CI/CD


Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) 23 days ago $184,000–$287,500

C++, Python, CUDA, NCCL, NVSHMEM, OAI Triton, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration


Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA, US 48 days ago $152,000–$241,500

C++, Python, CUDA, Rust, TensorRT, TensorRT-LLM, vLLM, SGLang, PyTorch, JAX, Deep Learning Frameworks, GPU Programming, Performance Analysis, Optimization Techniques, CI/CD

Senior ML Infrastructure Engineer, Inference Platform

General Motors (GM)

Remote (GM Automation - Sunnyvale, US) 76 days ago $155,420–$205,900

Python, Triton, Ray Serve, vLLM, C++, Kubernetes, Docker, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, AWS, Azure, Google Cloud Platform, Git, Jenkins, GitHub, Slack, Confluence, Jira


Senior Machine Learning Test Engineer

Autodesk

Remote (AMER - United States - Massachusetts - Boston - Drydock, US) 14 days ago

Python, CI/CD, GitHub Actions, Jenkins, PyTorch, TensorFlow, Great Expectations, Airflow, Metaflow, Comet, MLflow, Weights & Biases, AWS, Azure, Google Cloud Platform
