Senior DL Algorithms Engineer - Inference Performance

Nvidia

Actively hiring
Remote (Santa Clara, CA, US) Posted 99 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA as a Senior DL Algorithms Engineer and help optimize deep learning workloads by implementing inference for language and multimodal models within NVIDIA Inference Microservices (NIMs). You will develop features and fix bugs in TRT-LLM, an open-source inference serving library, and profile bottlenecks across the entire stack to maximize performance. You will also collaborate with cross-functional teams to benchmark inference for state-of-the-art DL models and optimize the NVIDIA software/hardware stack for cutting-edge AI services. Ideal candidates hold a PhD or have equivalent experience, have deep expertise in deep learning inference, and are proficient in C++, PyTorch, and GPU programming (CUDA/OpenCL). Strong knowledge of computer architecture and modern LLM architectures is essential.

Skills

C++, PyTorch, CUDA, OpenCL, Python, Deep Learning Frameworks, GPU Architecture, Performance Profiling, Algorithm Design, CI/CD

What you'll do

  • Implement language and multimodal model inference as part of NVIDIA Inference Microservices.
  • Contribute new features and fix bugs in TRT-LLM, an open-source inference serving library.
  • Profile and analyze bottlenecks to optimize performance across the full inference stack.
  • Benchmark state-of-the-art DL models and perform competitive analysis for NVIDIA’s SW/HW stack.
  • Collaborate with co-design teams on creating next-generation AI-powered services.

What we're looking for

  • PhD in CS, EE, or CSEE, or equivalent experience; 5+ years of relevant industry experience.
  • Strong background in deep learning and neural network inference.
  • Expertise in performance profiling, analysis, and optimization for GPU-based applications.
  • Proficiency in C++ and PyTorch or similar frameworks.
  • Deep understanding of computer architecture, including GPU fundamentals.
  • Experience with processor and system-level performance optimization.
  • Familiarity with modern large language model architectures.

Market check

Salary context

This $184,000–$287,500 range sits above 74% of similar postings on FindRole.

Peer median band

$165,000–$245,600

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$182,125–$238,250

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior DL Algorithms Engineer - Inference Performance

Nvidia

Remote (Santa Clara, CA, US) 23 days ago $152,000–$241,500
PyTorch, NVIDIA TRT-LLM, vLLM, SGLang, FlashInfer, GPU architecture, CUDA, OpenCL, Deep Learning, Neural Networks, Performance profiling, HPC, Computer Architecture, Python, C++

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) 23 days ago $184,000–$287,500
C++, Python, CUDA, NCCL, NVSHMEM, OAI Triton, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration

Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA, US 48 days ago $152,000–$241,500
C++, Python, CUDA, Rust, TensorRT, TensorRT-LLM, vLLM, SGLang, PyTorch, JAX, Deep Learning Frameworks, GPU Programming, Performance Analysis, Optimization Techniques, CI/CD

Senior ML Infrastructure Engineer, Inference Platform

General Motors (GM)

Remote (GM Automation - Sunnyvale, US) 76 days ago $155,420–$205,900
Python, Triton, RayServe, vLLM, C++, Kubernetes, Docker, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, AWS, Azure, Google Cloud Platform, Git, Jenkins, GitHub, Slack, Confluence, Jira

Senior Machine Learning Test Engineer

Autodesk

Remote (AMER - United States - Massachusetts - Boston - Drydock, US) 14 days ago
Python, CI/CD, GitHub Actions, Jenkins, PyTorch, TensorFlow, Great Expectations, Airflow, Metaflow, Comet, MLflow, Weights & Biases, AWS, Azure, Google Cloud Platform