Senior DL Algorithms Engineer - Inference Performance

Nvidia

Remote Actively hiring
Remote, US · Santa Clara, CA · Posted 24 days ago · $152,000–$241,500 / year

At a glance

AI generated

TL;DR

As a Senior DL Algorithms Engineer, you will optimize large language models and omni models for peak inference performance. You will enable state-of-the-art open-source models such as Nemotron and Cosmos on NVIDIA’s accelerated inference software stack, and contribute to frameworks such as TRT-LLM and vLLM by profiling bottlenecks and delivering production code. The role also involves benchmarking competitive offerings and collaborating with partner teams to develop next-generation AI services. Ideal candidates have a PhD in CS, EE, or CSEE (or equivalent), 3+ years of relevant industry experience, expertise in deep learning inference and performance optimization for GPU-based applications, and proficiency in PyTorch or similar frameworks. Strong skills in computer architecture, GPU programming (CUDA/OpenCL), and algorithm fundamentals are essential.

Skills

PyTorch · NVIDIA TRT-LLM · vLLM · SGLang · FlashInfer · GPU architecture · CUDA · OpenCL · Deep Learning · Neural Networks · Performance profiling · HPC · Computer Architecture · Python · C++

What you'll do

  • Enable and optimize state-of-the-art open models on NVIDIA’s accelerated inference software stack.
  • Contribute new features, fix bugs, and deliver production code to open-source deep learning frameworks.
  • Profile and analyze bottlenecks across the full inference stack to enhance performance.
  • Benchmark state-of-the-art offerings and perform competitive analysis for NVIDIA’s SW/HW stack.
  • Co-design next-generation AI models and services with partner teams.

What we're looking for

  • PhD in CS, EE, CSEE, or equivalent; 3+ years of relevant industry experience.
  • Expertise in deep learning and neural networks, focusing on inference optimization.
  • Proficiency in performance profiling, analysis, and optimization for GPU-based applications.
  • Strong background in computer architecture, including GPU fundamentals.
  • Experience with processor and system-level performance optimization techniques.
  • Familiarity with modern LLM/Diffusion architectures and open-source frameworks.

Market check

Salary context

This $152,000–$241,500 range sits above 39% of similar postings on FindRole.

Peer median band

$166,000–$253,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$182,125–$238,250

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Deep Learning Software Engineer, Inference

Nvidia

Remote (Santa Clara, CA, US) · 24 days ago · $184,000–$287,500
C++ · Python · CUDA · NCCL · NVSHMEM · OAI Triton · CUTLASS · PyTorch · vLLM · SGLang · FlashInfer · Multi-GPU Communications · Deep Learning Frameworks · Performance Optimization · GPU Acceleration

Senior Software Engineer - AI Inference

Nvidia

Remote (Santa Clara, CA, US) · 45 days ago · $152,000–$241,500
Python · C++ · CUDA · vLLM · SGLang · PyTorch · Triton · NCCL · Dynamo · CI/CD · GPU · InfiniBand · Profiling · Flamegraphs · Microbenchmarks · Concurrency · Multi-threading · Multi-process · Kubernetes · Docker · PostgreSQL

Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA, US · 49 days ago · $152,000–$241,500
C++ · Python · CUDA · Rust · TensorRT · TensorRT-LLM · vLLM · SGLang · PyTorch · JAX · Deep Learning Frameworks · GPU Programming · Performance Analysis · Optimization Techniques · CI/CD

Senior ML Infrastructure Engineer, Inference Platform

General Motors (GM)

Remote (GM Automation - Sunnyvale, US) · 77 days ago · $155,420–$205,900
Python · Triton · RayServe · vLLM · C++ · Kubernetes · Docker · CI/CD · Prometheus · Grafana · PostgreSQL · Redis · AWS · Azure · Google Cloud Platform · Git · Jenkins · GitHub · Slack · Confluence · Jira