Senior Deep Learning Software Engineer, Inference

Nvidia

Remote Actively hiring
Remote, US · Santa Clara, CA · Posted 24 days ago · $184,000–$287,500 / year

At a glance

AI generated

TL;DR

As a Senior Software Engineer specializing in Deep Learning Inference at NVIDIA, you will join the team that develops high-performance deep learning inference frameworks such as SGLang and vLLM. Day to day, you will optimize GPU-accelerated software for AI applications, implement new algorithms, and improve model serving pipelines using tools such as CUTLASS, OAI Triton, NCCL, and CUDA kernels. The work focuses on performance across NVIDIA accelerators, from data centers to edge devices, so that large language models deploy efficiently. Ideal candidates have a strong background in C/C++ programming, experience with deep learning frameworks like PyTorch, and knowledge of GPU programming (CUDA). Prior work with multi-GPU communication libraries such as NCCL is also beneficial.

Skills

C++, Python, CUDA, NCCL, NVSHMEM, OAI Triton, CUTLASS, PyTorch, vLLM, SGLang, FlashInfer, Multi-GPU Communications, Deep Learning Frameworks, Performance Optimization, GPU Acceleration

What you'll do

  • Optimize performance of deep learning models across various domains including LLM, multimodal, and generative AI.
  • Enhance scalability of DL models on different NVIDIA accelerator architectures.
  • Develop features and optimize code for the open-source inference libraries vLLM and SGLang.
  • Implement and improve model serving pipelines using open-source tools like CUTLASS and OAI Triton.
  • Collaborate with the deep learning community to integrate latest algorithms into frameworks such as PyTorch, vLLM, and SGLang.

What we're looking for

  • 6+ years of relevant software development experience with a focus on deep learning inference.
  • Strong C/C++ programming skills and proficiency in GPU programming (CUDA).
  • Experience optimizing DL models for performance across various NVIDIA accelerators.
  • Background in multi-GPU communications using tools like NCCL and NVSHMEM.
  • Contributions to open-source deep learning software projects such as PyTorch, vLLM, and SGLang.
  • Knowledge of Python and experience with software design and agile development methodologies.

Market check

Salary context

This $184,000–$287,500 range sits above 75% of similar postings on FindRole.

Peer median band

$160,900–$241,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$182,125–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Deep Learning Software Engineer

Nvidia

US · 85 days ago · $224,000–$356,500
Python, PyTorch, JAX, CUDA, TensorRT, NVIDIA TensorRT LLM, GPU optimization, CUTLASS, Triton, Deep learning frameworks, Performance analysis, GPU architecture, High performance computing, Model inference, Inference optimization

Senior Deep Learning Software Engineer, LLM Performance

Nvidia

Santa Clara, CA, US · 43 days ago · $184,000–$287,500
Python, C++, CUDA, TensorRT, Triton, PyTorch, JAX, TensorFlow, VLLM, SGLang, DL compiler, Performance modeling, Profiling, Debugging, Code optimization, GPU programming, Deep learning framework, CI/CD

Senior Software Engineer - AI Inference

Nvidia

Remote (Santa Clara, CA, US) · 45 days ago · $152,000–$241,500
Python, C++, CUDA, vLLM, SGLang, PyTorch, Triton, NCCL, Dynamo, CI/CD, GPU, InfiniBand, Profiling, Flamegraphs, Microbenchmarks, Concurrency, Multi-threading, Multi-process, Kubernetes, Docker, PostgreSQL

Senior Software Engineer, Machine Learning Inference

Nvidia

Santa Clara, CA, US · 49 days ago · $152,000–$241,500
C++, Python, CUDA, Rust, TensorRT, TensorRT-LLM, vLLM, SGLang, PyTorch, JAX, Deep Learning Frameworks, GPU Programming, Performance Analysis, Optimization Techniques, CI/CD