Senior AI Software Engineer, Kernel Libraries

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $184,000–$287,500 / yr
Posted: 4 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Chart: similar roles $215k; this role $236k; axis range $166k–$301k.

This role pays more than 72% of similar roles. Most similar roles pay $184,712–$246,150. At the midpoint, this role pays about $236k, versus about $215k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 563 open roles on FindRole.

Listed pay typically runs $168,000–$264,500 across 556 roles with salary data.

At a glance

TL;DR · Senior AI Software Engineer, Kernel Libraries

Join our team of AI systems engineers at NVIDIA to develop cutting-edge technologies for accelerating AI inference. As a senior engineer, you will design and build innovative libraries, code generators, and GPU kernel technologies tailored for NVIDIA’s hardware architecture, focusing on efficient attention kernels, LLM inference runtimes, and domain-specific compilers. You’ll collaborate closely with cross-functional teams across deep learning frameworks, libraries, and GPU architectures to optimize high-impact AI workloads. Ideal candidates hold a master's degree in Computer Science, Electrical Engineering, or a related field (PhD preferred) and have extensive experience developing or using deep learning frameworks such as PyTorch and TensorFlow, along with expertise in inference engines such as vLLM and SGLang, machine learning compilers such as Apache TVM, and GPU kernel development in CUDA C/C++.

Skills

PyTorch, TensorFlow, JAX, ONNX, vLLM, SGLang, MLC, FlashInfer, Apache TVM, MLIR, CUDA C/C++, cuTile, Triton, NVIDIA GPU Architecture, Domain-Specific Compilers, Open Source Contributions

What you'll do

Design and implement new abstractions for LLM serving engines.
Develop efficient attention kernel implementations for AI workloads.
Build just-in-time domain-specific compilers and runtimes for AI inference.
Optimize GPU kernels using CUDA C/C++, cuTile, Triton, or similar tools.
Contribute to open source communities like FlashInfer, vLLM, and SGLang.

What we're looking for

Master's degree in Computer Science, Electrical Engineering, or a related field; PhD preferred
6+ years of experience with ML/DL systems development
Strong experience developing or using deep learning frameworks and inference engines
Expertise in domain-specific compiler solutions for LLM inference and training
Experience in GPU kernel development and performance optimization using CUDA C/C++
Open source project ownership or significant contributions
Knowledge of machine learning compilers like Apache TVM and MLIR