Senior AI Inference Compiler Engineer

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA · Austin, TX
Salary: $152,000–$241,500 / yr
Posted: 102 days ago

Market check

Salary context

Below market

How this pay compares to similar roles

[Chart: this role at $197k versus similar roles at $221k, on a scale from $139k to $274k, with a shaded band marking where most similar roles pay.]

This role pays less than 68% of similar roles, most of which pay $195,000–$246,150. At the midpoint, this role pays about $197k versus about $221k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 563 open roles on FindRole.

Listed pay typically runs $168,000–$264,500 across 556 roles with salary data.

At a glance

TL;DR · Senior AI Inference Compiler Engineer

As an AI & Deep Learning Compiler Engineer on NVIDIA’s DLC team, you will develop compiler intermediate representations (IR), programming models, and optimizations for future GPU architectures. The role involves close collaboration with deep learning software framework teams and hardware architects to improve the performance of next-generation inference engines across data centers, personal devices, automotive, and robotics. Key responsibilities include defining public APIs, conducting performance analyses, implementing compiler optimizations, and generating high-performance GPU kernels for neural networks. Ideal candidates have a strong background in compiler technologies such as MLIR, XLA, and LLVM, experience with deep learning frameworks such as PyTorch, and proficiency in GPU architecture and kernel generation.

Skills

MLIR · XLA · LLVM · PyTorch · GPU · CUDA · C++ · Compiler Technologies · Deep Learning Models · LLM Inference Optimizations · High Performance Computing · Fast Build Time · Kernel Generation · Neural Networks · Software Engineering

What you'll do

Develop compiler IR and programming models for future GPU architectures.
Optimize compilers for leading inference performance and reduced memory footprints.
Implement Ahead-of-Time and Just-in-Time compilation techniques for efficiency.
Collaborate on defining public APIs and performance optimizations for deep learning.
Generate high-performance GPU kernels with fast build times.

What we're looking for

Bachelor’s degree or higher in a relevant field.
Experience with compiler technologies like MLIR, XLA, and LLVM.
Strong understanding of deep learning models and frameworks (e.g., PyTorch).
Proficiency in GPU architecture and kernel generation for high performance.
Knowledge of LLM inference optimizations and techniques.
Ability to work effectively in a fast-paced, dynamic team environment.
