AI Software Engineer, Kernel Libraries - New College Grad 2026

Nvidia

Quick summary

Work type: On-site
Location: Santa Clara, CA
Salary: $124,000–$195,500 / yr
Posted: 3 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Below market

How this pay compares to similar roles

Salary comparison chart: similar roles about $198k; this role about $160k. Scale runs from $110k to $252k, with a shaded band marking where most similar roles pay.

About 75% of similar roles pay more than this one. Most pay $159,937–$235,750, the shaded band above. At the midpoint, this role pays about $160k versus about $198k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 997 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 984 roles with salary data.

At a glance

Join our team of AI systems engineers at NVIDIA to develop cutting-edge technologies for accelerating AI inference. You will design and build libraries, code generators, and GPU kernel technologies tailored for NVIDIA’s hardware architecture, focusing on efficient attention kernels, LLM inference runtime components, and just-in-time domain-specific compilers. You will collaborate closely with cross-functional teams across deep learning frameworks, libraries, and GPU architectures to innovate and optimize high-impact AI workloads. Ideal candidates have a master's degree in Computer Science or Electrical Engineering, 2+ years of experience in ML/DL systems development, and expertise in Python, C/C++, and GPU kernel development using CUDA C/C++. A strong background in domain-specific compiler solutions for LLM inference engines like vLLM and SGLang is essential.

Skills

Python, C++, CUDA, cuTelemetry, Triton, PyTorch, JAX, TensorFlow, ONNX, vLLM, SGLang, MLIR, FlashInfer, Apache TVM, NVIDIA GPU Architecture, Deep Learning Frameworks, Domain-Specific Compilers

What you'll do

Design and implement new abstractions for LLM serving engines.
Develop efficient attention kernel implementations for AI workloads.
Build just-in-time domain-specific compilers and runtimes for AI inference.
Optimize GPU kernels to accelerate large language models and agents.
Contribute to open-source communities like FlashInfer, vLLM, and SGLang.

What we're looking for

Master's degree in Computer Science, Electrical Engineering, or a related field; PhD preferred
2+ years of experience in ML/DL systems development
Strong expertise in deep learning frameworks and inference engines
Proficiency in Python and C/C++ programming
Experience with domain-specific compiler solutions for LLM inference
Expertise in GPU kernel development and performance optimizations
Contributions to open-source projects like FlashInfer, vLLM, and SGLang

Similar roles

AI and FSI Developer Technology Engineer - New College Grad 2026

Nvidia

Remote (Santa Clara, CA) +1 · 70 days ago · $124,000–$195,500

CUDA C/C++, GPU, CPU, TensorRT, TensorRT-LLM, cuTile, Python, Linux, NVIDIA HPC, CI/CD, Git, Docker, Kubernetes, PostgreSQL, Redis, MongoDB, AWS, Azure, Grafana, Prometheus

Remote

Deep Learning Kernel Software Performance Architect - New College Grad 2026

Nvidia

Santa Clara, CA · 63 days ago · $124,000–$195,500

Python, C, C++, GPU, CUDA, Parallel Programming, Performance Analysis, Profiling, Machine Learning, Deep Learning, Computer Architecture, High-Performance Computing, Energy-Efficient Designs, Analytical Modeling, NVIDIA CUDA, AI Compiler

AI Chip Design Engineer - New College Grad 2026

Nvidia

Santa Clara, CA · 79 days ago · $116,000–$189,750

Python, LangChain, LangGraph, AutoGen, CrewAI, RAGs, vector databases, prompt engineering, knowledge graphs, Verilog, System Verilog, temporal logic assertions, RFL, GPU architectures, CPU architectures, HW verification methodologies

Software Engineer, TensorRT Specialized Platforms - New College Grad 2025

Nvidia

Santa Clara, CA · 20 days ago · $124,000–$195,500

C++, CUDA, Python, Modern C++ standards, C++ Standard Template Library (STL), Deep learning models, Performance optimization, Systems programming, Embedded systems, Compiler concepts, Software performance analysis, Profiling techniques, Computer architecture, Memory management, Parallel computing concepts

Deep Learning Computer Architect - New College Grad 2026

Nvidia

Santa Clara, CA +1 · 17 days ago · $124,000–$195,500

C++, Python, CUDA, PyTorch, GPU, Computer Architecture, Performance Analysis, Deep Learning Kernels, LLM Workloads, Parallelization, Fusion Strategies

Hybrid

Senior Applied AI Software Engineer (AI)

Humana

New York, NY +2 · 13 days ago · $141,100–$194,000

Python, React, Vue, Solid, Angular, LangChain, LlamaIndex, Semantic Kernel, Prompt engineering, Model selection, Context management, Generative AI deployment, Claude Code, Cursor, Replit, AI system evaluation, Observability, Reliability, Fallback strategies, Safety guardrails, Latency optimization, Rate-limit management

Hybrid