Senior Software Engineer, CUTLASS Kernels

Nvidia

Santa Clara, CA · Austin, TX · Hillsboro, OR · Durham, NC · Redmond, WA · Posted 3 days ago · $152,000–$241,500 / year

Market check

Salary context

Above market

How this pay compares to similar roles

Midpoint pay: similar roles $180k, this role $197k (chart scale: $105k to $256k).

This role pays more than 68% of similar roles. Most similar roles pay $142,400–$217,725. At the midpoint, this role pays about $197k versus about $180k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 855 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 843 roles with salary data.

At a glance

TL;DR

Join NVIDIA’s CUTLASS team as a senior software engineer to develop high-performance linear algebra and Tensor Core primitives for AI applications. You will write Tensor Core-based deep learning kernels using CUTLASS CUDA C++ and the Python DSL, and optimize them for peak throughput on both silicon and software performance simulators. You will collaborate closely with GPU architecture, compiler, library, and framework teams to deliver efficient kernels. Ideal candidates hold a Master's or PhD in Computer Science or a related field (or have equivalent experience), with 3+ years of industry experience, strong C++ skills, and expertise in CUDA, PTX, and Tensor Core programming. Open-source contributions to math kernel libraries are preferred. The role involves working on hardware architectures such as Blackwell and Rubin, with the goal of enabling real-time, cost-effective AI computing.

What you'll do

  • Write Tensor Core-based deep learning kernels using CUTLASS CUDA C++ and Python DSL for NVIDIA GPUs.
  • Optimize kernels to achieve peak throughput on both silicon and software performance simulators.
  • Develop custom matrix multiply (GEMM) and related math computations for high-performance linear algebra.
  • Collaborate with GPU architecture teams to ensure timely delivery of optimized kernels to customers.
  • Contribute to open-source projects focused on math kernel libraries or frameworks.

What we're looking for

  • Master's or PhD in Computer Science/Engineering or equivalent experience.
  • 3+ years of industry experience in relevant technical roles.
  • Proficiency in C++ programming and software design.
  • Experience with CUDA and other parallel programming languages.
  • Deep understanding of computer architecture, including assembly-level work.
  • Experience writing code for NVIDIA Tensor Cores using PTX/CUDA/cuTile.
  • Contributions to open-source math kernel libraries or frameworks.

More like this

Similar roles

Senior Software Engineer, CUTLASS Platform

Nvidia

Santa Clara, CA · 3 days ago · $152,000–$241,500
C++, CUDA, Python, MLIR, NVVM, PTX, High-performance computing, Compiler design, Deep learning frameworks, Computer architecture, Parallel computing, Performance optimization, Software design, Debugging, Testing

Careers

Qualcomm

US · 37 days ago
C, Linux Kernel, ARM CoreSight, Windows Development Environment, Visual Studio, LLVM Compiler, Windows Performance Analyzer, Python, Perl, Assembly, C++, Security Architecture, CPU Architecture, Memory and Bus Architecture, Interprocessor Communications, Reset Controller, Hardware Crash Debug Sequence, ETM, Compiler Technology, JIT Technologies

Senior System Software Engineer, Holoscan

Nvidia

Remote (Santa Clara, CA) · 31 days ago · $184,000–$287,500
C/C++, Python, Docker, Bash, CMake, AI/ML, LLM-based automation, Cross-compilation, Embedded systems, Linux internals, Security principles, Vulnerability management, Patch processes, Yocto-based distributions, Custom embedded Linux environments, Medical AI applications, Real-time sensor processing pipelines, CI/CD

Software Engineer, Senior

Booz Allen Hamilton

MD · 52 days ago · $86,900–$198,000
React, Next.JS, Git, Jenkins, GitLab CI/CD, Express, Flask, Spring, FastAPI, Python, Docker, Kubernetes, Elasticsearch, Kibana, Redis, Kafka, Nginx, AWS, HAProxy, Grafana