Staff Engineer, Compiler

Samsung Semiconductor

Quick summary

Work type: On-site
Location: San Jose, CA
Salary: $163,000–$253,000 / yr
Posted: 2 days ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay range of similar roles, from $150k to $271k, with a shaded band where most similar roles pay. Midpoints: similar roles $215k, this role $208k.]

This role pays less than 54% of similar roles, most of which pay $192,750–$236,900 (the shaded band in the chart). At the midpoint, this role pays about $208k, versus about $215k for comparable roles.

Based on 239 similar postings.

Employer

About Samsung Semiconductor

Samsung Semiconductor is the global semiconductor business unit of Samsung Electronics, designing and manufacturing memory chips, logic semiconductors, and foundry solutions for a broad range of applications.

Samsung Semiconductor currently has 27 open roles on FindRole.

Listed pay typically runs $163,000–$253,000 across 27 roles with salary data.

At a glance

TL;DR · Staff Engineer, Compiler

Join our team as a Staff Compiler Engineer specializing in PyTorch and Kernel DSL development, where you will adapt torch.compile to fit our backend by lowering Inductor's IR to our hardware and defining fusion strategies. You’ll build or extend kernel DSLs for our unique hardware, design placement and scheduling passes, implement parallelism-aware lowering, and engage with upstream review processes for open-source projects like PyTorch and Triton. Ideal candidates have 3-5 years of experience in technologies such as MLIR, XLA, TVM, Inductor, or similar, along with a background in HPC, distributed systems, and non-flat memory hierarchies. Experience with kernel autotuning and open-source contributions is highly valued.

Skills

PyTorch, MLIR, Triton, Helion, Inductor, XLA, TVM, IREE, CUTLASS, CUDA, XPU, ROCm, MPS, TPU, kernel DSL, HPC, distributed systems, NUMA-aware programming, autotuning, performance modeling, cost-based compilation, LLVM, open-source contributions

What you'll do

Adapt torch.compile to the backend by lowering Inductor's IR to the hardware.
Build or extend kernel DSLs for custom hardware, deciding what changes are needed in the frontend and backend.
Design placement and scheduling passes that optimize for a distributed memory model.
Implement parallelism-aware lowering for tensor, pipeline, expert, and sequence parallelism.
Contribute upstream to open-source projects like PyTorch, Triton, Helion, and MLIR.

What we're looking for

10+ years of industry experience in relevant fields or equivalent education and experience.
Experience designing a kernel DSL or making significant changes to an existing one.
Proficiency in MLIR, including writing dialects, passes, and backend integration.
Expertise in building PyTorch backends for non-CUDA accelerators such as XPU, ROCm, and TPU.
Knowledge of kernel autotuning, performance modeling, and cost-based compilation techniques.
Background in HPC, distributed systems, or NUMA-aware programming to understand non-flat memory hierarchies.
Open-source contributions to PyTorch, Triton, Helion, LLVM/MLIR, or similar projects.

Similar roles

Senior Deep Learning Compiler Engineer

Nvidia

Remote (Santa Clara, CA) · 37 days ago · $152,000–$241,500

MLIR XLA TVM LLVM PyTorch CUDA C++ Python GPU CPU Embedded_Systems Cross_Compilation CI/CD

Senior Deep Learning Tools Engineer – CUDA Tile

Nvidia

Remote (Santa Clara, CA) · 31 days ago · $152,000–$241,500

Python C++ CI/CD PyTorch TensorFlow JAX TensorRT LLVM MLIR CUDA Docker Kubernetes Prometheus Grafana PostgreSQL Git GitHub Linux

Machine Learning Compiler Engineer

Qualcomm

New York, NY · 5 days ago · $200,800–$301,200

MLIR, LLVM, PyTorch 2.0, TVM, Triton, SYCL, Python, C++, CUDA, OpenCL, Polyhedral Compiler Optimization, Loop Transformation, Vectorization, GPU Programming, High Performance Computing, CI/CD, Git, Linux, Docker

Staff Machine Learning Engineer – AI/ML Compiler

Qualcomm

Santa Clara, CA · 87 days ago · $160,500–$240,700

Python C++ ONNX PyTorch MLIR LiteRT ONNXRuntime CI/CD Git ATen QNN QAIRT TVM Prometheus Grafana

Staff ML Compiler Engineer

General Motors (GM)

Remote (Sunnyvale, CA) · 2 days ago · $185,100–$335,300

Python C++ MLIR ONNX TensorRT PyTorch TensorFlow JAX CUDA cuDNN cuBLAS CI/CD

Remote Hybrid

Deep Learning Kernel Software Performance Architect - New College Grad 2026

Nvidia

Santa Clara, CA · 51 days ago · $124,000–$195,500

Python C C++ GPU CUDA Parallel_Programming Performance_Analysis Profiling Machine_Learning Deep_Learning Computer_Architecture High_Performance_Computing Energy_Efficient_Designs Analytical_Modeling NVIDIA_CUDA AI_Compiler
