Principal AI Performance Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

Quick summary

Work type: Hybrid
Location: San Jose, CA
Salary: $240,000 / yr
Posted: 97 days ago
Closes: Mar 11, 2027
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

[Pay comparison chart, axis from $161k to $255k: similar roles $208k, this role $240k; shaded band labeled "most similar roles pay here".]

This role pays more than 70% of similar roles. Most pay $170,000–$246,150, the shaded band on the chart. At the midpoint, this role pays about $240k versus about $208k for comparable roles.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets.

Industry: Semiconductors

AMD currently has 64 open roles on FindRole.

Listed pay is typically about $190,000 across 64 roles with salary data.

At a glance

AMD seeks a Principal AI Performance Engineer to lead a small technical team optimizing AI inference performance on AMD GPUs for strategic customer engagements. The role covers end-to-end stack optimization of leading models and configurations, from profiling and diagnosing kernel-level bottlenecks to presenting optimizations to senior stakeholders. The ideal candidate has deep expertise in GPU computing and AI serving frameworks such as vLLM and SGLang, and is proficient in Python and C++. They must excel at customer-facing technical leadership, use AI agents daily to improve workflows, and develop reusable optimization methodologies. The position demands a performance-obsessed mindset for tackling complex challenges across multi-node distributed systems and delivering measurable impact on AMD’s competitive edge in the AI market.

Skills

Python, C++, vLLM, SGLang, TensorRT-LLM, HIP, CUDA, Triton, CK, Linux, GPU, AI agents, CI/CD, PyTorch, Kubernetes

What you'll do

Drive end-to-end performance optimization on AMD GPUs for leading AI models.
Profile and resolve complex cross-stack bottlenecks in GPU kernels and frameworks.
Diagnose kernel-level issues using profiling tools to enhance model performance.
Lead customer engagements by presenting technical findings and optimizations.
Develop custom kernels within serving frameworks to improve dispatch efficiency.
Optimize multi-node distributed inference for better communication-compute overlap.
Define and refine performance optimization methodologies for the broader team.

What we're looking for

7+ years of software development experience in GPU computing, AI systems, or high-performance computing.
Deep hands-on experience with AI serving frameworks and their internals, such as vLLM, SGLang, and TensorRT-LLM.
Strong background in end-to-end workload profiling and bottleneck diagnosis from user request to GPU kernel.
Expertise in GPU kernel performance characteristics such as occupancy, memory coalescing, cache utilization, and instruction-level bottlenecks.
Experience with custom kernel development or integration using HIP, CUDA, Triton, CK, or similar technologies.
Understanding of multi-GPU and multi-node distributed systems, including scale-up and scale-out topologies, RDMA, and communication-compute overlap.
Fluency in AI-assisted development, using AI agents and tools daily to accelerate workflows.

Similar roles

Principal Software Development Eng. - AI Performance in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA | 110 days ago | $240,000

CUDA, HIP, Python, C++, LLVM, MLIR, Triton, Gluon, PyTorch, vLLM, SGLang, xDiT, Megatron-LM, Linux, GPU, HPC, AI systems, roofline analysis, performance engineering, multi-GPU communication

Hybrid

Principal AI Inference Systems Engineer in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Santa Clara, CA | 96 days ago | $237,200

Kubernetes, SLURM, vLLM, SGLang, MPI Operator, Volcano, Kueue, Kubeflow Training Operator, GPU Operator, NCCL, RCCL, RDMA, CNI, Prometheus, Grafana, Python, CI/CD, AMD Instinct GPUs

Sr. Manager Software Development, AI Models and Applications in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA +1 | 147 days ago | $221,600

PyTorch, JAX, vLLM, SGLang, NeurIPS, CVPR, ECCV, ICCV, ICML, ICLR, GPU, CPU, AIE, Low-precision training, Model quantization, Parallelism strategies, Efficient model architectures, 3D World Model, Agentic workflows

Hybrid

Principal AI Inference Systems Engineer in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Santa Clara, CA | 71 days ago | $226,400

Python, C/C++, Kubernetes, Ray, Kubeflow, Megatron-LM, DeepSpeed, PyTorch Distributed, NCCL, RCCL, MPI, GPU, ROCm, HIP, Quantization, Mixed-Precision, TP/PP/DP/ZeRO, Profiling Tools, Performance-Analysis Tools

Hybrid

Fellow, AI Workload Optimization in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA | 95 days ago | $256,000

PyTorch, JAX, vLLM, SGLang, ROCm, Distributed Systems, Multi-node/Multi-GPU, Performance Profiling, TorchProfiler, ROCm Profiler, Nsight, Transformer Models, Attention Mechanisms, Quantization, Speculative Decoding, FlashAttention, Cross-functional Collaboration, Deep Learning, Large Language Models, Computer Vision

Hybrid

Sr. Staff Software Engineer - AI Agentic Infrastructure & Systems in San Jose, California | Advanced Micro Devices, Inc

AMD

CA | 83 days ago | $204,000

LangChain, LangGraph, AutoGen, Python, C/C++, Linux, CI/CD, SWE-bench, LangSmith, Docker, Kubernetes, Terraform, AWS, GitHub, GitLab, JTAG, xutil, dmesg, Vitis AIE

Hybrid
