Principal AI Inference Systems Engineer in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Quick summary

Work type: Hybrid
Location: Santa Clara, CA
Salary: $226,400 / yr
Posted: 66 days ago
Closes: Apr 6, 2027
Nearby: 99+ roles within 25 mi

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $177k to $257k with a shaded band where most similar roles pay; markers at about $215k for similar roles and about $226k for this role]

This role pays more than 55% of similar roles. Most pay $184,712–$246,150 (the shaded band above). At the midpoint, this role pays about $226k versus about $215k for comparable roles.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 71 open roles on FindRole.

Listed pay typically runs $178,400 across 71 roles with salary data.

At a glance

AMD's Llama team seeks a Senior Staff AI Infra Engineer to lead technical initiatives and provide architectural guidance for optimizing AI/ML workloads on AMD GPUs. This role involves enhancing the performance of Large Language Models (LLMs) and Agentic AI systems through kernel, communication, and system-level optimizations. The ideal candidate will have 5+ years of experience in AI infrastructure and distributed systems, with expertise in C/C++ and Python, as well as a deep understanding of transformer architectures and frameworks like Megatron-LM and PyTorch Distributed. Familiarity with GPU architecture and tools such as ROCm, NCCL, and Kubernetes is essential, alongside strong problem-solving and communication skills to drive technical excellence and foster collaboration across teams.

Skills

Python, C/C++, Kubernetes, Ray, Kubeflow, Megatron-LM, DeepSpeed, PyTorch Distributed, NCCL, RCCL, MPI, GPU, ROCm, HIP, Quantization, Mixed-Precision, TP/PP/DP/ZeRO, Profiling Tools, Performance-Analysis Tools

What you'll do

Lead technical initiatives and provide architectural guidance for AI/ML infrastructure.
Optimize LLM training and inference on AMD GPUs to enhance system efficiency.
Develop infrastructure supporting Large Language Models (LLMs) and Agentic AI systems.
Design and optimize AI workloads on GPU clusters, including large-scale orchestration.
Debug and resolve complex performance issues across GPU, network, and runtime layers.
Drive technical excellence and foster innovation within the organization.

What we're looking for

5+ years of experience in AI/ML infrastructure and performance-critical software development.
Expert proficiency in C/C++ and Python for AI/ML projects.
Solid understanding of transformer architectures and distributed training frameworks such as Megatron-LM, DeepSpeed, and PyTorch Distributed.
Proven experience optimizing LLM training and inference pipelines with parallelism techniques.
Hands-on experience designing and scaling AI platforms using Kubernetes, Ray, or Kubeflow.
Familiarity with GPU architecture and communication libraries for multi-GPU training optimization.
Demonstrated technical ownership and strong problem-solving skills in delivering end-to-end AI/ML solutions.

Similar roles

Principal AI Inference Systems Engineer in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Santa Clara, CA 91 days ago $237,200–$237,200

Kubernetes SLURM vLLM SGLang MPI Operator Volcano Kueue Kubeflow Training Operator GPU Operator NCCL RCCL RDMA CNI Prometheus Grafana Python CI/CD AMD Instinct GPUs

Lead Gen AI / ML Engineer in Austin, Texas | Advanced Micro Devices, Inc

AMD

Austin, TX 64 days ago $168,000–$168,000

Python Scikit-Learn PyTorch TensorFlow SQL MLOps ETL AWS Azure GCP CI/CD Langgraph Prometheus Kubernetes

Hybrid

Fellow, AI Workload Optimization in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA 90 days ago $256,000–$256,000

PyTorch JAX vLLM SGLang ROCm Distributed Systems Multi-node/Multi-GPU Performance Profiling TorchProfiler ROCM Profiler Nsight Transformer Models Attention Mechanisms Quantization Speculative Decoding FlashAttention Cross-functional Collaboration Deep Learning Large Language Models Computer Vision

Hybrid

Principal AI Performance Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA 92 days ago $240,000–$240,000

Python C++ vLLM SGLang TensorRT-LLM HIP CUDA Triton CK Linux GPU AI agents CI/CD PyTorch Kubernetes

Hybrid

Principal Software Development Eng. - AI Performance in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA 105 days ago $240,000–$240,000

CUDA HIP Python C++ LLVM MLIR Triton Gluon PyTorch vLLM SGLang xDiT Megatron LM Linux GPU HPC AI systems roofline analysis performance engineering multi-GPU communication

Hybrid

Principal GenAI Inference Optimization Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

San Jose, CA 76 days ago $240,000–$240,000

Python C++ CUDA HIP vLLM SGLang Triton TensorRT-LLM PyTorch JAX TensorFlow AMD GPUs PCIe RDMA Distributed systems Profiling tools Benchmarking tools Performance analysis tools CI/CD

Hybrid
