Principal AI Inference Systems Engineer in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Hybrid

Quick summary

Work type: Hybrid
Location: Santa Clara, CA
Salary: $226,400 / yr
Posted: 66 days ago
Closes: Apr 6, 2027

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Chart: similar roles about $215k; this role about $226k; scale from $177k to $257k, with a shaded band where most similar roles pay.

This role pays more than 55% of similar roles. Most pay $184,712–$246,150 — the shaded band above. At the midpoint, this role pays about $226k versus about $215k for comparable roles.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 71 open roles on FindRole.

Listed pay typically runs $178,400 across 71 roles with salary data.

At a glance

TL;DR

AMD's Llama team seeks a Principal AI Inference Systems Engineer to lead technical initiatives and provide architectural guidance for optimizing AI/ML workloads on AMD GPUs. The role involves improving the performance of Large Language Models (LLMs) and Agentic AI systems through kernel, communication, and system-level optimizations. The ideal candidate has 5+ years of experience in AI infrastructure and distributed systems, expertise in C/C++ and Python, and a deep understanding of transformer architectures and frameworks such as Megatron-LM and PyTorch Distributed. Familiarity with GPU architecture and tools such as ROCm, NCCL, and Kubernetes is essential, as are strong problem-solving and communication skills to drive technical excellence and foster collaboration across teams.

What you'll do

  • Lead technical initiatives and provide architectural guidance for AI/ML infrastructure.
  • Optimize LLM training and inference on AMD GPUs to enhance system efficiency.
  • Develop infrastructure supporting Large Language Models (LLMs) and Agentic AI systems.
  • Design and optimize AI workloads on GPU clusters, including large-scale orchestration.
  • Debug and resolve complex performance issues across GPU, network, and runtime layers.
  • Drive technical excellence and foster innovation within the organization.

What we're looking for

  • 5+ years of experience in AI/ML infrastructure and performance-critical software development.
  • Expert proficiency in C/C++ and Python for AI/ML projects.
  • Solid understanding of transformer architectures and distributed training frameworks such as Megatron-LM, DeepSpeed, and PyTorch Distributed.
  • Proven experience optimizing LLM training and inference pipelines with parallelism techniques.
  • Hands-on experience designing and scaling AI platforms using Kubernetes, Ray, or Kubeflow.
  • Familiarity with GPU architecture and communication libraries for multi-GPU training optimization.
  • Demonstrated technical ownership and strong problem-solving skills in delivering end-to-end AI/ML solutions.
