Principal AI Inference Systems Engineer

AMD

Hybrid

Quick summary

Work type: Hybrid
Location: Santa Clara, CA
Posted: 82 days ago
Closes: Apr 6, 2027
Nearby: 99+ roles within 25 mi

Market check

Salary context

How this pay compares to similar roles

[Chart: pay scale from $163k to $259k, with similar roles marked at $210k]

This listing doesn't post a salary. Most similar roles pay $173,200–$246,150.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 56 open roles on FindRole.

At a glance

TL;DR · Principal AI Inference Systems Engineer

AMD's Llama team seeks a Senior Staff AI Infra Engineer to lead technical initiatives and provide architectural guidance for optimizing AI/ML workloads on AMD GPUs. This role involves enhancing the performance of Large Language Models (LLMs) and Agentic AI systems through kernel, communication, and system-level optimizations. The ideal candidate will have 5+ years of experience in AI infrastructure and distributed systems, with expertise in C/C++ and Python, as well as a deep understanding of transformer architectures and frameworks like Megatron-LM and PyTorch Distributed. Familiarity with GPU architecture and tools such as ROCm, NCCL, and Kubernetes is essential, alongside strong problem-solving and communication skills to drive technical excellence and foster collaboration across teams.

Skills

Python, C/C++, Kubernetes, Ray, Kubeflow, Megatron-LM, DeepSpeed, PyTorch Distributed, NCCL, RCCL, MPI, GPU, ROCm, HIP, Quantization, Mixed-Precision, TP/PP/DP/ZeRO, Profiling Tools, Performance-Analysis Tools

What you'll do

Lead technical initiatives and provide architectural guidance for AI/ML infrastructure.
Optimize LLM training and inference on AMD GPUs to enhance system efficiency.
Develop infrastructure supporting Large Language Models (LLMs) and Agentic AI systems.
Design and optimize AI workloads on GPU clusters, including large-scale orchestration.
Debug and resolve complex performance issues across GPU, network, and runtime layers.
Drive technical excellence and foster innovation within the organization.

What we're looking for

5+ years of experience in AI/ML infrastructure and performance-critical software development.
Expert proficiency in C/C++ and Python for AI/ML projects.
Solid understanding of transformer architectures and distributed training frameworks such as Megatron-LM, DeepSpeed, and PyTorch Distributed.
Proven experience optimizing LLM training and inference pipelines with parallelism techniques.
Hands-on experience designing and scaling AI platforms using Kubernetes, Ray, or Kubeflow.
Familiarity with GPU architecture and communication libraries for multi-GPU training optimization.
Demonstrated technical ownership and strong problem-solving skills in delivering end-to-end AI/ML solutions.

Similar roles

Principal AI Inference Systems Engineer

AMD

Santa Clara, CA · 107 days ago

Kubernetes, SLURM, vLLM, SGLang, MPI Operator, Volcano, Kueue, Kubeflow Training Operator, GPU Operator, NCCL, RCCL, RDMA, CNI, Prometheus, Grafana, Python, CI/CD, AMD Instinct GPUs

AI Inference Performance Engineer

Nvidia

Santa Clara, CA · 111 days ago · $152,000–$241,500

Python, C++, PyTorch, JAX, TensorRT-LLM, vLLM, SGLang, CUDA, MPI, NCCL, K8S, CUTLASS, cuteDSL, tilelang, OpenAI_Triton, torch.compile, GPU, FPGA, roofline_analysis, performance_profiling

Hybrid

Principal AI Engineer

Salesforce

New York +4 · 31 days ago · $218,400–$365,200

Salesforce, Distributed Systems, CI/CD, Infrastructure-as-Code, API Integration, AI Agents, LLM Workflows, Automated Testing, Observability, Event-Driven Design, Microservices, Security & Compliance, Prompt Engineering, System Context Design, Evaluation Frameworks, GitHub Copilot, Claude Code, Cursor, Salesforce Marketing Cloud, Agentforce, Google Workspace, Slack

Principal AI Engineer

Salesforce

Remote (San Francisco, CA) +4 · 30 days ago · $197,300–$313,700

AWS, Python, GitHub Actions, ArgoCD, Terraform, Docker, Kubernetes, Grafana, Braintrust, LangSmith, CI/CD, AgentOps, Salesforce Ecosystem, Vector Databases, Graph Databases, RAG Pipelines, Snowflake, Kafka, Flink

Remote

Principal AI Performance Engineer

AMD

San Jose, CA · 108 days ago

Python, C++, vLLM, SGLang, TensorRT-LLM, HIP, CUDA, Triton, CK, Linux, GPU, AI agents, CI/CD, PyTorch, Kubernetes

Hybrid