Fellow, AI Hardware Architecture and Software Optimization Engineer (Workload Optimization) in San Jose, California | Advanced Micro Devices, Inc

AMD

Actively hiring

US Posted 76 days ago $256,000–$256,000 / year


At a glance

AI generated

TL;DR

Join the AI Software group at AMD as a Fellow and lead the end-to-end software optimization strategy that delivers industry-leading performance for top-tier customers. You will define technical visions and roadmaps, work with key customers to resolve critical performance issues, and collaborate across teams to influence future silicon features based on evolving AI workload trends. Drawing on deep expertise in AI frameworks such as PyTorch and in the ROCm software stack, you’ll optimize distributed inference and training at scale using tools such as TorchProfiler and Nsight. The role requires 15+ years of software development experience, including at least 5 years in high-level technical leadership, and a strong background in modern model architectures and optimization techniques.

Skills

PyTorch, JAX, vLLM, SGLang, ROCm, Distributed Systems, Multi-node/Multi-GPU, Performance Profiling, TorchProfiler, ROCm Profiler, Nsight, Transformer Models, Attention Mechanisms, Quantization, Speculative Decoding, FlashAttention, Cross-functional Collaboration, Deep Learning, Large Language Models, Computer Vision

What you'll do

Define and drive the end-to-end software optimization strategy for industry-leading performance.
Lead profiling, analysis, and tuning of large-scale AI models to ensure optimal performance on AMD hardware.
Engage with top-tier customers to understand unique workload requirements and deliver tailored optimizations.
Influence future silicon features by collaborating across hardware architecture and software teams.
Develop advanced tools and frameworks for performance estimation and automated reporting in the AI ecosystem.

What we're looking for

Over 15 years of software development experience with at least 5 years in high-level technical leadership.
Deep expertise in AI frameworks such as PyTorch and in the ROCm software stack.
Proven history of optimizing distributed inference and training across multi-node/multi-GPU environments.
Mastery of performance profiling tools and hardware-level performance modeling techniques.
Strong understanding of modern model architectures and of optimization techniques such as quantization and speculative decoding.
Demonstrated ability to drive cross-functional initiatives in fast-paced, ambiguous environments.

Market check

Salary context

This $256,000–$256,000 range sits above 89% of similar postings on FindRole.

Peer median band

$164,540–$240,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$164,470–$236,600

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 80 open roles on FindRole.

Listed pay typically runs $176,400–$176,400 across 80 roles with salary data.

Similar roles

Staff Software Development Engineer- GPU, LLM, AI in Santa Clara, California | Advanced Micro Devices, Inc

AMD

US 73 days ago $145,600–$145,600

C++, CUDA, HIP, ROCm, PyTorch, TensorFlow, JAX, LLMs, Transformer architectures, Attention mechanisms, Mixture-of-Experts, Quantization, Speculative decoding, Agentic AI, Reinforcement Learning, Supervised Fine-Tuning, AMD ROCm Profiler, NVIDIA Nsight, cuBLAS, cuDNN, CUTLASS, Thrust, NCCL, rocBLAS, hipDNN

Sr. Fellow, ML Workload Performance in San Jose, California | Advanced Micro Devices, Inc

AMD

US 135 days ago $292,000–$292,000

Python, C++, CUDA, TensorFlow, PyTorch, AMD GPUs, MLOps, Distributed Systems, CI/CD, Performance Modeling, Benchmarking, LLMs, Diffusion Models, Multimodal Systems, RecSys, Generative AI, Kernel Optimization, Hardware-Software Co-design

Lead Gen AI / ML Engineer in Austin, Texas | Advanced Micro Devices, Inc

AMD

US 49 days ago $168,000–$168,000

Python, Scikit-Learn, PyTorch, TensorFlow, SQL, MLOps, ETL, AWS, Azure, GCP, CI/CD, LangGraph, Prometheus, Kubernetes

Sr. Fellow Machine Learning Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

US 100 days ago $268,000–$268,000

Python, PyTorch, JAX, TensorFlow, TorchTitan, Megatron-LM, Distributed Systems, AMD GPUs, CI/CD, Git, C++, LLMs, Recommendation Systems, Ranking Models

Principal AI Performance Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

US 78 days ago $240,000–$240,000

Python, C++, vLLM, SGLang, TensorRT-LLM, HIP, CUDA, Triton, CK, Linux, GPU, AI agents, CI/CD, PyTorch, Kubernetes

Hardware Development Engineer in San Jose, California | Advanced Micro Devices, Inc

AMD

US 99 days ago $160,000–$160,000

Cadence Allegro, Python, C, C++, Ansys SIwave, HFSS, SolidWorks, Sigrity, AMD/Xilinx FPGA, 3DIC Technologies, Advantest T2000, 93K ATE, High-Speed PCB Design, PDN Design, Multiphase Power Regulation, SI/PI simulation