Senior Software Development Engineer – LLM Inference Framework in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Quick summary

Work type: On-site
Location: Santa Clara, CA
Salary: $204,000 / yr
Posted: 10 days ago
Closes: Jun 1, 2027

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: salary comparison. This role: $204k. Similar roles: $186k. Scale runs from $140k to $241k, with a shaded band marking where most similar roles pay.]

This role pays more than 56% of similar roles, most of which pay $150,000–$222,000. At the midpoint, this role pays about $204k versus about $186k for comparable roles.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets.

Industry: Semiconductors

AMD currently has 71 open roles on FindRole.

Listed pay typically runs $178,400 across 71 roles with salary data.


At a glance

TL;DR


As a senior member of the LLM inference framework team, you will architect and optimize production-grade single-node and distributed inference runtimes for large language models on AMD GPUs, focusing on tensor parallelism, pipeline parallelism, expert parallelism (MoE), and multi-node inference at scale. Your daily tasks include driving performance and scalability improvements across GPU clusters, implementing efficient multi-node inference pipelines using RCCL and RDMA, and collaborating with kernel and compiler teams to enhance end-to-end performance. Key skills required are hands-on experience with vLLM, SGLang, or similar stacks, expertise in Python and C/C++, and a strong background in AMD GPU architectures and kernel development. This role also involves upstreaming features into open-source frameworks and enabling customer deployments on AMD platforms, making it ideal for systems-minded ML engineers who enjoy working at the intersection of inference engines, distributed systems, and GPU runtime backends.

Skills

Python, C/C++, vLLM, SGLang, llm-d, PyTorch, TensorFlow, AITER, HIPBLAS-LT, RCCL, ROCm, FP8, FP4, FlashAttention, MLPerf, CI/CD

What you'll do

Architect and optimize distributed LLM inference runtimes for single-node and multi-node deployments.
Design hybrid execution strategies including tensor parallelism, pipeline parallelism, and expert parallelism.
Implement efficient multi-node inference pipelines using RDMA and collective-based execution techniques.
Drive performance improvements in throughput, latency, and memory efficiency across GPU clusters.
Optimize continuous batching and speculative decoding for high-performance LLM serving.
Work with AMD GPU libraries to ensure efficient use of FP8/FP4 GEMM and FlashAttention.
Upstream features and performance fixes into open-source inference frameworks like vLLM and SGLang.

What we're looking for

Extensive experience with vLLM, SGLang, or similar inference stacks.
Proven track record of contributing to upstream open-source projects in distributed inference scaling.
Strong background in integrating optimized GPU performance into machine-learning frameworks like PyTorch and TensorFlow.
Expertise in Python and C/C++, including debugging and performance tuning for large-scale systems.
Experience optimizing large-scale workloads on heterogeneous GPU clusters for efficiency and scalability.
Master’s or PhD in Computer Science, Engineering, or a related field.

Similar roles

Technical Program Manager - SOC in Santa Clara, California | Advanced Micro Devices, Inc

AMD

Santa Clara, CA · 31 days ago · $187,120

JIRA, Confluence, SharePoint, Excel, MS Office, CI/CD, Kubernetes, AWS, Terraform, Python, PostgreSQL, Git, Docker, Prometheus, Grafana


Senior Software Development Engineer - Austin & Nashville

Oracle

Austin, TX +1 · 48 days ago · $79,200–$178,100

Java, Python, Docker, Kubernetes, Terraform, AWS, CI/CD, PostgreSQL, Oracle, REST, JSON, OAuth, OpenAPI, Spring Boot, Git, Jenkins, Ansible, Prometheus, Grafana


Senior Software Engineer, Applied AI, Tools & Infrastructure

Apple Inc

San Diego, CA · 55 days ago · $171,600–$302,200

Python, FastAPI, TypeScript, React, Next.js, AWS EKS, Helm, Terraform, Kubernetes, OpenTelemetry, Prometheus, Grafana, SQLAlchemy, CI/CD, GenAI, LLM, Multi-tenant platforms, Agentic AI systems
