AI Cluster & Data Center Design Engineer

AMD

Quick summary

Work type: On-site
Location: Austin, TX
Posted: 100 days ago
Closes: Mar 19, 2027
Nearby: 99+ roles within 25 mi

Market check

Salary context

How this pay compares to similar roles

Chart: similar roles marked at $189k, on a scale from $121k to $247k.

This listing doesn't post a salary. Most similar roles pay $142,875–$235,187.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets. Industry: Semiconductors

AMD currently has 56 open roles on FindRole.

At a glance

TL;DR · AI Cluster & Data Center Design Engineer

Join our team as a senior systems engineer specializing in the architecture and design of scalable AI/HPC clusters, with a focus on rack and data center power delivery. You will evaluate and select compute, storage, networking, and power components to ensure reliability and performance across global deployments. Daily tasks include designing power solutions for high-density deployments, optimizing network topologies, and collaborating with cross-functional teams to deliver efficient infrastructure. Ideal candidates have deep expertise in HPC, AI infrastructure, and data center engineering, and are proficient in GPU/CPU architectures, PCIe, UALink, InfiniBand, Ethernet networking, and AI/ML frameworks. The role demands strong problem-solving skills and the ability to work effectively across diverse technical domains.

Skills

HPC, AI, Data center engineering, CPUs, GPUs, Accelerators, PCIe, InfiniBand, Ethernet, Lustre, Ceph, CI/CD

What you'll do

Design scalable AI/HPC clusters with optimized compute, storage, and networking components.
Evaluate and select CPUs, GPUs, accelerators, and interconnects for optimal cluster performance.
Define power budgets, redundancy schemes, and fault tolerance mechanisms for high-density deployments.
Design network topologies to maximize overall cluster performance and efficiency.
Optimize storage solutions to enhance AI/HPC cluster performance using advanced technologies.

What we're looking for

Extensive experience in HPC, AI infrastructure, and data center systems engineering.
Deep technical knowledge of compute, power delivery, and networking components.
Ability to design scalable AI/HPC clusters with optimized performance and reliability.
Expertise in evaluating and selecting CPUs, GPUs, accelerators, interconnects, and memory configurations.
Strong understanding of rack and data center power delivery solutions and fault tolerance mechanisms.

Similar roles

Senior AI and ML HPC Cluster Engineer

Nvidia

Remote (Santa Clara, CA) +4 · 65 days ago · $152,000–$241,500

Slurm Kubernetes Docker Ansible Python Bash MPI NVIDIA GPUs CUDA NCCL PyTorch TensorFlow Lustre InfiniBand IPoIB RDMA CentOS RHEL Ubuntu Puppet Salt Singularity Podman Shifter Charliecloud

Senior Solutions Architect, AI Cluster Performance and Telemetry

Nvidia

Santa Clara, CA +1 · 24 days ago · $184,000–$287,500

Perf eBPF Prometheus Grafana Docker Kubernetes SLURM Ansible NCCL NVIDIA Nsight Python C++ CUDA TensorFlow PyTorch CI/CD

Careers at Qualcomm

Qualcomm

54 days ago

TensorFlow PyTorch PCIe CXL UAL Python Go Rack-scale architectures AI accelerators CPU/GPU architectures Disaggregated architectures DPUs AI inference at the edge CI/CD Kubernetes Docker PostgreSQL Prometheus Grafana

Workstation AI Architect

HP Inc.

Fort Collins, CO · 32 days ago · $147,050–$230,850

AI ML Silicon Architecture CPU GPU NPU Heterogeneous Compute Accelerator Architectures On-Device AI Project Management CI/CD Cloud Services Databases Prometheus Grafana Python C++ Java JavaScript SQL

Datacenter AI Systems and Solutions Engineer, Senior Staff

Qualcomm

San Diego, CA · 18 days ago · $162,600–$244,000

Python Docker Kubernetes MLOps GitOps CI/CD Prometheus Grafana PostgreSQL Redis Slurm Apache Kafka OpenAPI Swagger Terraform Ansible Jenkins GitHub GitLab Bitbucket Travis CI CircleCI

Data Center Operations and Design Engineer

Comcast

Northlake, IL · 12 days ago · $68,460–$102,690

Data center design, Troubleshooting, Power systems, Routers, Switches, Servers, PDU, ATS, UPS, High-speed data networking, Voice IP networking, DCIM tools, Ticketing systems, MOPs development, Vendor management, Technical documentation, Microsoft Office
