AI Cluster & Data Center Design Engineer

AMD

Quick summary

Work type: On-site
Location: Austin, TX
Posted: 100 days ago
Closes: Mar 19, 2027

Market check

Salary context

How this pay compares to similar roles

[Chart: pay range for similar roles, $121k to $247k; similar roles marked at $189k]

This listing doesn't post a salary. Most similar roles pay $142,875–$235,187.

Based on 240 similar postings.

Employer

About AMD

AMD (Advanced Micro Devices) is a semiconductor company that develops high-performance processors, graphics cards, and adaptive computing solutions for gaming, data centers, and embedded markets.

Industry: Semiconductors

AMD currently has 56 open roles on FindRole.

At a glance

TL;DR · AI Cluster & Data Center Design Engineer

Join the team as a senior systems engineer specializing in the architecture and design of scalable AI/HPC clusters, with a focus on rack and data center power delivery. You will evaluate and select compute, storage, networking, and power components so that clusters deployed worldwide are reliable and perform well. Daily tasks include designing power solutions for high-density deployments, optimizing network topologies, and collaborating with cross-functional teams to deliver efficient infrastructure. Ideal candidates have deep expertise in HPC, AI infrastructure, and data center engineering, and are proficient in GPU/CPU architectures, PCIe, UALink, InfiniBand, Ethernet networking, and AI/ML frameworks. The role demands strong problem-solving skills and the ability to work effectively across diverse technical domains.

What you'll do

  • Design scalable AI/HPC clusters with optimized compute, storage, and networking components.
  • Evaluate and select CPUs, GPUs, accelerators, and interconnects for optimal cluster performance.
  • Define power budgets, redundancy schemes, and fault tolerance mechanisms for high-density deployments.
  • Design network topologies to maximize overall cluster performance and efficiency.
  • Optimize storage solutions to enhance AI/HPC cluster performance using advanced technologies.

What we're looking for

  • Extensive experience in HPC, AI infrastructure, and data center systems engineering.
  • Deep technical knowledge of compute, power delivery, and networking components.
  • Ability to design scalable AI/HPC clusters with optimized performance and reliability.
  • Expertise in evaluating and selecting CPUs, GPUs, accelerators, interconnects, and memory configurations.
  • Strong understanding of rack and data center power delivery solutions and fault tolerance mechanisms.

More like this

Similar roles

Senior AI and ML HPC Cluster Engineer

Nvidia

Remote (Santa Clara, CA) +4 · 65 days ago · $152,000–$241,500
Slurm, Kubernetes, Docker, Ansible, Python, Bash, MPI, NVIDIA GPUs, CUDA, NCCL, PyTorch, TensorFlow, Lustre, InfiniBand, IPoIB, RDMA, CentOS, RHEL, Ubuntu, Puppet, Salt, Singularity, Podman, Shifter, Charliecloud

Careers at Qualcomm

Qualcomm

54 days ago
TensorFlow, PyTorch, PCIe, CXL, UAL, Python, Go, Rack-scale architectures, AI accelerators, CPU/GPU architectures, Disaggregated architectures, DPUs, AI inference at the edge, CI/CD, Kubernetes, Docker, PostgreSQL, Prometheus, Grafana

Workstation AI Architect

HP Inc.

Fort Collins, CO · 32 days ago · $147,050–$230,850
AI, ML, Silicon Architecture, CPU, GPU, NPU, Heterogeneous Compute, Accelerator Architectures, On-Device AI, Project Management, CI/CD, Cloud Services, Databases, Prometheus, Grafana, Python, C++, Java, JavaScript, SQL

Data Center Operations and Design Engineer

Comcast

Northlake, IL · 12 days ago · $68,460–$102,690
Data center design, Troubleshooting, Power systems, Routers, Switches, Servers, PDU, ATS, UPS, High-speed data networking, Voice IP networking, DCIM tools, Ticketing systems, MOPs development, Vendor management, Technical documentation, Microsoft Office