Senior Solutions Architect - Cluster Design and Architecture

Nvidia

Actively hiring
Santa Clara, CA, US Posted 125 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

As a Senior Solutions Architect at NVIDIA, you will join the Cluster Design and Architecture team, focusing on GPU, NVLink, and infrastructure design to help create next-generation GPU-based clusters for advanced AI supercomputers and enterprise infrastructure. You will partner with internal engineering teams and field staff to guide customers through complex cluster designs, ensuring optimal performance and supportability. You will also work hands-on to resolve issues during new product deployments, provide feedback on design principles, and create customer-facing documentation. Requirements include a degree in Computer Science or a related field, 8+ years of experience with GPU and HPC clusters, expertise in large-scale distributed systems, and hands-on experience with NVIDIA products such as GPUs and NVLink. Knowledge of NCCL, MPI, IMEX, NMX, and collectives is also required to support AI and HPC solutions at scale.

Skills

GPU, NVLink, NVIDIA, Networking, NCCL, MPI, IMEX, NMX, Distributed training, Cluster design, HPC infrastructure, AI clusters, CI/CD, Debugging, Performance modeling, Terraform, AWS, Kubernetes, PostgreSQL

What you'll do

  • Partner with engineering teams on GPU cluster design and networking, conveying technical information to field teams and customers.
  • Guide field teams and customers in designing high-performance GPU clusters, considering complex situational limitations.
  • Assist in first deployments of new products by resolving issues related to cluster configuration and performance.
  • Provide feedback from customer interactions to internal engineering teams for improving product designs and documentation.
  • Support new product introduction (NPI) customer deployments involving new GPU and networking architectures.
  • Translate sophisticated technical concepts into accessible documentation and reference materials for customers.

What we're looking for

  • 8+ years of experience in cluster design, validation, and issue resolution on GPU and HPC clusters
  • Proven expertise in designing large-scale distributed systems, AI clusters, or HPC infrastructure
  • Ability to translate engineering concepts into customer-ready documentation and reference material
  • Expertise in driving customer/partner issues to closure with product and engineering teams
  • Hands-on experience debugging deployment issues with NVIDIA GPUs, NVLink, and networking products
  • Knowledge of NCCL, MPI, IMEX, NMX, and collectives for distributed training in cluster designs
  • BS, MS, or PhD in Computer Science, Electrical Engineering, Physics, or related field

Market check

Salary context

This $184,000–$287,500 range sits above 85% of similar postings on FindRole.

Peer median band

$153,500–$250,800

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$162,000–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Solutions Architect - Cluster Design and Architecture

Nvidia

Santa Clara, CA, US 128 days ago $184,000–$287,500
GPU NVLink NVIDIA Networking NCCL MPI IMEX NMX Distributed training Cluster design HPC infrastructure AI clusters Debugging Customer documentation Multi-functional communications

Senior Solutions Architect - Data Center Infrastructure

Nvidia

Remote (Santa Clara, CA, US) 17 days ago $184,000–$287,500
NVIDIA GPU Networking Deep Learning AI Infrastructure Hyperscaler Cloud Service Provider CSP Linux NCCL DCGM UFM APIs Embedded Linux Systems Hardware Demos System Designs Technical Training Sales Training Customer Support Problem Solving Data Analysis Logs Analysis

Senior Solutions Architect - Data Center Infrastructure

Nvidia

Remote (Santa Clara, CA, US) 17 days ago $152,000–$241,500
NVIDIA GPU Networking Deep Learning Inference System Design Cloud Services Hyperscaler CSP OEM AI Market Technical Support Hardware Demos Software Libraries NCCL DCGM UFM Embedded Linux Systems APIs

Senior Solution Architect

Sony Group Corporation

Na / Culver City Corporate Pointe 40, US 27 days ago $158,808–$160,000
Snowflake Python Airflow GitHub CI/CD Kubernetes Terraform AWS Azure Google Cloud Platform Docker PostgreSQL Redis MongoDB GitLab Jenkins Ansible Prometheus Grafana