Senior HPC Solutions Architect

Nvidia

Actively hiring
Remote (Santa Clara, CA, US) Posted 48 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA’s Solution Architects team as a senior networking professional supporting AI factory deployments for strategic customers. You will assist in deploying, debugging, and optimizing AI workloads on large-scale NVIDIA platforms, identifying hardware issues, and collaborating with internal teams to resolve large-scale network challenges. Your responsibilities include benchmarking new features, analyzing performance data, and guiding customers in scaling workloads efficiently on the latest GPUs. This role requires strong programming skills in C++, Python, or similar languages, along with experience in CUDA, high-speed interconnects like InfiniBand, and Linux kernel drivers. You must have a deep understanding of CPU/GPU architectures, system-level server/rack architecture, and scheduling mechanisms such as SLURM, alongside excellent communication skills to liaise effectively with customers and partners.

Skills

Python C++ CUDA SLURM Linux BMC PCIe Network_Adapters InfiniBand DPU RoCE ARM Linux_Kernel Drivers SDN C

What you'll do

  • Assist in deploying and debugging AI workloads on NVIDIA platforms.
  • Identify and resolve hardware issues for customers and keep them informed.
  • Benchmark new framework features and share performance insights with stakeholders.
  • Solve cluster performance and stability issues directly with external clients.
  • Guide customers in scaling workloads efficiently on the latest NVIDIA GPUs.

What we're looking for

  • 10+ years of experience in designing, managing, and supporting large-scale hybrid networks.
  • Strong programming skills in C/C++ or Python.
  • Proven ability to identify and resolve bottlenecks in large-scale training workloads.
  • Deep understanding of CPU/GPU architectures, CUDA, parallel filesystems, and high-speed interconnects.
  • Experience working with compute clusters and their internal scheduling mechanisms like SLURM.
  • System-level knowledge of server/rack architecture, BMC, PCIe devices, network adapters, Linux OS, and kernel drivers.
  • Excellent communication skills for liaising with customers, partners, and internal teams.

Market check

Salary context

This $184,000–$287,500 range sits above 82% of similar postings on FindRole.

Peer median band

$161,650–$251,150

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$167,950–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior HPC Storage Architect & Engineer

Lam Research

Fremont, CA, US 135 days ago $114,000–$253,000
Lustre GPFS/Spectrum Scale VAST Data WEKA NetApp ONTAP FlexCache AWS Azure GCP InfiniBand RoCE NVMe-over-Fabrics SLURM xCAT Warewulf Ansible Terraform Python YAML Kubernetes CSI S3 IaC CI/CD

Senior HPC Performance Engineer

Nvidia

Remote (OR, US) 41 days ago $184,000–$287,500
Fortran C C++ OpenACC OpenMP MPI CUDA Performance_analysis Parallel_programming Linear_algebra Numerical_methods Assembly_language Debugging Porting

Senior HPC Storage Engineer

Nvidia

Santa Clara, CA, US 67 days ago $184,000–$287,500
Python Docker Ceph Weka.io Vast Lustre GPFS CUDA NCCL PyTorch TensorFlow Bash CentOS RHEL Ubuntu SDN MLPerf NVIDIA GPUs HDDs SSDs NVMe

Senior CPU Performance Architect

Nvidia

Santa Clara, CA, US 50 days ago $224,000–$356,500
Python C++ ARM PyTorch NVIDIA GPU HPC AI DL CI/CD Linux Performance_Benchmarking CPU_Microarchitecture System_Architecture Simulator Multi_Core_Systems Interconnect_Architecture Performance_Optimization Benchmarking ISA

Senior Accelerated Computing Architect

Nvidia

Santa Clara, CA, US 21 days ago $184,000–$287,500
CUDA C++ C MPI OpenSHMEM Python Linux GPU CPU Benchmarking Profiling IPC_APIs OpenCL NVSHMEM

HPC Systems Administration Specialist

Argonne National Laboratory

Lemont, IL, US 120 days ago $69,750–$108,810
Linux Spack Lmod Singularity Version control systems Compilers GCC Intel LLVM Make CMake Autotools Python CI pipelines YAML Podman MPI CUDA BLAS FFTW