Senior HPC and AI Networking Performance Research and Analysis Engineer

Nvidia

Actively hiring

Santa Clara, CA, US Posted 52 days ago $152,000–$241,500 / year

At a glance

AI generated

TL;DR

As a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis Engineer at NVIDIA, you will join the Performance group to profile and analyze AI workloads on large-scale GPU and CPU clusters for distributed deep learning training of LLMs, with a focus on collective communication and networking. Day-to-day responsibilities include benchmarking, profiling, and analyzing performance bottlenecks across hardware such as HCAs, switches, CPUs, GPUs, and full systems, and developing performance analysis tools to optimize network communication with the NVIDIA Collective Communications Library (NCCL). You will collaborate with cross-functional teams, from hardware to software, to provide critical insights for new technologies and solutions. Ideal candidates have a B.Sc. in Computer Science or equivalent experience; 5+ years of high-performance networking experience; proficiency in Python, Bash, C, Linux, CUDA, TensorFlow, PyTorch, and NCCL; strong analytical skills; and knowledge of congestion control algorithms and system architecture.

Skills

Python, C, Bash, CUDA, NCCL, RDMA, MPI, TensorFlow, PyTorch, RoCE, Linux, Intel CPUs, AMD CPUs, ARM CPUs, NVIDIA GPUs, HCA, PCI, Performance Analysis, CI/CD

What you'll do

Profile and analyze AI workloads on large GPU and CPU clusters for distributed deep learning training.
Develop performance analysis tools to identify bottlenecks in high-performance networking.
Implement methodologies to understand performance limitations and optimize collective communications.
Define performance test plans and set expectations for new technologies and solutions.
Collaborate with hardware and software teams to provide insights into performance analysis.

What we're looking for

5+ years of experience with high-performance networking technologies (RDMA, MPI, NCCL)
Proficient in performance analysis methodologies and tool development
Expertise in NVIDIA GPUs, the CUDA library, and deep learning frameworks such as TensorFlow or PyTorch
Strong knowledge of AI workloads and benchmarking for distributed LLM training
Programming skills in Python, Bash, and C
Experience with Linux operating systems and various hardware platforms (Intel/AMD/ARM CPUs, GPUs)

Market check

Salary context

This $152,000–$241,500 range sits above 46% of similar postings on FindRole.

Peer median band

$155,420–$257,550

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$173,950–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

Similar roles

Senior HPC Solutions Architect

Nvidia

Remote (Santa Clara, CA, US) 48 days ago $184,000–$287,500

Python C++ CUDA SLURM Linux BMC PCIe Network_Adapters InfiniBand DPU RoCE ARM Linux_Kernel Drivers SDN C

Senior AI Compute Engineer - NVIS

Nvidia

Remote (Santa Clara, CA, US) 45 days ago $148,000–$235,750

Linux Bash Python Ansible SLURM LSF UGE Kubernetes HPL NCCL MLPerf InfiniBand MPI Lustre GPFS BCM Terraform CI/CD

Senior Software Engineer, AI Networking

Nvidia

Santa Clara, CA, US 14 days ago $152,000–$241,500

Python PyTorch TensorFlow JAX CUDA NCCL Reinforcement_Learning Bayesian_Optimization GNNs Docker Kubernetes CI/CD Prometheus Grafana Bash C++ PostgreSQL Redis

Senior Data Center Performance Engineer - Benchmarking and Optimization

Nvidia

Remote (Santa Clara, CA, US) 8 days ago $184,000–$287,500

Python CUDA C++ Linux_perf Nsight_Systems PyTorch TensorFlow JAX MPI NCCL Docker Kubernetes SLURM NVIDIA_NVGX NVIDIA_HGX InfiniBand RoCE NVLink

Senior HPC Performance Engineer - AI for Science at Scale

Nvidia

Santa Clara, CA, US 100 days ago $184,000–$287,500

CUDA, Python, C++, PyTorch, JAX, Warp, HPC, Distributed Learning, Atomistic Modeling, CI/CD, Git, Linux, NVIDIA DGX Systems, GPU Programming, Parallel Computing, Data Structures, Algorithm Design, Machine Learning Frameworks, Scientific AI Codebases, Computational Chemistry, Digital Biology

Senior AI and HPC Observability Engineer

Nvidia

Santa Clara, CA, US 87 days ago $152,000–$241,500

Python Go Java Kubernetes OpenTelemetry Prometheus Kafka Spark Flink PromQL Docker CI/CD Git Linux AWS GCP Azure