Senior Software Engineer, NCCL and CUDA - CSP Engagements

Nvidia

Remote, US · Santa Clara, CA · Austin, TX · Redmond, WA · Seattle, WA · Posted 10 days ago · $184,000–$287,500 / year

At a glance

AI generated

TL;DR

NVIDIA is hiring a Senior Software Engineer with expertise in NCCL and CUDA to join its Cloud Service Provider Engagements team, focusing on enhancing ML software stack functionality and performance for datacenter products like GB300 and Vera Rubin. This role involves collaborating closely with CSPs to diagnose and resolve functional and performance issues within the libraries layer, optimizing multi-GPU workloads through profiling and benchmarking, and addressing complex GPU computation challenges. The ideal candidate will have extensive experience in parallel programming models, communication libraries such as MPI, NCCL, and NVSHMEM, and proficiency in CUDA development with strong C/C++ skills. Additionally, knowledge of high-performance networking, cloud deployment, and containerization tools is essential, along with a deep understanding of data-center system architecture and operating systems.

Skills

CUDA · C/C++ · MPI · NCCL · NVSHMEM · Nsight · nvprof · PCIe · NVLink · InfiniBand · RoCE · Docker · Kubernetes · SLURM · Ansible · HPC · Python · Linux

What you'll do

  • Engage with CSPs to diagnose functional and performance issues in NCCL and CUDA libraries.
  • Analyze and enhance multi-GPU workload performance through profiling and benchmarking.
  • Resolve data movement challenges using NCCL and NVSHMEM in multi-node clusters.
  • Address CUDA porting problems for customer workloads to ensure compatibility.
  • Optimize datacenter scheduling and topologies for better GPU performance.
  • Debug complex issues related to GPU computation, memory, and communication protocols.

What we're looking for

  • Extensive experience with parallel programming models and communication libraries (MPI, NCCL, NVSHMEM).
  • Proficiency in CUDA development and performance optimization using tools like Nsight and nvprof.
  • Deep understanding of data-center system architecture, including PCIe, NVLink, InfiniBand, and RoCE.
  • Strong C/C++ programming skills with experience debugging complex GPU issues.
  • 8+ years of system software validation experience in a relevant field.
  • Ability to collaborate effectively with internal teams and customer support roles (AE, FAE).
  • Familiarity with cloud deployment tools such as Docker, Kubernetes, SLURM, and Ansible.

Market check

Salary context

This $184,000–$287,500 range sits above 84% of similar postings on FindRole.

Peer median band

$162,200–$242,600

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$181,102–$235,750

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior System Software Engineer - CUDA Chips

Nvidia

Santa Clara, CA, US · 64 days ago · $152,000–$241,500
C · CUDA · Linux · Windows · macOS · C++ · Python · Git · CI/CD · NVIDIA · Pre-Silicon · Simulation · Emulation · Kernel Programming · Operating Systems · Virtual Memory · Threads · Process Control · Large Codebases · Documentation

Senior Software Engineer, CUDA Deep Learning Systems

Nvidia

Remote (Santa Clara, CA, US) · 15 days ago · $184,000–$287,500
CUDA · Python · C++ · PyTorch · JAX · TensorRT · vLLM · Nemo · Megatron · MaxText · Triton · XLA · NCCL · MPI · UCX · Docker · CI/CD · Git · GitHub · Linux · PostgreSQL · Prometheus · Grafana