Senior HPC Support Engineer, Compute and GPU Platform

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $108,000–$172,500 / yr
Posted: 3 days ago

Market check

Salary context

Below market

How this pay compares to similar roles

Chart: pay scale from $92k to $258k, with this role at about $140k and similar roles at about $201k; the shaded band marks where most similar roles pay.

This role pays less than 89% of similar roles. Most pay $167,224–$235,750, the shaded band in the chart above. At the midpoint, this role pays about $140k versus about $201k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 942 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 931 roles with salary data.

At a glance

TL;DR · Senior HPC Support Engineer, Compute and GPU Platform

As a Senior HPC Support Engineer specializing in Compute/GPU (DGX Platform) within our Technical Support team, you will be responsible for resolving complex customer issues related to AI hardware and software products. Your day-to-day tasks include debugging and troubleshooting technical problems via various communication channels, ensuring high customer satisfaction through meticulous research and problem-solving. You will collaborate closely with engineering, marketing, and support teams to provide feedback on product requirements and improve support processes. The ideal candidate has over five years of experience in customer support and debugging, proficiency in Linux system administration across multiple distributions like Red Hat Enterprise Linux and Ubuntu, and expertise in AI tools and technologies such as Docker, Kubernetes, and deep learning frameworks. Additionally, a strong background in networking, data centers, and distributed systems is essential for this role, which operates at the intersection of cutting-edge technology and customer service excellence.

What you'll do

  • Resolve complex customer issues on DGX Platform through detailed research and problem-solving.
  • Debug and respond to user-reported technical problems via various communication channels.
  • Develop and document standard methodologies for internal teams based on customer issue analysis.
  • Regularly interact with engineering, marketing, and support teams to provide feedback on product requirements.
  • Apply AI tools efficiently to share debugging results and create knowledge base articles.

What we're looking for

  • 5+ years of experience in customer support and debugging for hardware and software products.
  • Proven use of AI technologies in daily job responsibilities.
  • Deep understanding of Linux system administration on Red Hat Enterprise Linux and Ubuntu distributions.
  • Strong knowledge of at least two of the following areas: data centers, servers, distributed systems, virtualization, deep learning frameworks, or containers/containerization.
  • Proficiency in scripting (Bash/Python), networking technologies including InfiniBand and RDMA/RoCEv2, and GPU technology.
  • Experience with clustering/HPC data-center technologies, upper-layer protocols and libraries such as MPI and NCCL, and Ethernet and distributed file system storage technologies.

More like this

Similar roles

HPC Operations Engineer

Nvidia

Santa Clara, CA 81 days ago $124,000–$195,500
CentOS RHEL Docker Python Bash Ansible NFS LDAP DNS TCP/IP SLURM FlexLM Perl InfiniBand RDMA RoCE Lustre GPFS
Hybrid

Senior HPC Cluster Engineer

Nvidia

Santa Clara, CA +2 13 days ago $152,000–$241,500
Slurm Kubernetes Python Bash Docker Enroot Prometheus Grafana Linux RHEL Ubuntu NVIDIA_GPUs CUDA NCCL MPI InfiniBand RDMA RoCE Lustre GPFS Ansible MLPerf

Senior Solutions Architect, AI Compute

Nvidia

Remote (CA) +4 22 days ago $184,000–$287,500
Linux Bash Python Ansible SLURM Kubernetes InfiniBand MPI Lustre GPFS HPL NCCL MLPerf NVIDIA BCM CI/CD
Remote

Senior AI Compute Engineer

Nvidia

Remote (Santa Clara, CA) 76 days ago $148,000–$235,750
Linux Bash Python Ansible SLURM LSF UGE Kubernetes HPL NCCL MLPerf InfiniBand MPI Lustre GPFS BCM Terraform CI/CD
Remote

Software DevOps Engineer, Networking

Nvidia

Santa Clara, CA 23 days ago $148,000–$235,750
Linux Docker Python C++ Bash GitLab CI/CD Ubuntu RHEL InfiniBand Ethernet gRPC gNMI REST JSON High-Speed Communication GPU Networking Firmware Drivers

Senior HPC Performance Engineer

Nvidia

Remote (OR) +3 72 days ago $184,000–$287,500
Fortran C C++ OpenACC OpenMP MPI CUDA Performance_analysis Parallel_programming Linear_algebra Numerical_methods Assembly_language Debugging Porting
Remote