Senior HPC Cluster Engineer
Nvidia
At a glance
AI generated
As a Senior Infrastructure Engineer on the Hardware Infrastructure EDA Compute team, you will manage and optimize large-scale job scheduling systems like LSF and Slurm to enhance design velocity and infrastructure efficiency. Your daily tasks include analyzing performance data to identify bottlenecks, leading problem-solving efforts across multiple layers, and implementing automation to reduce manual effort. You will also define metrics for service reliability, contribute to operational standards, and partner with customer teams to clarify requirements and drive issue resolution. The ideal candidate has at least five years of experience in large-scale Linux-based compute infrastructure management, proficiency in systems administration, and expertise in job scheduling system tuning and troubleshooting. Additionally, familiarity with container technologies and observability systems is beneficial, as you will work within a high-performance computing environment to address evolving compute demands.
Market check
How this pay compares to similar roles
This role pays more than 61% of similar roles. Most similar roles pay $149,807–$223,700. At the midpoint, this role pays about $197k versus about $187k for comparable roles.
Based on 240 similar postings.
Employer
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing
Nvidia currently has 824 open roles on FindRole.
Listed pay typically runs $184,000–$287,500 across 812 roles with salary data.