Senior Software Engineer - Datacenter Systems

Nvidia

Remote, USA · Santa Clara, CA · Posted 10 days ago · $184,000–$287,500 / year

At a glance

AI generated

TL;DR

As a Senior Software Engineer - Datacenter Systems at NVIDIA, you will join the software infrastructure team to design and enhance systems for rack installation, networking configuration, and cluster management in large-scale GPU environments. Your daily tasks include developing scalable release train architectures, defining SLIs/SLOs/SLAs, creating intuitive UIs and APIs, and building CI/CD pipelines. You will also automate software updates, manage dependencies, and resolve operational issues to ensure high availability. The role requires expertise in Python, Rust, C++, Shell scripting, and familiarity with Kubernetes, Jenkins, GitLab, Ansible, and Prometheus. Ideal candidates have a background in managing infrastructure in distributed environments and experience with NVIDIA DGX systems and GPU clusters like GB200 and VR-NVL72.

Skills

Python, Rust, C++, Shell, Kubernetes, Jenkins, GitLab, Ansible, GitOps, Prometheus, Grafana, CI/CD, Linux, Slurm, NVIDIA DGX systems, Docker, Terraform, AWS, Azure, Google Cloud Platform

What you'll do

  • Develop software for hands-off datacenter provisioning and lifecycle management.
  • Build scalable release train architectures to enable independent release cycles.
  • Define and enforce SLIs, SLOs, and SLAs for core infrastructure services.
  • Lead technical requirement definition for new infrastructure features and improvements.
  • Create intuitive UIs and APIs for internal provisioning and management tools.
  • Automate software updates and monitor system health for reliability.

What we're looking for

  • 8+ years of experience managing infrastructure in high-performance or distributed environments.
  • Expertise in Python, Rust, C++, and Shell programming, and in modern CI/CD tools such as Jenkins, GitLab, and Ansible.
  • Strong understanding of Linux, networking, and building scalable distributed systems.
  • Demonstrated experience implementing SRE practices including SLIs, SLOs, and SLAs.
  • Proficiency with observability tools such as Prometheus and Grafana for system health monitoring.
  • Experience crafting user-facing components like front-end or CLI interfaces for infrastructure management tools.

Market check

Salary context

This $184,000–$287,500 range sits above 88% of similar postings on FindRole.

Peer median band

$120,500–$222,480

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$142,400–$215,681

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.


Similar roles

Senior Datacenter Product Development Engineer

Nvidia

Santa Clara, CA, US · 11 days ago · $168,000–$258,750
Python, Shell, PCIe, InfiniBand, Ethernet, I3C, I2C, SPI, USB, HPC, FPGA, CPLD, FW, secure-boot, encrypted images, root-cause analysis, test specifications, HW testing, automation, log parsing, Operations Research, Industrial Engineering, statistics

Senior Software Engineer - Data Infrastructure

Plaid

San Francisco HQ, US · 71 days ago · $190,800–$262,800
Data Warehouses, Data Lakehouses, Apache Spark, Workflow Orchestration, Streaming Infrastructure, Databricks, Airflow, AWS EMR, Python, CI/CD, Kubernetes, Terraform

Senior Software Engineer - Data Infrastructure

Plaid

Seattle Office, US · 71 days ago · $190,800–$262,800
Data Warehouses, Data Lakehouses, Apache Spark, Workflow Orchestration, Streaming Infrastructure, Databricks, Airflow, AWS EMR, Python, CI/CD

Senior Software Engineer, Cloud

Abbott

Remote (United States of America), US · 31 days ago · $86,700–$173,300
Go, SQL Server, Postgres, RESTful APIs, microservices, Kubernetes, Docker, Linux, CI/CD, JIRA, Confluence, Python, AWS, Git, Terraform, Prometheus, Grafana