Principal System Software Engineer - Data Center MODS

Nvidia

Remote · Actively hiring
Remote, US · Santa Clara, CA · Posted 84 days ago · $272,000–$431,250 / year

At a glance

AI generated

TL;DR

The Data Center MODS organization is seeking a Principal Engineer to architect and scale next-generation diagnostic systems for Cloud Service Providers (CSPs), focusing on AI accelerator products. This role involves defining the technical roadmap, leading multi-functional development teams, and orchestrating large-scale stress testing across various hardware components. The Principal Engineer will also mentor engineering teams, drive root-cause analysis of systemic failures, and partner with CSPs to address scalability challenges. Essential skills include a deep understanding of distributed systems, proficiency in C++ or Python, expertise in x86/ARM architectures, Linux OS internals, firmware protocols, and software testing methodologies with an automation-first approach. This role demands extensive experience in technical leadership and setting strategic direction for complex projects within the high-scale data center infrastructure domain.

Skills

Python, C++, Linux, Redfish, BMC, UEFI, BIOS, HMC, Distributed Systems, Firmware, Automation, AI, CI/CD

What you'll do

  • Define the technical roadmap for NVIDIA’s Data Center diagnostic systems.
  • Lead large-scale stress testing across CPUs, GPUs, networking, and memory.
  • Conduct root-cause analysis of systemic failures in hardware and software domains.
  • Mentor engineering teams to foster innovation and excellence.
  • Partner with CSPs to diagnose and address scalability challenges.

What we're looking for

  • Bachelor's degree in Computer Science/Engineering or equivalent experience.
  • Over 15 years of system software experience with C++ or Python.
  • Expertise in x86/ARM architectures and Linux OS internals.
  • Proven technical leadership in project teams and setting direction.
  • Deep knowledge of firmware, Redfish, HMC, BMC protocols, and platform security.
  • Experience with software testing methodologies, using an automation- and AI-first approach.

Market check

Salary context

This $272,000–$431,250 range sits above 100% of similar postings on FindRole.

Peer median band

$143,600–$234,100

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$148,500–$227,862

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Principal Engineer, Data Center Power Software

Nvidia

Remote (Redmond, WA, US) · 15 days ago · $272,000–$431,250
Go, Kubernetes, Rust, Python, Docker, Prometheus, Grafana, React, Node.js, Helm, Slurm, CUDA, Terraform, CI/CD, PostgreSQL, AWS, Azure

Principal Firmware Engineer - Data Center Server Management

Nvidia

Remote (Santa Clara, CA, US) · 16 days ago · $272,000–$431,250
C, C++, Python, Git, Jira, x86, ARM, BMC, SCM, Data center health management, Firmware architecture, Telemetry, Server manageability, Cluster bring-up, Data center management

Senior Software Engineer - Datacenter Systems

Nvidia

Remote (Santa Clara, CA, US) · 9 days ago · $184,000–$287,500
Python, Rust, C++, Shell, Kubernetes, Jenkins, GitLab, Ansible, GitOps, Prometheus, Grafana, CI/CD, Linux, Slurm, NVIDIA DGX systems, Docker, Terraform, AWS, Azure, Google Cloud Platform

Data Center Operating Engineer

JLL (Jones Lang LaSalle)

Remote (USA-Client, Totowa, NJ, US) · 30 days ago · $100,380
Universal CFC certification, HVAC, electronics, building automation systems, UPS systems, preventative maintenance programs, SOP-driven operations, volt meters, drain augers, plumbing tools, safety goggles, ear protection, fire extinguishers, algebra, geometry, load balancing, practical problem solving

Data Center Engineer, Senior

Qualcomm

Atlanta, GA, US · 161 days ago
Commscope iMvision, Linux, Windows, OSX, Microsoft Office Suite, Outlook, Word, Excel, iLO, Terraform, Ansible, Puppet, Chef, Jira, Confluence, Prometheus, Grafana, Python, Bash, Cisco, Ruckus, VMware, Docker, Kubernetes, AWS, Azure, Google Cloud Platform, PostgreSQL, MySQL, MongoDB, Git, GitHub, Bitbucket, CI/CD