Principal Software Engineer, GPU Firmware and GPU System Software

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA; Austin, TX; OR
Salary: $272,000–$431,250 / yr
Posted: 3 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $133k to $463k. Similar roles: about $210k. This role: about $352k.]

This role pays more than 99% of similar roles. Most similar roles pay $183,987–$235,750. At the midpoint, this role pays about $352k versus about $210k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 966 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 955 roles with salary data.

At a glance

TL;DR · Principal Software Engineer, GPU Firmware and GPU System Software

As a Principal Software Engineer on the CSP Engagements team, you will serve as the technical liaison for GPU firmware and system software, working closely with the engineering teams of key cloud service providers (CSPs) to ensure reliable firmware management and updates across large-scale deployments. Your responsibilities include driving work streams to build a shared understanding of GPU firmware architecture, incorporating customer feedback into NVIDIA’s feature roadmap, and ensuring robust automation and recovery procedures are in place before each release. You will also identify cross-CSP patterns in operational challenges to drive systemic improvements. The ideal candidate has over 15 years of experience and deep knowledge of GPU architecture internals, multi-GPU fabric architectures, firmware update lifecycle management at scale, and GPU security features such as secure boot and multi-tenancy isolation. This role requires expertise in NVIDIA VBIOS, microcontroller firmware, and InfoROM, plus familiarity with fleet-scale GPU management challenges.

What you'll do

  • Drive technical work streams with CSP engineering teams to build a shared understanding of GPU firmware architecture.
  • Gather and synthesize customer feedback on GPU firmware/software requirements for NVIDIA's roadmap.
  • Manage GPU firmware update orchestration for large-scale deployments, including rollback strategies.
  • Serve as the primary technical contact between NVIDIA and CSPs for GPU behavior documentation.
  • Identify cross-CSP patterns in GPU SW/FW issues to drive systemic improvements.

What we're looking for

  • 15+ years of experience in GPU system software or firmware engineering.
  • Deep understanding of GPU architecture internals and firmware/driver interactions.
  • Experience managing firmware updates across large-scale deployments (multi-device, A/B updates).
  • Expertise in GPU error handling, recovery flows, and health monitoring telemetry.
  • Direct experience with NVIDIA GPU VBIOS, microcontroller firmware, and driver stack.
  • Background in fleet management of 10K+ GPUs for rollout and remediation processes.
  • Understanding of GPU security features including secure boot, code signing, and attestation.

More like this

Similar roles

Senior Software Engineer, NCCL and CUDA

Nvidia

Remote (Santa Clara, CA) +3 · 10 days ago · $184,000–$287,500
CUDA, C/C++, MPI, NCCL, NVSHMEM, Nsight, nvprof, PCIe, NVLink, InfiniBand, RoCE, Docker, Kubernetes, SLURM, Ansible, HPC, Python, Deep Learning, CI/CD

Senior Firmware Engineer, GPU

Nvidia

Remote (SC) · 38 days ago · $152,000–$241,500
C, SPI, I2C, I3C, PCIe, SMBus, MCTP, PLDM, RISC-V, Ada, SPARK, Git, JIRA, Confluence, CI/CD, Python, PostgreSQL

Firmware Infrastructure Engineer, GPU

Nvidia

Santa Clara, CA · 66 days ago · $152,000–$241,500
Python, C, CI/CD, SQL, PostgreSQL, AWS, Kubernetes, Docker, Git, GitHub, Jenkins, Terraform, Prometheus, Grafana, Linux, Windows, BIOS, Firmware, Threat Modeling, Static Analysis, Dynamic Analysis

Principal Software Engineer, At-Scale Reliability and Fleet Intelligence

Nvidia

Santa Clara, CA · 3 days ago · $272,000–$431,250
Pareto, Weibull, time-series databases, anomaly detection, health scoring, event correlation, NVIDIA GPU error taxonomy, Xid errors, NVLink error counters, thermal events, CPER records, predictive failure models, fleet reliability, MTBF, MTBI, burn-in testing, stress testing, certification frameworks, hardware health telemetry pipelines

System Software Engineer, GPU Development Tools

Nvidia

Santa Clara, CA +1 · 84 days ago · $124,000–$195,500
C++, Python, CUDA, OpenGL, Vulkan, Docker, Kubernetes, CI/CD, Git, Linux, Virtual Machines, Containers, Prometheus, Grafana, PostgreSQL, AWS, Azure, Google Cloud Platform