Lead Software Engineer, Fleet Management - DGX Cloud

Nvidia

Remote

Quick summary

Work type: Remote
Location: Seattle, WA · Santa Clara, CA
Salary: $224,000–$356,500 / yr
Posted: 47 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $167k to $377k, marking similar roles at about $209k and this role at about $290k, with a shaded band where most similar roles pay]

This role pays more than 97% of similar roles, most of which pay $187,390–$230,400. At the midpoint, this role pays about $290k versus about $209k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 985 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 971 roles with salary data.


At a glance

TL;DR · Lead Software Engineer, Fleet Management - DGX Cloud


The Lead Software Engineer on NVIDIA’s DGX Cloud team designs and leads development of scalable cloud services for high-performance GPU infrastructure in datacenters. Day-to-day responsibilities include providing technical leadership, creating RESTful APIs to ingest telemetry data, building and managing large-scale data pipelines, and improving operational efficiency across global cloud operations. The ideal candidate has extensive experience with PostgreSQL-compatible databases, proficiency in Go or Python, familiarity with modern JavaScript frameworks such as React or Angular, and expertise in cloud platforms (AWS, GCP, Azure) and container technologies (Docker, Kubernetes). The role also requires a deep understanding of high-scale distributed systems and strong communication skills for collaborating on complex operational challenges in the fast-growing AI and cloud computing domain.

Skills

AWS, Kubernetes, Docker, PostgreSQL, Go, Python, React, Angular, Next.js, CI/CD, Linux, RESTful APIs, Terraform, Prometheus, Grafana

What you'll do

Act as technical lead for designing cloud services backed by databases and data warehouses.
Design and develop RESTful APIs to ingest telemetry from AI datacenters.
Build scalable cloud services for high-volume ingestion, processing, and storage of large datasets.
Build and manage data pipelines for online and offline data storage.
Optimize the reliability and efficiency of cloud services and operations.
Lead impactful technical projects, ensuring quality and scalability at every stage.

What we're looking for

12+ years of industry experience with a Bachelor’s or Master’s degree in a relevant field.
Expertise in building scalable REST APIs using Go or Python backed by PostgreSQL-compatible data stores.
Proficiency in modern JavaScript frameworks (React, Angular, Next.js) and cloud infrastructure technologies (AWS, GCP, Azure).
Deep knowledge of container technologies like Docker and Kubernetes, and high-scale distributed systems architecture.
Strong leadership experience delivering cloud services at Internet scale, with a focus on reliability and efficiency.
Familiarity with Linux operating systems and hands-on experience operating NVIDIA datacenter GPUs.
