Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $272,000–$431,250 / yr
Posted: 146 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $128k to $464k, marking similar roles at about $213k and this role at about $352k]

This role pays more than 99% of similar roles, most of which pay $183,636–$241,787. At the midpoint, this role pays about $352k versus about $213k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 855 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 843 roles with salary data.

At a glance

TL;DR · Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

As a Principal Software Engineer at NVIDIA Dynamo, you will lead the design and evolution of a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote storage for large-scale LLM inference. You will architect deep integrations with leading LLM serving engines to optimize KV-cache offload and sharing across heterogeneous clusters, co-design interfaces for high-throughput, low-latency inference, and collaborate closely with GPU architecture teams to leverage technologies like GPUDirect and NVLink. Drawing on a strong background in building large-scale distributed systems and experience in C/C++ and Python, you will mentor engineers, set technical direction, and represent the team in internal reviews and external forums. This role requires expertise in memory hierarchies, networked I/O, and RDMA/NVMe-oF technologies, along with the ability to optimize systems across CPU, GPU, memory, and network for high performance and efficiency.

What you'll do

  • Design a unified memory layer spanning multiple tiers for large-scale LLM inference.
  • Implement deep integrations with leading LLM serving engines focusing on KV-cache management.
  • Co-design interfaces enabling disaggregated prefill and peer-to-peer KV-cache sharing across clusters.
  • Exploit GPUDirect, RDMA, and NVLink technologies to ensure low-latency cache access in heterogeneous systems.
  • Mentor engineers and set technical direction for memory and storage subsystems within the team.

What we're looking for

  • 15+ years of experience building large-scale distributed systems and high-performance storage infrastructure.
  • Deep understanding of memory hierarchies including GPU HBM, host DRAM, SSD, and remote/object storage.
  • Hands-on experience with networked I/O technologies such as RDMA, NVMe-oF, and NVLink for low-latency data access.
  • Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network layers.
  • Experience designing unified memory or storage layers spanning multiple tiers for performance and cost efficiency.
  • Prior contributions to open-source projects focused on KV-cache optimization and reuse in LLM serving systems.

More like this

Similar roles

Principal Software Engineer, Data Architecture

Mastercard

O'Fallon, Missouri · 76 days ago · $170,000–$281,000
AWS Azure GCP Databricks Snowflake Delta Lake Apache Spark Kafka Flink NiFi CI/CD Data Mesh GDPR ISO 20022 Agile SAFe Python Java SQL PostgreSQL

Principal Software Engineer, Distributed Systems

Alteryx

Remote (Northern California, USA) · 6 days ago · $215,000–$300,000
Kubernetes Java Python Node.js Kafka Redis API design Docker AWS Azure GCP Terraform CI/CD Prometheus Grafana GitOps Service Mesh Observability SRE DevOps Scalability Security Architecture Review Board
Remote

Principal Software Development Engineer, Solid State Drives

Nvidia

Remote (Santa Clara, CA) · 4 days ago · $248,000–$391,000
C C++ SSD Flash Translation Layer NAND Backend optimization Storage systems architecture Cloud provider setting CI/CD Kubernetes Terraform AWS PostgreSQL Git JIRA Prometheus Grafana
Remote

Principal Software Engineer - Compute Infrastructure

Nvidia

Remote (Santa Clara, CA) · 22 days ago · $248,000–$391,000
Kubernetes OpenShift Terraform Go Python GitOps ArgoCD AWS GCP NFSv4 NVMe/TCP Hyperconverged storage CI/CD Microservices Self-service architecture SLAs
Remote

Senior Principal Software Engineer

F5 Inc

San Jose, CA · 21 days ago · $234,400–$351,600
Kubernetes OAuth RESTful Web services Docker Java Spring Framework Python Node.js Artificial Intelligence Machine Learning
Hybrid