Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $272,000–$431,250 / yr
Posted: 146 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $128k to $464k, marking the similar-role midpoint ($213k), this role's midpoint ($352k), and a shaded band where most similar roles pay.]

This role pays more than 99% of similar roles. Most pay $183,636–$241,787 — the shaded band above. At the midpoint, this role pays about $352k versus about $213k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 855 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 843 roles with salary data.

At a glance

TL;DR · Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

As a Principal Systems Engineer at NVIDIA Dynamo, you will lead the design and evolution of a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote storage for large-scale LLM inference. You will architect deep integrations with leading LLM serving engines to optimize KV-cache offload and sharing across heterogeneous clusters, co-design interfaces for high-throughput, low-latency inference, and collaborate closely with GPU architecture teams to leverage technologies like GPUDirect and NVLink. With a strong background in building large-scale distributed systems and experience in C/C++ and Python, you will mentor engineers, set technical direction, and represent the team in internal reviews and external forums. This role requires expertise in memory hierarchies, networked I/O, RDMA/NVMe-oF technologies, and optimizing systems across CPU, GPU, memory, and network for high performance and efficiency.

Skills

Rust, Python, C/C++, Docker, Kubernetes, Terraform, AWS, CI/CD, PostgreSQL, Prometheus, Grafana, Git, GitHub, LLM serving engines, vLLM, SGLang, TensorRT-LLM, GPUDirect, RDMA, NVLink, NVIDIA Networking, Disaggregated deployments, AI clusters, Networked I/O, RDMA/NVMe-oF, Unified memory layers, KV-cache optimization, Compression, Streaming, Reuse

What you'll do

Design a unified memory layer spanning multiple tiers for large-scale LLM inference.
Implement deep integrations with leading LLM serving engines focusing on KV-cache management.
Co-design interfaces enabling disaggregated prefill and peer-to-peer KV-cache sharing across clusters.
Exploit GPUDirect, RDMA, and NVLink technologies to ensure low-latency cache access in heterogeneous systems.
Mentor engineers and set technical direction for memory and storage subsystems within the team.

What we're looking for

15+ years of experience building large-scale distributed systems and high-performance storage infrastructure.
Deep understanding of memory hierarchies including GPU HBM, host DRAM, SSD, and remote/object storage.
Hands-on experience with networked I/O technologies such as RDMA, NVMe-oF, and NVLink for low-latency data access.
Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network layers.
Experience designing unified memory or storage layers spanning multiple tiers for performance and cost efficiency.
Prior contributions to open-source projects focused on KV-cache optimization and reuse in LLM serving systems.

Similar roles

Principal Software Engineer, Data Architecture

Mastercard

O'Fallon, Missouri · 76 days ago · $170,000–$281,000

AWS, Azure, GCP, Databricks, Snowflake, Delta Lake, Apache Spark, Kafka, Flink, NiFi, CI/CD, Data Mesh, GDPR, ISO 20022, Agile, SAFe, Python, Java, SQL, PostgreSQL

Principal Software Engineer, Distributed Systems

Alteryx

Remote (Northern California, USA - Remote, US) · 6 days ago · $215,000–$300,000

Kubernetes, Java, Python, Node.js, Kafka, Redis, API design, Docker, AWS, Azure, GCP, Terraform, CI/CD, Prometheus, Grafana, GitOps, Service Mesh, Observability, SRE, DevOps, Scalability, Security, Architecture Review Board

Remote

Principal Software Development Engineer, Solid State Drives

Nvidia

Remote (Santa Clara, CA) · 4 days ago · $248,000–$391,000

C, C++, SSD, Flash Translation Layer, NAND, Backend optimization, Storage systems architecture, Cloud provider setting, CI/CD, Kubernetes, Terraform, AWS, PostgreSQL, Git, JIRA, Prometheus, Grafana

Remote

Sr. Staff / Principal Software Engineer – Linux Kernel & ARM Server Platforms

Qualcomm

Santa Clara, CA · 14 days ago · $211,800–$317,800

Linux, C, ARM, ACPI, UEFI, SystemReady, SBSA, PSCI, PCIe, Python, Go, Docker, CI/CD, Kubernetes, Terraform, AWS, PostgreSQL

Principal Software Engineer - Compute Infrastructure

Nvidia

Remote (Santa Clara, CA) · 22 days ago · $248,000–$391,000

Kubernetes, OpenShift, Terraform, Go, Python, GitOps, ArgoCD, AWS, GCP, NFSv4, NVMe/TCP, Hyperconverged storage, CI/CD, Microservices, Self-service architecture, SLAs

Remote

Senior Principal Software Engineer

F5 Inc

San Jose, CA · 21 days ago · $234,400–$351,600

Kubernetes, OAuth, RESTful Web services, Docker, Java, Spring Framework, Python, Node.js, Artificial Intelligence, Machine Learning

Hybrid
