Senior Storage Production Engineer - DGX Cloud

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $176,000–$276,000 / yr
Posted: 2 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $137k to $291k, with a shaded band marking where most similar roles pay; markers at $194k (similar roles) and $226k (this role)]

This role pays more than 69% of similar roles. Most pay $151,937–$235,750 — the shaded band above. At the midpoint, this role pays about $226k versus about $194k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 980 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 966 roles with salary data.

At a glance

TL;DR · Senior Storage Production Engineer - DGX Cloud

NVIDIA seeks a senior Storage Production Engineer to join its team, responsible for ensuring the reliability and efficiency of large-scale GPU cloud services. This role involves designing and supporting scalable storage clusters, developing monitoring systems, optimizing data access patterns for AI/ML workloads, and implementing automated fault detection mechanisms. The ideal candidate will have extensive experience with distributed storage solutions, proficiency in languages like C/C++, Python, and Bash, and expertise in tools such as Ansible, Terraform, Prometheus, and Grafana. Knowledge of Kubernetes, OpenStack, and hybrid cloud architectures is beneficial, along with a strong background in capacity planning, performance tuning, and troubleshooting high-throughput systems.

What you'll do

  • Design and implement large-scale storage clusters ensuring scalability and high availability.
  • Develop and maintain monitoring systems to proactively detect and resolve performance issues.
  • Optimize storage architectures for low-latency access in AI/ML workloads.
  • Improve the lifecycle of storage services, from inception through continuous optimization.
  • Maintain production infrastructure by supervising system health and leveraging predictive analytics.

What we're looking for

  • BS degree in Computer Science, or equivalent, with 8+ years of practical experience
  • Expertise in distributed and high-performance storage solutions and networking protocols
  • Proficiency in automation tools like Ansible and Terraform, and in observability tools
  • Hands-on experience with C/C++, Python, and Go for storage automation and performance tuning
  • Deep understanding of block, file, and object storage technologies and their characteristics
  • Experience in capacity planning, performance tuning, and troubleshooting large-scale systems
  • Knowledge of Kubernetes, OpenStack, and hybrid cloud architectures for automated storage solutions

More like this

Similar roles

Senior Production Engineer - DGX Cloud

Nvidia

Remote (CA) +4 · 17 days ago · $168,000–$270,250
Kubernetes Python Go Docker CI/CD Prometheus Grafana Terraform AWS Azure Slurm Bright_Cluster_Manager PostgreSQL Redis Git Jenkins Ansible Zabbix Nagios Fluentd
Remote

Senior Software Engineer, DGXC Data Services

Nvidia

Remote (Santa Clara, CA) · 4 days ago · $152,000–$241,500
Kubernetes AWS GCP Azure Python Go C++ Java Apache Spark Object Storage Metadata Management Data lake tools Apache Iceberg Machine Learning infrastructure toolset Feature Stores Docker CI/CD Terraform PostgreSQL
Remote

Senior Storage Engineer

Pacific Life

Newport Beach, CA · 36 days ago · $137,610–$168,190
PURE NetApp VMware Brocade SAN fabric switches AWS Azure Google Cloud Hyper Converged Infrastructure CI/CD Linux Windows SAN/NAS architectures Docker Kubernetes