Senior Manager, GPU Cloud Infrastructure - GeForce NOW

Nvidia

Actively hiring
Santa Clara, CA, US Posted 51 days ago $256,000–$414,000 / year

At a glance

AI generated

TL;DR

As a Senior Manager at GeForce NOW, you will lead and mentor a specialized team of network architects responsible for designing high-performance GPU infrastructure. Your day-to-day involves overseeing the creation of intra-cluster and inter-cluster connectivity using RoCE, Ethernet-based AI fabrics, and other cutting-edge technologies to ensure ultra-low latency and high throughput across data centers. You will also work closely with ISPs to optimize edge networks and collaborate with AI platform teams and hardware vendors to influence technology direction. Key skills include extensive experience in cloud infrastructure and distributed systems, mastery of Clos/spine-leaf architectures, and hands-on knowledge of BGP, EVPN/VXLAN, and kernel-level development. Proficiency in Ansible or Terraform for automation, along with monitoring tools like Prometheus and Grafana, is essential. This role demands expertise in large-scale GPU clusters and hyperscale cloud environments, as well as familiarity with optical networking and high-speed interconnects up to 800G.

Skills

RoCE, Ethernet, InfiniBand, BGP, EVPN/VXLAN, Terraform, Ansible, Prometheus, Grafana, SR-IOV, Open Virtual Switch, Mellanox, Cumulus Linux, Palo Alto, Netscaler, SNMP, Syslog, CI/CD, Kubernetes, AWS

What you'll do

  • Build and mentor a specialized team of network architects for high-performance GPU infrastructure.
  • Oversee the design of intra-cluster and inter-cluster connectivity using RoCE, Ethernet-based AI fabrics, and data center interconnects.
  • Drive technical tuning to reduce latency and increase throughput while implementing congestion control strategies.
  • Define networking roadmaps that support gaming, AI/ML training, and real-time inference at scale.
  • Engage with ISPs to optimize low-latency edge networks, ensuring seamless connections between data centers and clients.
  • Implement Infrastructure as Code (IaC) and observability frameworks for automated provisioning and health monitoring.
  • Lead incident response and root cause analysis for complex network issues in cloud gaming infrastructure.

What we're looking for

  • Over 12 years of experience in networking, cloud infrastructure, or distributed systems with at least 5 years managing technical teams.
  • Expertise in data center networking including Clos/spine-leaf architectures and high-performance fabrics like RDMA, RoCE, or InfiniBand.
  • Hands-on experience with BGP, EVPN/VXLAN, kernel-level development for routing and switching, and infrastructure automation tools like Ansible or Terraform.
  • Bachelor’s or Master’s degree in Computer Science or a related engineering field (or equivalent experience).
  • Proven success managing networking for large-scale GPU clusters or hyperscale cloud environments.
  • Familiarity with optical networking and high-speed interconnects reaching 100G, 400G, or 800G.

Market check

Salary context

This $256,000–$414,000 range sits above 95% of similar postings on FindRole.

Peer median band

$168,000–$257,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$178,892–$239,025

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 802 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 798 roles with salary data.

More like this

Similar roles

Senior Manager, Software Development - GPU Accelerated Storage

Nvidia

Santa Clara, CA, US 11 days ago $248,000–$391,000
NVMe, CUDA, C++, Python, PyTorch, JAX, GPU, DPU, FPGA, Linux, NVLink, RDMA, Kubernetes, CI/CD, PostgreSQL, VectorDB, Key-value storage, File systems, Object storage systems, dmabuf, CUDA programming

Senior Staff Engineer, GPU Software Architecture

Samsung Electronics

Remote (3900 N Capital of Texas Hwy, Austin, TX, US) 84 days ago $180,200–$297,200
C, C++, Python, Vulkan, DirectX, Metal, HLSL, GLSL, OpenCL, CUDA, Unreal, Unity, Linux, Android, OpenGL, 3D graphics, GPU hardware, ray tracing, rasterization, linear algebra, multi-threaded debugging, performance profiling, parallel programming, game engines, offline compiler, JIT compiler

Senior Solutions Architect, NVIDIA Cloud Partners

Nvidia

Santa Clara, CA, US 43 days ago $184,000–$287,500
NVIDIA, GPU, Generative AI, LLMs, NCCL, DCGM, UFM, Mission Control, Base Command Manager, Kubernetes, Slurm, CI/CD, Python, PostgreSQL, AWS, Azure, Grafana, Prometheus, Docker, Terraform

Senior Solutions Architect, NVIDIA Cloud Partners

Nvidia

Remote (Santa Clara, CA, US) 43 days ago $184,000–$287,500
NVIDIA, AWS, Azure, GCP, Python, PyTorch, TensorFlow, NVIDIA Nemotron, NVIDIA NeMo Framework, NVIDIA Dynamo, NVIDIA NeMo Retriever, NVIDIA Triton Inference Server, TensorRT, TensorRT-LLM, CUDA-X, NCCL, DCGM, UFM, Mission Control, Base Command Manager, SLURM, K8s, MLOps

Principal Solutions Architect - GPU Cloud Network Infrastructure

Nvidia

Remote (Santa Clara, CA, US) 50 days ago $272,000–$431,250
TCP/IP, BGP, DNS, HTTP/2, QUIC, High-performance networking, Cloud networking services, Multi-region architectures, IP transit, Internet peering technologies, Data center networking, Internet routing, Traffic shaping, Edge computing concepts, CDN, SRE, CI/CD, Kubernetes, Terraform, AWS, Azure, GCP