Solutions Architect, AI Factory Infrastructure DevOps

Nvidia

Remote

Quick summary

Work type: Remote
Location: Austin, TX
Salary: $152,000–$241,500 / yr
Posted: 1 day ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

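[Chart: pay range for similar roles, running from $141k to $252k, with a shaded band where most similar roles pay; similar roles and this role are both marked at $197k.]

About 52% of similar roles pay more than this role. Most pay $153,300–$240,575 (the shaded band in the chart). At the midpoint, this role pays about $197k, the same as the roughly $197k midpoint for comparable roles.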

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads. Industry: Semiconductors & AI Computing

Nvidia currently has 980 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 966 roles with salary data.

At a glance

TL;DR · Solutions Architect, AI Factory Infrastructure DevOps

Join our AI Factory infrastructure deployment team as a Solutions Architect, where you will play a pivotal role in deploying advanced NVIDIA GPU products across data centers and edge computing environments. You will architect high-performance distributed AI infrastructures, support sales teams with technical expertise on GPU and networking solutions, and establish strong relationships with customer engineers and management. Your responsibilities include identifying key product requirements, providing on-site support for hardware and software issues, leading the lifecycle of NVIDIA’s products from design to end-of-life, and offering training to direct sales teams and channel partners. This role requires proficiency in Kubernetes, Slurm, Docker, AI tools like Claude and Codex, as well as experience with Redfish, Grafana, and Prometheus. Ideal candidates have a background in engineering or computer science, extensive experience in NCP, CSP, site reliability, and virtualization technologies, and hands-on problem-solving skills within customer infrastructures.

What you'll do

  • Help architect and scale AI infrastructure using the latest NVIDIA GPU supercomputers.
  • Directly support sales teams to secure customer wins by showcasing GPU and networking products.
  • Identify key product requirements for the CSP/OEM AI market to implement NVIDIA solutions efficiently.
  • Provide on-site technical support to solve hardware and software issues in deep learning inference.
  • Lead product lifecycle from design-in to end-of-life, ensuring detailed execution and customer satisfaction.
  • Maintain infrastructure components at the customer site and collect findings for improvement.

What we're looking for

  • 5+ years of experience in high-tech IT companies with expertise in NCP, CSP, site reliability, and virtualization technologies.
  • Proficient in Kubernetes, Slurm, Docker, AI tools, Redfish, Grafana, and Prometheus.
  • Strong problem-solving skills for customer infrastructure issues.
  • BS or MS in Engineering, Electrical Engineering, Physics, or Computer Science (or equivalent experience).
  • Excellent communication skills with both management and engineering teams.
  • Ability to handle multiple initiatives and priorities effectively.

More like this

Similar roles

Senior Solutions Architect - AI Factory Deployment

Nvidia

Remote (Austin, TX) +2 · 48 days ago · $184,000–$287,500
Linux Python Shell NCCL AllReduce AllToAll PyTorch TensorFlow Bash Benchmarking Metrics Messaging_Systems Logging Tracing CI/CD HPC GPU_Clusters

AI Factory CPU focused Solutions Architect

Nvidia

Remote (Santa Clara, CA) · 56 days ago · $184,000–$287,500
NVIDIA HPC AI MLOps CPU GPU Networking Arm-based processors Automation Performance testing Reinforcement learning CI/CD Docker Kubernetes Terraform Python PostgreSQL Prometheus Grafana

Senior Solution Architect, AI Infrastructure

Nvidia

Remote (Us, Dc, Remote, US) · 36 days ago · $184,000–$287,500
NVIDIA_GPUs NVIDIA_Networking InfiniBand Ethernet NCCL DCGM UFM Mission_Control Base_Command_Manager AI_solutions High_Performance_Computing Networking Python CI/CD Git AWS Azure Grafana Prometheus