Senior Software Engineer - NIM Factory Container and Cloud Infrastructure

Nvidia

Actively hiring
Remote (US, CA, Santa Clara, US) Posted 50 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

As a Senior Software Engineer on NVIDIA’s team focused on container and cloud infrastructure, you will play a pivotal role in shaping the core container strategy for NVIDIA Inference Microservices (NIMs) by designing and implementing robust software and tooling. Your daily tasks include building enterprise-grade containers with reproducible multi-arch CUDA optimizations, developing Python-based CI/CD integrations, and enhancing Kubernetes deployment patterns to ensure optimal GPU utilization across thousands of GPUs. You will also collaborate closely with research, backend, SRE, and product teams to maintain high engineering standards for container quality, security, and operability. Ideal candidates possess extensive experience in building production software with a focus on containers and Kubernetes, strong Python skills, and deep expertise in Docker/BuildKit, Kubernetes operations, and GPU workload management.

Skills

Python, Kubernetes, Docker, Helm, Operators, CI/CD, OpenAI API, Hugging Face API, vLLM, SGLang, TRT-LLM, GPU, NVIDIA Device Plugin, MIG, CUDA Drivers/Runtime, BuildKit, containerd, OCI, Prometheus, Grafana

What you'll do

  • Design and harden containers for NIM runtimes and inference backends.
  • Develop Python tooling and services for build orchestration and CI/CD integrations.
  • Optimize Kubernetes deployment patterns for GPU scheduling and autoscaling.
  • Enhance container performance through layer layout, startup time optimization, and caching.
  • Evolve base image strategy and dependency management for artifact registries.
  • Mentor teammates to uphold high engineering standards for container quality.

What we're looking for

  • 6+ years of production software development with a focus on containers and Kubernetes.
  • Strong Python skills for building production-grade tooling and services.
  • Expert knowledge in Docker/BuildKit, containerd/OCI, image layering, multi-stage builds, and registry workflows.
  • Deep experience operating workloads on Kubernetes, including GPU scheduling and autoscaling.
  • Hands-on experience with NVIDIA device plugin, MIG, CUDA drivers/runtime, and resource isolation for GPU workloads.
  • Excellent collaboration and communication skills to influence cross-functional design.

Market check

Salary context

This $184,000–$287,500 range sits above 87% of similar postings on FindRole.

Peer median band

$119,782–$215,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$142,400–$217,725

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Software Engineer - NIM Platform SDK and Framework

Nvidia

Remote (US, CA, Santa Clara, US) 59 days ago $184,000–$287,500
Python Rust Kubernetes Docker Helm vLLM TensorRT-LLM SGLang NVIDIA Dynamo AWS GCP Azure OCI CI/CD API gateways Service mesh Load balancing Distributed systems High-performance data pipelines Parallel I/O Caching strategies Integrity verification Containerized application delivery Application security principles Secure coding practices Vulnerability mitigation Secrets management Supply chain integrity Test-driven development Code review Cross-team collaboration

Senior Software Engineer - Cloud and Kubernetes

Nvidia

Remote (US, CA, Santa Clara, US) 28 days ago $184,000–$287,500
Kubernetes Go C++ Rust CI/CD Jenkins GitLab GitHub Docker Prometheus Grafana Python PostgreSQL NVIDIA GPUs ConnectX BlueField NICs HPC AI Networking

Senior Software Engineer, Cloud

Abbott

Remote (United States of America: Remote, US) 30 days ago $86,700–$173,300
Go SQL Server Postgres RESTful APIs microservices Kubernetes Docker Linux CI/CD JIRA Confluence Python AWS Git Terraform Prometheus Grafana

Senior Software Engineer, Cloud

Abbott

US 30 days ago $99,300–$198,700
Go SQL Server PostgreSQL RESTful APIs microservices Kubernetes Docker TDD CI/CD Linux Open Telemetry pprof

Senior Software Engineer, Cloud

Abbott

US 30 days ago $86,700–$173,300
Go SQL Server PostgreSQL Kubernetes Docker RESTful APIs microservices Linux CI/CD Agile Confluence JIRA

Senior Software Engineer, Developer Tools for Cloud

Nvidia

Remote (US, WA, Redmond, US) 9 days ago $152,000–$241,500
Python JavaScript C++ Kubernetes GraphQL Go Rust Datadog ClickHouse Grafana CUDA HPC Networking Performance Optimization Microservices Web APIs Distributed Environments Algorithms Computer Architecture