Senior System Software Engineer - AI Performance and Efficiency Tools

Nvidia

Hybrid Actively hiring
Santa Clara, US Posted 23 days ago $184,000–$287,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA’s dynamic software development team as a senior software engineer, where you will play a pivotal role in creating advanced profiling, analysis, and debugging tools for AI workloads. Your daily tasks include building internal tools that provide intuitive insights into system performance and workload efficiency, collaborating with architecture teams to enhance hardware and software features, and addressing complex issues related to memory and networking. You’ll need strong C++ and Python skills, expertise in deep learning frameworks like PyTorch and TensorFlow, and knowledge of GPU cluster job scheduling systems such as Slurm or Kubernetes. Additionally, experience with NVIDIA GPUs, CUDA programming, and NCCL is essential, along with a passion for continuous learning and the ability to work effectively across multiple global teams on large-scale AI projects.

Skills

Python, C++, PyTorch, TensorFlow, Kubernetes, Slurm, CUDA, NCCL, NVIDIA GPUs, Linux device drivers, Compiler implementation, GPU architecture, CPU architecture, Computer architecture principles, CI/CD

What you'll do

  • Build internal profiling and analysis tools for large-scale AI workloads.
  • Develop debugging tools to address common issues like memory and networking problems.
  • Create benchmarking technologies for AI systems or GPU clusters.
  • Collaborate with hardware architects to propose new features based on real-world use cases.
  • Analyze performance of large AI jobs during training and inference phases.

What we're looking for

  • BS+ in Computer Science or equivalent with 6+ years of software development experience.
  • Strong skills in C++, Python, debugging, and analysis for AI workloads.
  • Deep understanding of PyTorch, TensorFlow, distributed training, and inference.
  • Experience with NVIDIA GPUs, CUDA programming, NCCL, and GPU cluster scheduling.
  • Proven ability to build profiling and analysis tools at scale for AI systems.
  • Knowledge of Linux device drivers, compiler implementation, and computer architecture.

Market check

Salary context

This $184,000–$287,500 range sits above 86% of similar postings on FindRole.

Peer median band

$142,450–$234,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$146,500–$219,765

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Senior Software Engineer - AI Research Clusters

Nvidia

Remote (US, CA, Santa Clara, US) 29 days ago $152,000–$241,500
Python, Kubernetes, Docker, GitLab CI, C++, Rust, RSQL, REST API, JavaScript, CSS, Slurm, Linux, GPU Computing, AIOps, Agentic AI, CI/CD, Prometheus, Grafana

Senior Software Engineer - AI Applications

Plaid

San Francisco HQ, US 42 days ago $209,880–$289,080
HTML, CSS, JavaScript, LLM, GenAI, SSE, Vector Databases, Embeddings, Agent Orchestration Frameworks, Prompt Engineering, RAG, Semantic Search, CI/CD, Python, Node.js, React, Docker, Kubernetes, AWS, PostgreSQL

Senior Software Engineer - AI Core Engineering

The Walt Disney Company

Remote (USA - CA - 1200 Grand Central Ave, US) 93 days ago $141,900–$190,300
Python, LLM APIs, AWS Bedrock, Azure AI Foundry, LangChain, LangGraph, APIs, SDKs, OpenAI, Anthropic Claude, Observability, Tracing, Latency and cost dashboards, Drift detection, Multi-agent orchestration, Synthetic data, Enterprise governance, Security, Compliance, Audit, Policy enforcement

Senior Software Engineer (AI Platform)

Smartly

US 42 days ago
Python, TypeScript, PostgreSQL, Node.js, Docker, Kubernetes, React, AWS, GCP, CI/CD, MLOps, PyTorch, TensorFlow, MLflow, Kubeflow

Senior Software Engineer - Applied AI/ML

Motorola Solutions

Chicago, IL, US 16 days ago $135,000–$155,000
Python, SQL, Docker, Kubernetes, AWS, Azure, GCP, MLOps, CI/CD, PyTorch, TensorFlow, Databricks, MLflow, AWS SageMaker, Hugging Face, Apache Airflow, Temporal, RF, rRay