Senior Inference Engineer, AIConfigurator for Dynamo

Nvidia

Remote

Quick summary

Work type: Remote
Location: Santa Clara, CA
Salary: $184,000–$287,500 / yr
Posted: 5 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $153k to $302k, with a shaded band where most similar roles pay. Similar roles midpoint: $206k. This role midpoint: $236k.]

This role pays more than 72% of similar roles. Most similar roles pay $167,449–$245,112, the shaded band above. At the midpoint, this role pays about $236k versus about $206k for comparable roles.

Based on 240 similar postings.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 980 open roles on FindRole.

Listed pay typically runs $168,000–$270,250 across 966 roles with salary data.

At a glance

TL;DR · Senior Inference Engineer, AIConfigurator for Dynamo

NVIDIA is seeking a Senior Inference Engineer to join the AIConfigurator team and enhance its system for discovering high-performance deployment configurations for large-scale LLM inference. The role involves building and evolving the core optimization engine, creating production-quality APIs and SDKs in Python and Rust, and developing backend-specific artifacts for various NVIDIA platforms. Engineers will collaborate with multiple teams to ensure simulated performance matches real-world deployments on GPUs like H100 and H200, while also improving model support through integration of profiling data and validation tools. Ideal candidates have extensive experience in GPU computing, distributed systems, and ML infrastructure, along with strong Python/Rust skills and a deep understanding of LLM inference concepts such as batching and parallelism strategies.

What you'll do

  • Build and evolve AIConfigurator's core optimization engine for LLM serving.
  • Develop Python/Rust APIs and CLIs to help users generate strong deployment configurations.
  • Emit backend-specific artifacts for Dynamo, Kubernetes, TensorRT-LLM, vLLM, and SGLang deployments.
  • Ensure simulated results match actual deployment performance on NVIDIA platforms.
  • Improve model, hardware, and backend support by integrating profiling data and validation tools.
  • Convert complex inference ideas into reliable software abstractions.

What we're looking for

  • 10+ years of relevant software engineering experience, including production-quality Python/Rust development.
  • Strong background in GPU computing and distributed systems for high-performance model serving.
  • Deep understanding of LLM inference concepts including batching, latency, efficiency, and parallelism strategies.
  • Experience with data-driven performance analysis, benchmarking, simulation, and optimization.
  • Hands-on experience with TensorRT-LLM, vLLM, SGLang, Triton Inference Server, or comparable platforms.
  • Ability to collaborate across research, runtime, platform, and customer-facing engineering teams.

More like this

Similar roles

Senior Software Engineer - AI Inference

Nvidia

Remote (Santa Clara, CA) · 64 days ago · $152,000–$241,500
Python C++ CUDA vLLM SGLang PyTorch Triton NCCL Dynamo CI/CD GPU InfiniBand Profiling Flamegraphs Microbenchmarks Concurrency Multi-threading Multi-process Kubernetes Docker PostgreSQL
Remote

Senior Software Engineer, AI Inference Systems

Nvidia

Santa Clara, CA · 50 days ago · $184,000–$287,500
Python C/C++ CUDA Kubernetes Docker Triton PyTorch vLLM SGLang MLIR Linux Go Rust CI/CD AWS GCP Azure Prometheus Grafana GitHub MLOps
Hybrid

Senior AI Machine Learning Engineer

The Hartford

Chicago, IL +2 · 29 days ago · $117,200–$175,800
AWS GCP SageMaker Streamlit Python Java C# Hadoop Spark Redshift Snowflake BigQuery Jenkins Terraform GitHub GitHub Actions Apache Airflow Kubernetes Docker SQL CI/CD MLOps
Hybrid

Senior Machine Learning Engineer, AI Platform

Adobe

San Jose · 36 days ago · $211,800–$306,625
Python Java C++ Cloud Infrastructure Distributed Computing Deep Learning Virtual Reality Augmented Reality Artificial Intelligence Robotics Interactive Experiences Large-Scale Computing Frameworks Data Analysis Systems Modeling Environments

Senior Machine Learning Engineer (AI Foundations)

Capital One Financial

McLean, VA +1 · 8 days ago · $161,800–$184,600
Python Scala Java scikit-learn PyTorch Dask Spark TensorFlow Kubernetes AWS CI/CD PostgreSQL Redis Git Jupyter Notebook S3 Snowflake Hadoop Docker