ML and Agentic Systems Engineer

Nvidia

Actively hiring Verified listing
Santa Clara, US Posted 30 days ago $224,000–$356,500 / year

At a glance

AI generated

TL;DR

Join NVIDIA’s Cosmos team as an experienced engineer focused on building agentic systems that enhance the ML lifecycle through automation and intelligence. In this role, you will design and implement workflows for data generation, evaluation, debugging, training orchestration, and iteration, leveraging Python and PyTorch to create robust, modular software. You’ll build AI-native systems where models and agents interact with codebases and experiments to improve productivity, creating self-improving loops that drive better decisions across the system. Key responsibilities include scaling evaluation platforms, integrating open-source components into unified systems for rapid experimentation, and maintaining high standards of engineering excellence through testing, reproducibility, and maintainability. Ideal candidates have extensive experience in ML software platform development, deep Python expertise, and a track record of building impactful developer tooling or workflow automation at scale.

Skills

Python, PyTorch, ML pipelines, Evaluation platforms, Developer tooling, Workflow automation, System design, Testing, Packaging, Debugging, LLM-based systems, Kubernetes, CI/CD, PostgreSQL, Git, Docker, AWS, GCP

What you'll do

  • Design and implement agentic workflows across the ML lifecycle.
  • Build AI-native systems enabling models and agents to interact with codebases and tools.
  • Create self-improving loops for data generation, failure detection, and output evaluation.
  • Own and evolve large-scale Python and PyTorch codebases for robust software development.
  • Design and scale evaluation platforms combining automated metrics and human feedback.
  • Build multimodal ML pipelines covering data processing, experimentation, benchmarking, and deployment.

What we're looking for

  • Significant experience building machine learning systems and software platforms.
  • Expert-level Python skills with strong judgment on code modularity and long-term health.
  • Deep familiarity with PyTorch for debugging, adapting, and extending model behavior.
  • Experience in building ML pipelines, evaluation systems, and developer tooling at scale.
  • Strong software engineering fundamentals including testing, packaging, and collaborative practices.
  • Background in building agent-based systems that perform real-world tasks efficiently.
  • BS or MS in Computer Science or a related field (or equivalent experience), with 12+ years of relevant development experience.

Market check

Salary context

This $224,000–$356,500 range sits above 93% of similar postings on FindRole.

Peer median band

$169,500–$249,600

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$162,000–$247,562

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Nvidia

Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.

Industry: Semiconductors & AI Computing

Nvidia currently has 801 open roles on FindRole.

Listed pay typically runs $184,000–$287,500 across 797 roles with salary data.

More like this

Similar roles

Director, ML Engineering & Agentic Systems

PayPal

USA - California - San Jose - Corp - N First St, US 23 days ago $242,000–$359,150
Python Java ML/AI Kubernetes Docker AWS CI/CD LLM-based systems MCP/ACP protocols Feature stores Model serving Experiment frameworks PostgreSQL MySQL Fintech Payments Compliance MFA Transaction systems

Agentic AI Systems Engineer

Applied Materials

Santa Clara, California, US 24 days ago $152,000–$208,500
Java Python C++ Kubernetes Terraform CI/CD Prometheus Grafana MVP LLM VectorSearch RAG SecurityFirst HighReliabilitySoftware

Machine Learning Engineer, Agentic AI

Zillow

Remote (Remote-USA, US) 77 days ago $145,500–$232,500
Python TensorFlow PyTorch LangChain LangGraph Kubernetes Docker CI/CD AWS Azure GCP PostgreSQL MongoDB Prometheus Grafana Git Scalable Architecture Responsible AI Deployment Multi-step Reasoning

Agentic AI Machine Learning Engineer

Booz Allen Hamilton

US 25 days ago $99,000–$225,000
AWS Azure Docker Kubernetes Python LLMs Deep Learning Reinforcement Learning LangChain LangGraph PydanticAI llamaindex Grafana Langfuse LangSmith Phoenix CI/CD MLOps

Agentic AI Machine Learning Engineer

Booz Allen Hamilton

Annapolis Junction, Maryland, US 36 days ago $99,000–$225,000
AWS Azure Docker Kubernetes Python LLMs Deep Learning Reinforcement Learning LangChain LangGraph PydanticAI llamaindex Grafana Langfuse LangSmith Phoenix CI/CD MLOps

Agentic AI Machine Learning Engineer

Booz Allen Hamilton

US 24 days ago $99,000–$225,000
AWS Azure Docker Kubernetes Python LLMs Deep Learning Reinforcement Learning LangChain LangGraph PydanticAI llamaindex Grafana Langfuse LangSmith Phoenix CI/CD