Engineering Manager, AI Observability

Netflix

Actively hiring
Remote (Los Gatos, US) Posted 57 days ago $523,000–$920,000 / year

At a glance

AI generated

TL;DR

Join the Artificial Intelligence Platform (AIP) organization as an experienced engineering leader to build and lead the next generation of Netflix’s AI observability platform, enabling ML practitioners across diverse domains to collect model inputs, features, and predictions for thousands of large-scale models. You will collaborate with ML researchers, engineers, and platform teams to embed “observability-by-default” into new AI services, driving end-to-end observability strategies for LLMs, generative AI systems, and classical ML models. With a strong background in AI infrastructure and experience leading high-traffic distributed system builds, you will define and execute a platform roadmap focused on incremental delivery, ensuring clear success metrics and adoption across teams. This role requires deep familiarity with AI operations, including model evaluation and continuous monitoring at scale, as well as exposure to LLMs and generative AI systems. Strong technical acumen, communication skills, and the ability to mentor a high-performing engineering team are essential for this highly collaborative environment.

Skills

Arize_AI Fiddler_AI Weights_and_Biases Vertex_AI_Model_Monitoring SageMaker_Model_Monitor LLM_evaluation_frameworks Prompt_instrumentation Response_quality_measurement Human_in_the_loop_review AI_Observability ML_operations Drift_detection Continuous_monitoring Distributed_systems ML_infrastructure Cloud_services Python Kubernetes Terraform CI/CD

What you'll do

  • Lead end-to-end observability strategy for AI workloads, including LLMs and classical ML models.
  • Embed "observability-by-default" into new AI services to ensure built-in telemetry and monitoring.
  • Define and execute a platform roadmap with clear success metrics and migration goals.
  • Drive the evolution of LLM evaluation frameworks, covering prompt instrumentation and response quality measurement.
  • Hire, grow, and mentor a high-performing engineering team focused on AI observability.
  • Communicate progress to stakeholders, customers, and senior leadership effectively.

What we're looking for

  • 10+ years of software engineering experience and 3+ years of management experience.
  • Deep expertise in building high-traffic distributed systems and ML infrastructure.
  • Extensive knowledge of AI/ML operations including model evaluation and monitoring at scale.
  • Experience with AI observability tools such as Arize AI, Fiddler AI, and Vertex AI Model Monitoring.
  • Familiarity with LLMs and generative AI systems, including prompt/result logging and evaluation metrics.
  • Strong technical leadership skills to mentor and guide a high-performing engineering team.
  • Proven ability to develop and execute a technical vision and roadmap for complex projects.

Market check

Salary context

This $523,000–$920,000 range sits above 100% of similar postings on FindRole.

Peer median band

$170,000–$257,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$169,625–$246,150

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Netflix

Netflix is the world's leading streaming entertainment service, offering a vast library of TV series, films, documentaries, and original content to subscribers in over 190 countries.

Industry: Streaming Entertainment & Media

Netflix currently has 91 open roles on FindRole.

Listed pay typically runs $388,000–$610,000 across 87 roles with salary data.

More like this

Similar roles

Senior Engineering Manager, AI

Cisco

Remote (Seattle, US) 9 days ago $228,400–$289,200
Python Java Kubernetes Docker CI/CD Terraform AWS Azure Git GitHub Jenkins PostgreSQL MongoDB TensorFlow PyTorch MLflow Prometheus Grafana

Sr Engineering Manager, AI Enablement

Adobe

San Jose, US 17 days ago $221,000–$320,000
AWS Azure GCP Agentic AI MCP A2A LLMs RAG Prompt Orchestration Distributed Systems APIs Cloud Environments CI/CD

Engineering Manager, AI Developer Technology

Nvidia

Santa Clara, CA, US 72 days ago $224,000–$356,500
CUDA C/C++ Python GPU CPU MPI OpenMP pthread Deep Learning Machine Learning LLMs Multimodal Models Linear Algebra Parallel Programming Algorithm Optimization Software Development Presentation Skills Technical Leadership Recruiting Top Talent Engineering Excellence

Principal Applied AI Engineering Manager

Microsoft

Redmond, WA, US 9 days ago
Python C# Azure CI/CD LLM-based systems RAG architectures microservices containers TDD staged rollouts full observability Prometheus Grafana responsible AI PII protection audit trails low-code/no-code platforms chatbots

CTIO AI Engineering Manager

PwC

New York - 300 Madison Avenue, US 56 days ago $73,500–$212,280
Docker Kubernetes AWS GitHub Actions CI/CD Python Vector databases LangChain Machine learning models Scalable microservices Cloud-native applications AI systems Open-source contributions GCP Azure PostgreSQL MLOps

AI/ML Ops Principal Engineer

Blackline

Pleasanton, California, US 105 days ago $257,000
Python TensorFlow PyTorch MLflow Kubeflow GCP AWS Azure LangChain CI/CD Docker Kubernetes Prometheus Grafana Apache Airflow DevSecOps IaC Responsible AI DevOps