ML Engineer - Automated Evaluation and Adversarial Design

Apple Inc

Quick summary

Work type: On-site
Location: San Diego, CA
Salary: $139,500–$258,100 / yr
Posted: 44 days ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $125k to $272k, with a shaded band where most similar roles pay. Markers: this role $199k, similar roles $215k.]

This role pays less than 60% of similar roles do. Most of them pay $180,327–$249,750, the shaded band above. At the midpoint, this role pays about $199k versus about $215k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 638 open roles on FindRole.

Listed pay typically runs $171,600–$272,100 across 505 roles with salary data.

At a glance

TL;DR · ML Engineer - Automated Evaluation and Adversarial Design

As a Senior ML Engineer on the Automated Evaluation and Adversarial Design team, you will focus on building and scaling automated evaluation systems to assess AI feature quality at scale, including multi-turn conversation evaluations and end-to-end agent workflow testing. Your day-to-day responsibilities include designing adversarial test suites that probe model weaknesses and executing stress tests under demanding conditions to ensure features meet performance thresholds for hundreds of millions of users. You will develop evaluation frameworks and rubrics, generate adversarial test case libraries, and create multi-turn stress-test pipelines while ensuring alignment between automated and human evaluation methods. The role requires expertise in Python, ML frameworks like PyTorch or TensorFlow, and familiarity with productivity software and creative tools to assess output quality from a user workflow perspective. Additionally, experience with agent orchestration frameworks and observability tooling is beneficial for evaluating multi-step agent runs effectively.

Skills

Python, PyTorch, TensorFlow, ML frameworks, CI/CD, LangChain, LangGraph, CrewAI, AutoGen, LangSmith, Braintrust, Arize, Kubernetes, Docker, AWS, GCP, Azure

What you'll do

Define and own automated evaluation approaches for AI features across single-turn and multi-turn interactions.
Build adversarial test suites targeting known and emerging model failure modes, including edge cases in productivity workflows.
Develop stress test protocols to validate performance under atypical conditions, such as extended conversation lengths and complex sequences.
Ensure alignment between automated and human evaluation methods by identifying and resolving systematic disagreements.
Scale adversarial test case generation and stress test execution using automation for multi-turn scenarios and agent interactions.

What we're looking for

Bachelor’s degree in Computer Science, Machine Learning, Statistics, or related field
4+ years of experience building ML evaluation systems and designing evaluation benchmarks for sequential AI outputs
Experience independently defining evaluation architecture and methodology for AI systems with multi-turn interaction focus
Expertise in designing adversarial test methodologies targeting failures across multi-turn interactions
Proficiency with Python and ML frameworks (PyTorch, TensorFlow) in production settings

Similar roles

ML Engineer - Automated Evaluation and Adversarial Design

Apple Inc

Seattle, WA · 44 days ago · $139,500–$258,100

Python, PyTorch, TensorFlow, CI/CD, Kubernetes, Docker, AWS, GCP, Azure, PostgreSQL, LangChain, LangGraph, CrewAI, AutoGen, LangSmith, Braintrust, Arize

ML Engineer - Automated Evaluation and Adversarial Design

Apple Inc

Culver City, CA · 44 days ago · $139,500–$258,100

Python, PyTorch, TensorFlow, ML frameworks, CI/CD, Kubernetes, AWS, Docker, Prometheus, Grafana, LangChain, LangGraph, CrewAI, AutoGen, LangSmith, Braintrust, Arize

ML Engineer - Automated Evaluation and Adversarial Design

Apple Inc

Cupertino, CA · 44 days ago · $147,400–$272,100

Python, PyTorch, TensorFlow, ML frameworks, CI/CD, Kubernetes, Docker, AWS, GCP, Azure, PostgreSQL, LangChain, LangGraph, CrewAI, AutoGen, LangSmith, Braintrust, Arize, Prometheus, Grafana

AI/ML Engineer

Lam Research

Fremont, CA · 64 days ago · $119,000–$261,000

Python, C++, PostgreSQL, SQLite, MySQL, Git, Domain-Driven Design, Test-Driven Development, CI/CD

Hybrid

AI/ML Engineer

Booz Allen Hamilton

Norfolk, VA · 3 days ago

Spark, Hadoop, Databricks, Python, Java, Scala, R, TensorFlow, Keras, PyTorch, CI/CD, MLOps, Git, Jupyter Notebook, PostgreSQL, MongoDB, AWS, Azure, Google Cloud Platform, Kubernetes, Docker

Machine Learning Engineer

Adobe

San Jose · 2 days ago · $161,700–$234,150

Python, AWS, GCP, Azure, MLOps, CI/CD, Docker, Kubernetes, Prometheus, Terraform, PostgreSQL, Git, Agentic systems, Multi-agent orchestration, LLM-as-a-judge, Retrieval-Augmented Generation, RAG, NLP pipelines
