Machine Learning Engineer - AI & ML Evaluation Frameworks

Apple Inc

Quick summary

Work type: On-site
Location: Cupertino, CA
Salary: $147,400–$272,100 / yr
Posted: 7 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Pay comparison chart, $132k to $287k scale: similar roles $217k, this role $210k; the shaded band marks where most similar roles pay.

This role pays more than 50% of similar roles. Most similar roles pay $188,537–$246,150. At the midpoint, this role pays about $210k versus about $217k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1,777 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1,443 roles with salary data.

At a glance

TL;DR · Machine Learning Engineer - AI & ML Evaluation Frameworks

As a Machine Learning Engineer on Apple’s Health Sensing Machine Learning Interpretability & Analytics team, you will develop scalable evaluation frameworks for both traditional ML systems and foundation models such as LLMs. Your responsibilities include designing robust methodologies to assess performance, reliability, and safety; conducting deep-dive evaluations to identify failure points; and building synthetic data pipelines that adhere to Apple’s privacy standards. You will also create tools and metrics to ensure demographic equity across diverse populations and translate evaluation insights into actionable recommendations for algorithm teams. This role requires expertise in Python, proficiency in evaluating supervised, unsupervised, and LLM models, hands-on experience with failure analysis, and strong communication skills to bridge technical and non-technical audiences.

Skills

Python, CI/CD, Git, LLMs, Kubernetes, Apache Spark, Airflow, PostgreSQL, Terraform, AWS, Docker, Prometheus, Grafana

What you'll do

Design robust methodologies to assess the performance of traditional ML and foundation models.
Drive failure analysis to detect reasoning flaws and edge cases in ML systems.
Expand data generation pipelines for model training without exposing real user data.
Build tools to fuse asynchronous time-series signals from various sensors and sources.
Develop metrics to discover biases and measure demographic equity across diverse populations.
Translate evaluation results into actionable insights for engineering teams and clinical experts.

What we're looking for

3+ years of experience in ML Engineering or Applied ML
Strong expertise in evaluating supervised, unsupervised, LLM, and deep learning models
Proficiency in Python with production-grade coding skills (OOP, CI/CD, Git)
Hands-on experience in failure analysis and driving model improvements
Experience building data pipelines, inference frameworks, and automated evaluation systems
Strong communication skills for technical and non-technical audiences

Similar roles

ML Research Engineer, AI Evaluation Platform

Apple Inc

Seattle, WA · 60 days ago · $171,600–$258,100

Python, PyTorch, JAX, TensorFlow, Docker, Kubernetes, CI/CD, Ray, Spark, DeepEval, Ragas, TruLens, LangSmith, Git, PyTest

ML Research Engineer, AI Evaluation Platform

Apple Inc

Seattle, WA · 67 days ago · $171,600–$258,100

Python, PyTorch, JAX, TensorFlow, Docker, Kubernetes, CI/CD, Ray, Spark, DeepEval, Ragas, TruLens, LangSmith, Git, PyTest

Machine Learning Engineer (AI Foundations)

Capital One Financial

McLean, VA +1 · 11 days ago · $135,600–$154,800

Python, Scala, Java, scikit-learn, PyTorch, Dask, Spark, TensorFlow, distributed computing, distributed file systems, multi-node database paradigms, data pipelines, CI/CD

Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

Austin, TX · 3 days ago

Python, MLflow, W&B, Bayesian, Causal Graphs, Counterfactual Fairness, Structural Causal Models, Confidence Calibration, Uncertainty Quantification, AWS, Kubernetes, PostgreSQL, CI/CD

Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

San Diego, CA · 3 days ago · $171,600–$302,200

Python, MLflow, W&B, Bayesian, Causal graphs, Confidence calibration, Uncertainty quantification, OCR pipelines, Financial data extraction, Fairness evaluation, Distribution shift, Temporal drift, Adversarial testing, Evaluation methodologies, Structured document understanding, Semi-structured document understanding, Machine Learning, Model evaluation

Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

New York City, NY · 3 days ago · $181,100–$318,400

Python, MLflow, W&B, Bayesian, Causal graphs, Confidence calibration, Uncertainty quantification, Fairness metrics, Evaluation methodologies, Adversarial testing, Distribution shift, Temporal drift, OCR pipelines, Financial data extraction, Machine Learning, Computer Science, Statistics, Applied Mathematics
