Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

Quick summary

Work type: On-site
Location: San Diego, CA
Salary: $171,600–$302,200 / yr
Posted: 4 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $156k to $318k; similar roles at $222k, this role at $237k; shaded band marks where most similar roles pay]

This role pays more than 66% of similar roles. Most pay $195,000–$249,750 — the shaded band above. At the midpoint, this role pays about $237k versus about $222k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1777 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1443 roles with salary data.

At a glance

TL;DR · Machine Learning Engineer, ML/GenAI Evaluation

As a Machine Learning Engineer specializing in evaluation on the Wallet, Payments, and Commerce team, you will establish the evaluation criteria and metrics frameworks that ensure ML models meet stringent quality standards before they reach hundreds of millions of users. Day to day, you will design adversarial test strategies that identify model failure modes early, develop robustness testing methodologies, and own the fairness evaluation process end to end. You will work with Python and tools like MLflow or W&B to build structured test sets that reflect diverse real-world scenarios, so that models are reliable across user demographics and financial contexts. The role requires a deep understanding of model evaluation beyond basic accuracy metrics, including distribution shift testing and adversarial input handling, because you will be responsible for model quality and reliability in high-stakes financial applications.

What you'll do

  • Define evaluation criteria and quality metrics for ML models powering Wallet features
  • Design adversarial test strategies to surface model failure modes before they reach users
  • Develop robustness testing methodologies including distribution shift and out-of-distribution generalization
  • Own end-to-end fairness evaluation, defining metrics and building bias test suites across user populations
  • Synthesize evaluation results into actionable insights guiding model development priorities and product decisions
  • Establish and maintain structured test sets reflecting the diversity of Apple's global user base

What we're looking for

  • M.S. in Machine Learning, Computer Science, Statistics, Applied Mathematics, or related field preferred; 7+ years hands-on ML experience required.
  • Deep expertise in model evaluation, offline metrics design, and behavioral testing for production systems.
  • Strong track record designing evaluation frameworks beyond standard accuracy/F1 metrics to include fairness, calibration, and task-specific quality dimensions.
  • Proven ability to construct adversarial test suites and edge-case corpora that surface model failure modes before deployment.
  • Experience with Python programming, evaluation tooling, data pipelines, and experiment tracking (e.g., MLflow).
  • Excellent communication skills for translating metric results into product-quality narratives for diverse audiences.

More like this

Similar roles

Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

Austin, TX · 4 days ago
Python, MLflow, W&B, Bayesian, Causal Graphs, Counterfactual Fairness, Structural Causal Models, Confidence Calibration, Uncertainty Quantification, AWS, Kubernetes, PostgreSQL, CI/CD

Machine Learning Engineer, ML/GenAI Evaluation

Apple Inc

New York City, NY · 4 days ago · $181,100–$318,400
Python, MLflow, W&B, Bayesian, Causal graphs, Confidence calibration, Uncertainty quantification, Fairness metrics, Evaluation methodologies, Adversarial testing, Distribution shift, Temporal drift, OCR pipelines, Financial data extraction, Machine Learning, Computer Science, Statistics, Applied Mathematics

Machine Learning Research Engineer

Anduril Industries

Washington, District of Columbia · 13 days ago · $220,000–$292,000
Python, PyTorch, Transformer architectures, Edge computing, Deep learning, CI/CD, MLOps

Machine Learning Engineer

Adobe

San Jose · 93 days ago · $161,700–$234,150
Python, TensorFlow, PyTorch, scikit-learn, SparkML, Kubernetes, AWS, CI/CD, SQL, Docker, PostgreSQL, MLOps

Machine Learning Engineer

Motorola Solutions

Los Angeles, CA · 65 days ago · $120,000–$160,000
Python, TensorFlow, PyTorch, scikit-learn, MATLAB, C++, signal processing, wireless communication, MIMO, OFDM, SDRs, GPU acceleration, embedded machine learning, real-time systems, adaptive modulation, beamforming, cognitive radio techniques, 3GPP, IEEE 802.11/15, military waveforms
Hybrid