Machine Learning Safety: Evaluation Research Engineer

Apple Inc

Quick summary

Work type: On-site
Location: Seattle, WA
Salary: $171,600–$302,200 / yr
Posted: 56 days ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $156k to $318k, with markers at $224k (similar roles) and $237k (this role); a shaded band marks where most similar roles pay.]

This role pays more than 65% of similar roles. Most pay $197,925–$249,750 — the shaded band above. At the midpoint, this role pays about $237k versus about $224k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1723 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1398 roles with salary data.

At a glance

TL;DR · Machine Learning Safety: Evaluation Research Engineer

As a Machine Learning Safety Evaluation Research Engineer working at the intersection of machine learning and AI ethics, you will join a dedicated team that develops safety evaluation methodologies for global media products. Your primary responsibilities include designing comprehensive taxonomies, creating culturally grounded exemplar datasets, translating policies into data requirements, and validating automated judge models to ensure compliance with responsible AI guidelines. You will also develop scalable analysis pipelines, write detailed documentation, and collaborate closely with multilingual experts to standardize evaluation processes across diverse markets. Ideal candidates have a strong background in taxonomy design and annotation methodology, industry experience, and an advanced degree in a relevant field such as linguistics or computational social science.

What you'll do

  • Develop and maintain safety-relevant taxonomies for risk categories and content types.
  • Create culturally appropriate exemplar datasets to illustrate taxonomy categories and edge cases.
  • Design and validate automated judge models for scoring AI system outputs for safety compliance.
  • Build scalable automation pipelines for analysis, reducing manual effort in cross-market assessments.
  • Author canonical evaluation guidelines, with multilingual support, that can be adapted across languages and markets.

What we're looking for

  • 4+ years of applied research experience in evaluation design, AI ethics, Responsible AI, or related fields.
  • Strong understanding of taxonomy design, classification systems, and annotation methodology.
  • Experience developing evaluation guidelines and exemplar sets for human tasks.
  • Ability to collaborate with subject matter experts across languages and cultures.
  • Independent work ethic with strong time management skills.
  • Advanced degree in Linguistics, Information Science, Computational Social Science, or related field.
  • Industry experience working on multilingual or cross-cultural projects.

More like this

Similar roles

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

San Francisco, CA · 56 days ago · $181,100–$318,400
Python SQL Terraform Git Jupyter CI/CD Docker Kubernetes Prometheus Grafana AWS Google Cloud Platform Azure PostgreSQL MongoDB GitHub Confluence Jira Scrum Agile TensorFlow PyTorch

Machine Learning Research Engineer

Booz Allen Hamilton

Springfield, VA · 57 days ago · $99,000–$225,000
PyTorch Transformer-based models Self-supervised learning Multi-task learning Docker CI/CD Python Git Jupyter Notebook TensorBoard Uncertainty estimation Conformal prediction OOD detection Hyperspectral data Masked autoencoders Contrastive learning Retrieval models Multimodal alignment

Machine Learning Research Engineer

Booz Allen Hamilton

Springfield, VA · 10 days ago · $99,000–$225,000
PyTorch Transformer-based models Self-supervised learning Multi-task learning Docker CI/CD Python PostgreSQL Git GitHub Jupyter Notebook TensorFlow Kubernetes AWS Google Cloud Platform Azure Machine Learning Hyperspectral data Uncertainty estimation Conformal prediction OOD detection Masked autoencoders Contrastive learning Retrieval models Multimodal alignment

Machine Learning Research Engineer

Anduril Industries

Washington, District of Columbia · 8 days ago · $220,000–$292,000
Python PyTorch Transformer architectures Edge computing Deep learning CI/CD MLOps

Machine Learning Engineer

Adobe

San Jose · 78 days ago · $183,300–$265,350
Python PyTorch LangChain LangGraph MCP ADK LLMs VLLMs CI/CD Docker AWS PostgreSQL Kubernetes

Machine Learning Engineer

Adobe

San Jose · 88 days ago · $161,700–$234,150
Python TensorFlow PyTorch scikit-learn SparkML Kubernetes AWS CI/CD SQL Docker PostgreSQL MLOps