Machine Learning Safety: Evaluation Research Engineer

Apple Inc

Quick summary

Work type: On-site
Location: Seattle, WA
Salary: $171,600–$302,200 / yr
Posted: 56 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Similar roles: $224k
This role: $237k
Chart scale: $156k–$318k (most similar roles fall in the shaded band)

This role pays more than 65% of similar roles. Most pay $197,925–$249,750 (the shaded band in the chart above). At the midpoint, this role pays about $237k versus about $224k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1723 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1398 roles with salary data.

At a glance

TL;DR · Machine Learning Safety: Evaluation Research Engineer

As a Machine Learning Safety Evaluation Research Engineer, you will work at the intersection of machine learning and AI ethics on a dedicated team that develops safety evaluation methodologies for global media products. Your primary responsibilities include designing comprehensive taxonomies, creating culturally grounded exemplar datasets, translating policies into data requirements, and validating automated judge models to ensure compliance with responsible AI guidelines. You will also develop scalable analysis pipelines, write detailed documentation, and collaborate closely with multilingual experts to standardize evaluation processes across diverse markets. Ideal candidates have a strong background in taxonomy design and annotation methodology, industry experience, and an advanced degree in a relevant field such as linguistics or computational social science.

Skills

Python, SQL, Terraform, Git, CI/CD, Docker, Kubernetes, AWS, Google Cloud Platform, Azure, PostgreSQL, MLOps, NLP, TensorFlow, PyTorch, Scikit-learn, Jupyter Notebook, GitHub, Confluence, Tableau, Prometheus, Grafana

What you'll do

Develop and maintain safety-relevant taxonomies for risk categories and content types.
Create culturally appropriate exemplar datasets to illustrate taxonomy categories and edge cases.
Design and validate automated judge models that score AI system outputs for safety compliance.
Build scalable automation pipelines for analysis, reducing manual effort in cross-market assessments.
Author canonical evaluation guidelines that can be adapted across languages and markets, with multilingual support.

What we're looking for

4+ years of applied research experience in evaluation design, AI ethics, Responsible AI, or related fields.
Strong understanding of taxonomy design, classification systems, and annotation methodology.
Experience developing evaluation guidelines and exemplar sets for human tasks.
Ability to collaborate with subject matter experts across languages and cultures.
Independent work ethic with strong time management skills.
Advanced degree in Linguistics, Information Science, Computational Social Science, or related field.
Industry experience working on multilingual or cross-cultural projects.

Similar roles

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

San Francisco, CA · 56 days ago · $181,100–$318,400

Python, SQL, Terraform, Git, Jupyter, CI/CD, Docker, Kubernetes, Prometheus, Grafana, AWS, Google Cloud Platform, Azure, PostgreSQL, MongoDB, GitHub, Confluence, Jira, Scrum, Agile, TensorFlow, PyTorch

Machine Learning Research Engineer

Booz Allen Hamilton

Springfield, VA · 57 days ago · $99,000–$225,000

PyTorch, Transformer-based models, Self-supervised learning, Multi-task learning, Docker, CI/CD, Python, Git, Jupyter Notebook, TensorBoard, Uncertainty estimation, Conformal prediction, OOD detection, Hyperspectral data, Masked autoencoders, Contrastive learning, Retrieval models, Multimodal alignment

Machine Learning Research Engineer

Booz Allen Hamilton

Springfield, VA · 10 days ago · $99,000–$225,000

PyTorch, Transformer-based models, Self-supervised learning, Multi-task learning, Docker, CI/CD, Python, PostgreSQL, Git, GitHub, Jupyter Notebook, TensorFlow, Kubernetes, AWS, Google Cloud Platform, Azure, Machine Learning, Hyperspectral data, Uncertainty estimation, Conformal prediction, OOD detection, Masked autoencoders, Contrastive learning, Retrieval models, Multimodal alignment
