ML Safety Engineer

Apple Inc

Quick summary

Work type: On-site
Location: San Francisco, CA
Salary: $181,100–$272,100 / yr
Posted: 56 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

Chart: similar roles $196k (midpoint) · this role $227k (midpoint) · axis range $133k to $287k, with a shaded band where most similar roles pay

This role pays more than 72% of similar roles. Most similar roles pay $154,625–$236,900, which is the shaded band in the chart. At the midpoint, this role pays about $227k versus about $196k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1723 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1398 roles with salary data.

At a glance

TL;DR · ML Safety Engineer

As an ML Research Engineer at Apple Services Engineering, you will join a dedicated team focused on ensuring the safe and trustworthy development of AI features across various platforms. Your primary responsibilities include designing automated safety benchmarking methodologies, developing evaluation frameworks for assessing risks posed by media-related agents, and creating scalable tools to help engineering teams improve model performance responsibly. You will work with ML and data scientists, software developers, and project managers to translate complex requirements into practical solutions, ensuring that AI experiences meet high standards of reliability and alignment with user expectations. This role requires strong proficiency in Python (pandas, NumPy, Jupyter, PyTorch), experience with large datasets and model evaluation pipelines, and a deep understanding of responsible AI practices. You will also contribute to the development of benchmark datasets and metrics for multi-turn interactions, applying rigorous statistical methods to ensure reproducibility and relevance in your work.

Skills

Python, PyTorch, pandas, NumPy, Jupyter, AI/ML models, LLMs, program synthesis systems, RAG systems, reinforcement learning, agentic architectures, model fine-tuning, CI/CD, large datasets, annotation tools, hallucination detection, model alignment, taxonomies, categorization schemes, structured labeling frameworks

What you'll do

Design scientifically grounded benchmarking methodologies for AI/ML models.
Develop automated evaluation pipelines to assess model outputs at scale.
Create datasets representing realistic and adversarial use cases across domains.
Define and validate new metrics for complex interaction patterns in AI systems.
Apply statistical rigor to ensure reproducibility of evaluation frameworks.
Translate experimental findings into actionable improvements for safety.

What we're looking for

Advanced degree (MS or PhD) in Computer Science, Software Engineering, or equivalent research/work experience.
1+ years of work experience, either as a postdoc or in industry.
Strong background in empirical evaluation, experimental design, and benchmarking.
Proficiency in Python with libraries like pandas, NumPy, Jupyter, PyTorch.
Experience evaluating AI/ML models, especially LLMs or program synthesis systems.
Deep familiarity with software engineering workflows and developer tools.

Similar roles

ML Safety Engineer

Apple Inc

San Francisco, CA · 50 days ago · $181,100–$272,100

ML Engineer

McDonald’s Corporation

Chicago, Illinois · 25 days ago · $129,800–$165,490

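Python, Java, Scala, Apache Airflow, Luigi, Hadoop, Spark, NoSQL, data quality management, data product lifecycle management, data warehouse principles, CI/CD, mentorship, cross-functional team collaboration, big data ecosystem, data governance capabilities, data quality functions, data standardization, data parsing, deduplication, hierarchy management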

Director of Safety ML

Remote (US) · 8 days ago · $276,700–$387,400

Python, TensorFlow, PyTorch, Kubernetes, AWS, LLMs, Transformer Models, NLP, CI/CD, MLOps, Prometheus, PostgreSQL, Docker, Git, Scikit-learn

Remote Hybrid

Engineering Manager, Safety

Remote (US) · 8 days ago · $217,000–$303,900

AI, ML, Python, Docker, Kubernetes, CI/CD, PostgreSQL, AWS, Terraform, Prometheus, Grafana

Remote

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

San Francisco, CA · 56 days ago · $181,100–$318,400

Python, SQL, Terraform, Git, Jupyter, CI/CD, Docker, Kubernetes, Prometheus, Grafana, AWS, Google Cloud Platform, Azure, PostgreSQL, MongoDB, GitHub, Confluence, Jira, Scrum, Agile, TensorFlow, PyTorch

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

Seattle, WA · 56 days ago · $171,600–$302,200

Python, SQL, Terraform, Git, CI/CD, Docker, Kubernetes, AWS, Google Cloud Platform, Azure, PostgreSQL, MLOps, NLP, TensorFlow, PyTorch, Scikit-learn, Jupyter Notebook, GitHub, Confluence, Tableau, Prometheus, Grafana
