ML Safety Engineer

Apple Inc

Quick summary

Work type: On-site
Location: San Francisco, CA
Salary: $181,100–$272,100 / yr
Posted: 56 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Pay scale from $133k to $287k: similar roles about $196k, this role about $227k.

This role pays more than 72% of similar roles, most of which pay $154,625–$236,900. At the midpoint, this role pays about $227k versus about $196k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 1723 open roles on FindRole.

Listed pay typically runs $162,500–$272,100 across 1398 roles with salary data.

At a glance

TL;DR · ML Safety Engineer

As an ML Research Engineer at Apple Services Engineering, you will join a dedicated team focused on ensuring the safe and trustworthy development of AI features across various platforms. Your primary responsibilities include designing automated safety benchmarking methodologies, developing evaluation frameworks for assessing risks posed by media-related agents, and creating scalable tools to help engineering teams improve model performance responsibly. You will work with ML and data scientists, software developers, and project managers to translate complex requirements into practical solutions, ensuring that AI experiences meet high standards of reliability and alignment with user expectations. This role requires strong proficiency in Python (pandas, NumPy, Jupyter, PyTorch), experience with large datasets and model evaluation pipelines, and a deep understanding of responsible AI practices. You will also contribute to the development of benchmark datasets and metrics for multi-turn interactions, applying rigorous statistical methods to ensure reproducibility and relevance in your work.

What you'll do

  • Design scientifically grounded benchmarking methodologies for AI/ML models.
  • Develop automated evaluation pipelines to assess model outputs at scale.
  • Create datasets representing realistic and adversarial use cases across domains.
  • Define and validate new metrics for complex interaction patterns in AI systems.
  • Apply statistical rigor to ensure reproducibility of evaluation frameworks.
  • Translate experimental findings into actionable improvements for safety.

What we're looking for

  • Advanced degree (MS or PhD) in Computer Science, Software Engineering, or equivalent research/work experience.
  • 1+ years of work experience, either as a postdoc or in industry.
  • Strong background in empirical evaluation, experimental design, and benchmarking.
  • Proficiency in Python with libraries like pandas, NumPy, Jupyter, PyTorch.
  • Experience evaluating AI/ML models, especially LLMs or program synthesis systems.
  • Deep familiarity with software engineering workflows and developer tools.

More like this

Similar roles

ML Safety Engineer

Apple Inc

San Francisco, CA · 50 days ago · $181,100–$272,100
Python, PyTorch, pandas, NumPy, Jupyter, AI/ML models, LLMs, program synthesis systems, RAG systems, reinforcement learning, agentic architectures, model fine-tuning, CI/CD, large datasets, annotation tools, hallucination detection, model alignment, taxonomies, categorization schemes, structured labeling frameworks

ML Engineer

McDonald’s Corporation

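Chicago, Illinois · 25 days ago · $129,800–$165,490
Python, Java, Scala, Apache Airflow, Luigi, Hadoop, Spark, NoSQL, data quality management, data product lifecycle management, data warehousing principles, CI/CD, Mentorship, cross-functional team collaboration, big data ecosystem, data governance capabilities, data quality functions, data standardization, data parsing, deduplication, hierarchy management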

Director of Safety ML

Reddit

Remote (US) · 8 days ago · $276,700–$387,400
Python, TensorFlow, PyTorch, Kubernetes, AWS, LLMs, Transformer Models, NLP, CI/CD, MLOps, Prometheus, PostgreSQL, Docker, Git, Scikit-learn
Remote Hybrid

Engineering Manager, Safety

Reddit

Remote (US) · 8 days ago · $217,000–$303,900
AI, ML, Python, Docker, Kubernetes, CI/CD, PostgreSQL, AWS, Terraform, Prometheus, Grafana
Remote

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

San Francisco, CA · 56 days ago · $181,100–$318,400
Python, SQL, Terraform, Git, Jupyter, CI/CD, Docker, Kubernetes, Prometheus, Grafana, AWS, Google Cloud Platform, Azure, PostgreSQL, MongoDB, GitHub, Confluence, Jira, Scrum, Agile, TensorFlow, PyTorch

Machine Learning Safety: Evaluation Research Engineer

Apple Inc

Seattle, WA · 56 days ago · $171,600–$302,200
Python, SQL, Terraform, Git, CI/CD, Docker, Kubernetes, AWS, Google Cloud Platform, Azure, PostgreSQL, MLOps, NLP, TensorFlow, PyTorch, Scikit-learn, Jupyter Notebook, GitHub, Confluence, Tableau, Prometheus, Grafana