Annotation Data Scientist, Evaluation Integrity (Siri)

Apple Inc

Quick summary

Work type: On-site
Location: Cambridge, MA
Salary: $154,600–$274,900 / yr
Posted: 17 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $118k to $292k, marking similar roles at $181k and this role at $215k, with a shaded band where most similar roles pay.]

This role pays more than 69% of similar roles. Most similar roles pay $135,000–$227,262 (the shaded band above). At the midpoint, this role pays about $215k, versus about $181k for comparable roles.

Based on 240 similar postings.

Employer

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software

Apple Inc currently has 638 open roles on FindRole.

Listed pay typically runs $171,600–$272,100 across 505 roles with salary data.

At a glance

TL;DR · Annotation Data Scientist, Evaluation Integrity (Siri)

Join the Evaluation Integrity team as an Annotation Data Scientist to revolutionize how Siri is evaluated by designing human-in-the-loop (HITL) annotation tasks that scrutinize user agent personae, conversations, and automated evaluators against product specifications. You will own annotation initiatives end to end, from rubric design and tooling through data analysis, so that human judgments rigorously inform pre-ship decisions. You will use Python for data processing and analysis, manage multiple concurrent projects, collaborate with software engineers on custom tooling, and refine guidelines based on inter-annotator agreement. Ideal candidates have a quantitative background and 5+ years of experience in machine learning evaluation methodologies, along with expertise in statistical methods and large-scale dataset management. The role also requires strong communication skills to work effectively across functions and deliver actionable insights to product teams.

What you'll do

  • Design HITL annotation tasks to assess the quality of user agent personae and validity of conversations.
  • Author and maintain rubrics for human grading aligned with agentic evaluators and product guidelines.
  • Manage multiple annotation programs end-to-end from requirements gathering to stakeholder delivery.
  • Develop custom annotation tooling in partnership with software engineers to support evaluation tasks.
  • Apply data science techniques to analyze human-labeled data, measuring evaluator accuracy and reliability.
  • Translate annotator feedback into improvements for user agents and automated evaluators.

What we're looking for

  • Bachelor's or Master's degree in a quantitative field or equivalent experience.
  • 5+ years of hands-on experience with human-in-the-loop evaluation methodologies for ML systems.
  • Expertise in Python for data processing, analysis, and prototyping, using tools such as pandas and Jupyter.
  • Experience designing and implementing annotation schemas and rubrics for machine learning training or evaluation.
  • Ability to manage multiple concurrent dataset curation efforts, including coordinating with annotators and monitoring performance metrics.

More like this

Similar roles

Sr. Machine Learning Research Engineer, Siri Speech

Apple Inc

Cupertino, CA · 20 days ago · $181,100–$318,400
Python, TensorFlow, PyTorch, Keras, Scikit-learn, CUDA, C++, Java, Swift, Docker, Kubernetes, CI/CD, AWS, Azure, Google Cloud Platform, PostgreSQL, MongoDB, Redis, Git, Jupyter Notebook, Prometheus

Sr. Software Engineer - Data, Siri Speech

Apple Inc

Cambridge, MA · 50 days ago · $132,100–$244,600
Python, CI/CD, Apache Beam, Apache Spark, Dask, Ray, Kubernetes, AWS, PostgreSQL, MongoDB, Git, Jenkins, Prometheus, Grafana, Docker, Terraform, GitHub, Swagger/OpenAPI

Sr. Software Engineer - Data, Siri Speech

Apple Inc

Cupertino, CA · 24 days ago · $147,400–$272,100
Python, CI/CD, Apache Beam, Apache Spark, Dask, Ray, PostgreSQL, Kubernetes, AWS, Google Cloud Platform, Azure, Terraform, Git, Jenkins, Docker, Prometheus, Grafana