Sr Machine Learning Engineer, Tech Lead — Autograder Systems, Evaluation

Cupertino, California, USA
Posted 5 days ago

$181,100 - $318,400/year

Role Details

In this role you will focus on:

Technical Leadership

* Define and drive the technical roadmap for autograder quality, researching and introducing novel methods such as reward modeling, LLM-as-judge, preference learning, and calibration techniques to measurably improve evaluation accuracy.
* Architect and lead the build-out of a scalable autograder training pipeline encompassing data curation, model fine-tuning, evaluation harnesses, and versioning.
* Design and own the hillclimbing system that iteratively improves autograder performance through systematic prompt and model optimization loops.
* Establish quality benchmarks, confidence metrics, and failure analysis frameworks that enable the team to track, trust, and act on autograder outputs.

People & Collaboration

* Mentor and technically guide a team of MLEs through design reviews, modeling standards, and hands-on problem-solving, fostering a culture of rigor and continuous learning.
* Partner with data annotation teams to define labeling guidelines that feed autograder training.
* Collaborate with feature engineers to align autograder signals with broader training and product objectives.
* Translate complex technical trade-offs into clear narratives for engineering, product, and leadership audiences.

Master's or PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field. 5+ years of industry experience in machine learning, with a strong focus on LLM or VLM systems. Deep expertise in prompt-tuning and fine-tuning techniques (SFT, RLHF, DPO, or equivalent), with proven experience in model calibration and uncertainty estimation. Familiarity with data flywheel design: leveraging model outputs to continuously improve future training data. Proficiency in Python and ML frameworks (PyTorch preferred). Strong ML systems instincts: you care deeply about data quality, reproducibility, latency, and scale.

Background in human-in-the-loop annotation pipelines and inter-annotator agreement analysis. Prior experience on an evaluation infrastructure or model quality team.

About Apple Inc

Apple Inc. is a multinational technology company known for designing and manufacturing consumer electronics, software, and online services, including the iPhone, Mac, iPad, and App Store.

Industry: Consumer Electronics & Software
