Staff Research Engineer, Post-training & Evaluation

Remote

Quick summary

Work type: Remote
Location: Remote
Salary: $230,000–$322,000 / yr
Posted: 3 days ago

Market check

Salary context

Above market

How this pay compares to similar roles

Pay scale: $144k to $341k. Similar roles: about $205k. This role: about $276k.

This role pays more than 94% of similar roles, most of which pay $174,200–$235,625. At the midpoint, this role pays about $276k versus about $205k for comparable roles.

Based on 240 similar postings.

Employer

About Reddit

Reddit is a social news aggregation and discussion platform where users share content, vote on posts, and engage in community conversations across thousands of interest-based forums called subreddits.

Reddit currently has 94 open roles on FindRole.

Listed pay typically runs $217,000–$303,900 across 65 roles with salary data.

At a glance

TL;DR · Staff Research Engineer, Post-training & Evaluation

As a Staff Research Engineer for Post-Training & Evaluation Science at Reddit, you will join the AI Engineering team to develop and refine foundational Large Language Models (LLMs) that understand Reddit's unique culture. You will define the "Reddit Benchmark" evaluation standard, keep model evaluations reliable, design post-training recipes, and partner with Safety Engineering to translate policies into concrete metrics. You will work extensively with Python, Hugging Face Transformers, vLLM, and lm-eval-harness, contribute to synthetic data generation strategy, and diagnose post-training instability. The role requires deep expertise in evaluation reliability, experience building custom domain-specific evaluation harnesses, and a thorough understanding of LLM post-training; it calls for 6+ years of professional ML experience or a PhD plus 4 years in a related field.

Skills

Python, Hugging Face Transformers, lm-eval-harness, PyTorch, FSDP2, DeepSpeed ZeRO-3, MMLU, GSM8K, LightEval, CI/CD, vLLM, Axolotl, TorchTune, TorchTitan

What you'll do

Define the "Reddit Benchmark" evaluation standard for model quality.
Ensure evaluation reliability and statistical rigor in benchmarking models.
Design methodologies for automated model-as-a-judge evaluations.
Set post-training recipes to convert base models into high-performing endpoints.
Evaluate base and continued pre-training (CPT) checkpoints to select optimal starting points.
Drive the strategy for generating synthetic data to improve model generalization.

What we're looking for

6+ years of professional ML experience, or a PhD plus 4 years in a related field.
Deep expertise in evaluation reliability and statistical rigor for automated evaluations.
Strong experience building custom, domain-specific evaluation harnesses.
Experience evaluating both generation and representation/classification models.
Fluency in Python with strong data-pipeline and eval-harness engineering skills.

Similar roles

Applied Research Engineer

Salesforce

Remote (San Francisco, CA) +4 · 8 days ago · $148,500–$260,100

Python, AWS, Kubernetes, Linux, React, GCP, CI/CD, Docker, Prometheus, PostgreSQL, Git, Jenkins, Terraform, GraphQL, Redis, MongoDB, Security principles, UI design sensibilities

Remote

Applied Research Engineer

Salesforce

Remote (San Francisco, CA) +4 · 7 days ago · $197,300–$313,700

Python, AWS, Kubernetes, Linux, React, GCP, CI/CD, Docker, Prometheus, PostgreSQL, Git, Jenkins, Terraform, GraphQL, Redis, MongoDB, Security, UI design sensibilities

Remote

System Performance Engineer, Staff

Qualcomm

San Diego, CA · 47 days ago · $148,300–$222,500

Python, C/C++, ARM v8, ARM v9, Vulkan, OpenGL, DX12, CUDA, SMMU, GIC, Coresight-PMU, Linux, Windows, Android, Memory hierarchy, System interconnects, Power management stacks, Scheduler behavior, Performance analysis tools, CI/CD

Staff Applications Engineer

Broadcom

San Jose, CA · 124 days ago · $120,000–$192,000

Python, JavaScript, C, LLM APIs, Vertex AI, Dialogflow CX, BigQuery, CI/CD, Docker, Kubernetes, Terraform, PostgreSQL, Networking Protocols, ASIC Development, SDK Development

Staff Implementation Engineer

Arm Holdings

Austin, TX · 3 days ago · $198,100–$268,000

Python, C, Tcl, git, EDA tools, RC analysis, STA, PDN analysis, multi-die SoC design flows, IR-PDN-Thermal bottlenecks, large scale automation, version control systems, distributed processing, disk management, PVT analysis, Vdrop analysis, thermal aware methodology

Hybrid

Senior Staff Engineer

GEICO

Remote (Bethesda, MD) · 34 days ago · $115,000–$260,000

Go, Python, .NET, SQL, NoSQL, Kubernetes, AWS, GCP, Azure, Terraform, Puppet, Chef, Ansible, CI/CD, DevOps, Docker, Prometheus, Grafana, Git, Jenkins

Remote
