Staff Machine Learning Systems Engineer

Reddit

Remote

Quick summary

Work type
Remote
Location
Remote
Salary
$230,000–$322,000 / yr
Posted
today

Market check

Salary context

Above market

How this pay compares to similar roles

Similar $221k
This role $276k
$162k most similar roles pay here $339k

This role pays more than 83% of similar roles. Most pay $182,368–$259,212 — the shaded band above. At the midpoint, this role pays about $276k versus about $221k for comparable roles.

Based on 240 similar postings.

Employer

About Reddit

Reddit is a social news aggregation and discussion platform where users share content, vote on posts, and engage in community conversations across thousands of interest-based forums called subreddits.

Reddit currently has 72 open roles on FindRole.

Listed pay typically runs $217,000–$303,900 across 66 roles with salary data.

Most-posted roles

View all roles at Reddit

At a glance

TL;DR · Staff Machine Learning Systems Engineer

As a Staff ML Infrastructure Engineer on Reddit’s Machine Learning Platform team, you will lead the development of large-scale machine learning platforms and tools that enhance model lifecycle management for teams across the organization. Your responsibilities include designing MLOps patterns to streamline model training and deployment, developing graph ML codebases, optimizing batch data processing with Apache Beam, Spark, and Ray Data, and architecting pipelines for massive graph data structures. You will need extensive experience in cloud-based technologies like GCP BigQuery, Google Cloud Storage, and Terraform, as well as proficiency in Python, PyTorch, TensorFlow, and distributed training frameworks such as Ray and Kubernetes. Additionally, familiarity with graph databases (Neo4j) and neural networks (GNNs) is highly beneficial for tackling the complex data challenges at Reddit’s scale.

What you'll do

  • Design end-to-end model lifecycle patterns to boost ML engineer development velocity.
  • Develop and support a graph ML codebase for greater model scalability and iteration.
  • Collaborate on performance tuning of models in a large distributed training environment.
  • Optimize batch data processing within a data warehouse using Apache Beam, Spark, etc.
  • Architect pipelines to build massive graph data structures with billions of nodes.

What we're looking for

  • 8+ years of experience in ML infrastructure and model deployments.
  • Hands-on expertise with cloud-based technologies for ML platforms like GCP BigQuery, Google Cloud Storage, Terraform.
  • Deep experience with MLOps tools such as MLflow or Wandb for experiment tracking and model serving.
  • Proficiency in distributed training frameworks including Ray and Kubernetes.
  • Strong skills in common programming languages and frameworks of ML, such as Python, PyTorch, TensorFlow.

More like this

Similar roles

Staff Machine Learning Engineer

Intuit

Mountain View, CA 52 days ago $202,500$274,000
Python Scikit-learn NLTK NumPy Pandas TensorFlow Keras R Spark SQL Git AWS GCP CI/CD

Staff Machine Learning Engineer

Arm Holdings

Austin, TX 52 days ago $249,900$338,100
Python TensorFlow PyTorch GPU ARM ML Model Optimization Deep Learning Computer Architecture CI/CD
Hybrid

Staff Machine Learning Engineer

Intuit

Mountain View, California 48 days ago $197,000$266,500
Python Scikit-learn NLTK NumPy Pandas TensorFlow Keras R Spark SQL Git AWS GCP CUDA cuDNN

Staff Machine Learning Engineer

PayPal

San Jose, CA 22 days ago $196,500$291,500
TensorFlow PyTorch scikit-learn AWS Azure GCP BERT GPT T5 reinforcement_learning GCN GraphSAGE GAT semi_supervised_learning self_supervised_learning unsupervised_representation_learning causal_inference anomaly_detection incremental_learning synthetic_data_generation fraud_detection risk_modeling
Hybrid