Senior Site Reliability Engineer, Core AI Infrastructure

Coinbase

Remote

Quick summary

Work type: Remote
Location: Oakland, CA
Salary: $186,065–$218,900 / yr
Posted: 9 days ago

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $154k to $269k. Similar roles: about $203k. This role: about $202k. The shaded band marks where most similar roles pay.]

About 52% of similar roles pay more than this one. Most pay $168,450–$236,900, the shaded band above. At the midpoint, this role pays about $202k, versus about $203k for comparable roles.

Based on 240 similar postings.

Employer

About Coinbase

Coinbase Global is a publicly traded cryptocurrency exchange platform where consumers can buy, sell, and store digital currencies, including Bitcoin, Ethereum, and hundreds of other cryptocurrencies.

Industry: Cryptocurrency Exchange & Financial Technology

Coinbase currently has 59 open roles on FindRole.

Listed pay typically runs $193,970–$228,200 across 53 roles with salary data.

At a glance

TL;DR · Senior Site Reliability Engineer, Core AI Infrastructure

As a Senior Site Reliability Engineer on Coinbase’s IT Operations team, you will ensure the reliability and automation of critical AI infrastructure. Day to day, you will own the monitoring and incident response lifecycle for AI services, build automation tools to streamline workflows, and partner with teams such as Security to integrate surveillance tooling into deployment pipelines. You will also strengthen observability by defining metrics and implementing monitoring solutions, and develop full-stack applications in Go or Python. The role requires 5+ years of experience with cloud infrastructure (AWS) and network environments, proficiency in a scripting language such as Python or Bash, and hands-on experience with tools such as Terraform and Kubernetes. It offers direct exposure to senior leadership at a fast-paced, high-growth company focused on AI transformation at scale.

What you'll do

  • Own the reliability and monitoring of AI infrastructure services, including on-call support.
  • Build automation and tooling to streamline operational workflows in CI/CD frameworks.
  • Extend CI/CD frameworks for IT services and integrate surveillance tools with Security.
  • Strengthen observability by defining metrics and implementing monitoring solutions.
  • Develop full-stack applications that power internal AI products using Go or Python.

What we're looking for

  • 5+ years of experience automating and supporting cloud infrastructure (AWS) with tools like Terraform.
  • Proven expertise in deploying, managing, and troubleshooting containerized workloads using Docker and Kubernetes.
  • Proficiency in scripting or programming languages such as Python, Bash, Ruby, or Go for automation tasks.
  • Strong track record of leading incident response and improving reliability in environments with strict SLAs.
  • Experience building automation and tooling to streamline operational workflows across CI/CD frameworks.

More like this

Similar roles

Senior AI Site Reliability Engineer

Oracle

US · 20 days ago
AWS, Azure, OCI, Kubernetes, Terraform, Python, Java, Go, Docker, Prometheus, Grafana, CI/CD, Vertica, Snowflake, Tableau, Power BI, Oracle Analytics, LangChain, AutoGPT, Jenkins

Senior Site Reliability Engineer, AIOPs

Nvidia

Santa Clara, CA · 37 days ago · $148,000–$235,750
Kubernetes, Terraform, Python, Helm, CI/CD, Docker, Prometheus, Grafana, Bash, Linux, Networking, Apache Kafka, Pulsar, Flink, Spark, ClickHouse, Elasticsearch, TimescaleDB, AWS, Azure, Google Cloud Platform

Senior AI Site Reliability Developer 3

Oracle

US · 5 days ago
Python, Java, Go, Terraform, Docker, Kubernetes, AWS, Azure, OCI, Vertica, Tableau, Power BI, Oracle Analytics, LangChain, AutoGPT, CI/CD, Multi-cloud, GenAI, LLMs

Senior Staff Engineer, AI Infrastructure

Samsung Electronics

Remote (3655 N 1st St, San Jose, CA, USA) · 20 days ago · $180,200–$297,200
Kubernetes, Docker, Slurm, LSF, Python, C++, PyTorch, TensorFlow, MLOps, Data pipelines, Large-scale data processing, Storage architectures, Pipeline orchestration, PCIe, NVLink, InfiniBand, GPU physical design, Design verification, CAD environments