Staff Observability Platform Engineer (SRE)

CVS Health

Remote Actively hiring
Scottsdale, USA · Remote, USA Posted 18 days ago $118,450–$236,900 / year

At a glance

AI generated

TL;DR

As a Lead Platform Reliability Engineer at CVS Health PBM, you will join a dynamic and innovative team to enhance the reliability and performance of cloud-based systems through the design and implementation of metrics frameworks, observability solutions, and automated quality gates. Your daily tasks include defining SLOs and SLIs, managing error budgets, and using tools like Prometheus, Grafana, Loki, and Tempo for real-time monitoring. You will also architect scalable cloud infrastructure using Docker, Kubernetes, and Argo CD while ensuring compliance with public cloud platforms such as AWS or GCP. With a strong background in Java or Python, OpenTelemetry, and distributed data pipelines, you will drive automation initiatives within the release engineering process and lead incident management efforts to ensure continuous system health and performance optimization.

Skills

Prometheus Grafana Kubernetes AWS Python Java OpenTelemetry PostgreSQL Docker CI/CD Terraform MySQL Loki Tempo

What you'll do

  • Define and maintain key performance metrics, SLOs, and SLIs to measure system reliability and performance.
  • Manage error budgets effectively by analyzing incidents and outages to inform adjustments.
  • Design comprehensive monitoring solutions using tools like Prometheus, Grafana, and Loki for real-time visibility.
  • Architect scalable cloud infrastructure supporting multiple business applications for future growth.
  • Develop automated quality gates ensuring all releases meet defined reliability and performance standards.
  • Assist in incident response efforts by providing insights from metrics and monitoring tools.
  • Conduct post-mortem analyses to identify root causes and recommend preventive measures.

What we're looking for

  • 10+ years of experience in Software Engineering, Platform Engineering, or SRE.
  • 7+ years of expertise in observability practices including SLIs/SLOs/SLAs and incident management.
  • 7+ years building production-grade backend services using Java or Python.
  • 7+ years implementing OpenTelemetry and operating cloud-native platforms like Docker and Kubernetes.
  • 5+ years designing and scaling distributed, high-volume data pipelines with relational databases.

Market check

Salary context

This $118,450–$236,900 range sits above 35% of similar postings on FindRole.

Peer median band

$143,000–$251,750

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$165,187–$217,725

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About CVS Health

CVS Health is a leading American healthcare company operating retail pharmacies, pharmacy benefit management services, and a health insurance segment through Aetna, one of the nation's largest health insurers. Industry: Healthcare & Pharmacy

CVS Health currently has 89 open roles on FindRole.

Listed pay typically runs $118,450–$260,590 across 86 roles with salary data.

More like this

Similar roles

Staff Observability Data Infrastructure Engineer

CVS Health

Remote (Work At Home-Maryland, US) 51 days ago $130,295–$260,590
Databricks OpenTelemetry Cribl Vector Jenkins GitHub Actions Delta Lake Apache Spark AWS Azure GCP SQL Python Splunk Datadog Elastic Kafka Terraform CI/CD Kubernetes

Staff, Software Engineer (SRE)

Walmart

(Usa) Vizio Services Denver Co Denver Home Office, US 60 days ago $121,000–$242,000
Terraform GitHub Actions Jenkins Liquibase Flyway IaC CI/CD Docker Kubernetes AWS Python PostgreSQL SQL Git Ansible Prometheus Grafana

SRE Systems Engineer

Salesforce

Remote (Virginia - Washington Dc Metro - Remote, US) 10 days ago
AWS Python Unix TCP/IP ITIL Kubernetes Jenkins Puppet Go CI/CD Docker Prometheus Grafana Chef Spinnaker Linux+ Red_Hat_certifications

Senior Staff Engineer – Performance and Observability

GEICO

Remote (Tx Dallas Greenville Office, US) 9 days ago $110,000–$260,000
OpenStack Python Ansible Kubernetes Docker Grafana Prometheus CI/CD Jenkins GitHub Actions ArgoCD Terraform AWS GCP Azure QEMU libvirt

Sr Staff Software Engineer - Control Systems

GE Aerospace

Evendale, US 18 days ago
Python MATLAB Simulink NPSS AWS Kubernetes Git CI/CD PostgreSQL Docker Terraform JSON YAML REST SCADA PLC RTOS FPGA DSP DO-178C Model-Based Design

Software Engineer - SRE

General Motors (GM)

Remote (Austin Technical Center - Austin Technical Center, US) 36 days ago
PostgreSQL Python Terraform Kubernetes CI/CD Prometheus OpenTelemetry AWS Git Docker Oracle SQL Server Azure GCP FiveTran GoldenGate Cosmos NoSQL