Principal Site Reliability Engineer, Infrastructure Observability

T. Rowe Price

Hybrid · Actively hiring

Owings Mills, MD · Colorado · Washington · Posted 76 days ago · $159,000–$272,000 / year

At a glance

AI generated

TL;DR

As a Principal Site Reliability Engineer at T. Rowe Price, you will lead a team focused on enhancing observability and reliability for the company’s cloud and on-prem solutions. Your daily tasks include designing technology solutions to prevent service disruptions, fostering a blameless post-mortem culture, and driving SRE methodologies across operations teams. You will leverage automation and best-of-breed tools like New Relic, Prometheus, and Terraform to ensure system stability and scalability. The role requires extensive experience in cloud environments (AWS preferred), DevOps practices, CI/CD toolchains, and incident response management. Ideal candidates possess deep expertise in programming languages such as Python or Java, database development skills, and the ability to define and track Service Level Objectives. This position demands strategic thinking, independent problem-solving, and strong communication skills to engage with diverse stakeholders across a complex, distributed technology environment.

Skills

AWS, Python, PostgreSQL, CI/CD, Prometheus, Grafana, Terraform, Ansible, New Relic, SolarWinds DPA, Elastic Stack, Splunk, DevOps, SRE, Chaos Engineering, SQL Server, Node.js, .NET Core, Java, Go

What you'll do

Design technology solutions to prevent or minimize service disruptions in cloud environments.
Lead internal change initiatives to adopt SRE methodologies across operations teams.
Analyze incidents for high-level trends and drive strategic growth within Global Technology.
Implement chaos engineering models at scale to improve system resilience and reliability.
Standardize dashboards and tools for observability, APM, and infrastructure monitoring.
Define Service Level Objectives (SLOs) and manage error budgets to track system availability.

What we're looking for

10+ years of experience designing and operating cloud infrastructure with senior-level impact.
Extensive experience building and supporting solutions in AWS and running DevOps/SRE functions.
Demonstrable experience implementing new technology, tools, and platforms, including automation for incident prevention/remediation.
Proficiency with multiple programming languages (Python, Java, Go, Node.js, .NET Core) and database development (SQL Server, PostgreSQL, MySQL).
Knowledge of observability and cloud management tools (New Relic, SolarWinds DPA, Elastic Stack, Prometheus, Grafana, Splunk, Ansible, Terraform, Vault, Vagrant).

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles: $175k · This role: $216k · Chart scale: $113k to $289k, with a shaded band marking where most similar roles pay

This role pays more than 76% of similar roles. Most similar roles pay $138,446–$212,125 (the shaded band above). At the midpoint, this role pays about $216k, versus about $175k for comparable roles.

Based on 239 similar postings.

Employer

About T. Rowe Price

T. Rowe Price is an asset management firm focused on delivering global investment management excellence and retirement services.

T. Rowe Price currently has 20 open roles on FindRole.

Listed pay typically runs $133,000–$226,500 across 20 roles with salary data.

Similar roles

Site Reliability Engineer

Autodesk

Atlanta, GA · 13 days ago · $117,000–$209,330

AWS, Kubernetes, Terraform, Python, Linux, Bash, Docker, CI/CD, Jenkins, Git, CloudWatch, Splunk, Dynatrace, New Relic, Grafana, PostgreSQL, MySQL, MSSQL, EC2, ECS, EKS, Lambda, ELB, S3, IAM, VPC, DynamoDB, RDS

Site Reliability Engineer

Booz Allen Hamilton

Herndon, VA · 37 days ago · $86,800–$198,000

Java, Spring Boot, CI/CD, Agile, Bitbucket, GitLab, Kubernetes, NiFi, Kafka, MongoDB, Elasticsearch, ArgoCD

Principal Site Reliability Engineer

The Walt Disney Company

Remote (USA - FL - Disney's Hollywood Studios - Feature Animation Building, US) · 54 days ago

AWS, Azure, GCP, Terraform, CloudFormation, Ansible, Chef, CI/CD, Docker, Kubernetes, Prometheus, Grafana, Python, Linux, Windows, AI, LLM, PCI, DevOps, SRE, SLI, SLO, SLA

Principal Site Reliability Engineer

The Walt Disney Company

Remote (Bay Lake, FL) · 47 days ago

Akamai, Kona Site Defender, WAF, Bot Manager, DevOps, CI/CD, Python, Go, Docker, Terraform, AWS, Azure, Google Cloud, PostgreSQL, MongoDB, Redis, Prometheus, Grafana, Kubernetes, Ansible, Jenkins, GitLab, GitHub

Principal Site Reliability Engineer - Observability and Telemetry Platform

Nvidia

Remote (Santa Clara, CA) · 11 days ago · $248,000–$396,750

Kubernetes, Python, Go, Docker, Grafana, OpenTelemetry, Prometheus, Linux, Networking, Containers, CI/CD, Terraform, AWS, Azure, Google Cloud Platform, PostgreSQL, MySQL, Ansible, SaltStack, Bash, Git, Jenkins

Sr Principal Site Reliability Engineer

The Walt Disney Company

Remote (USA - CA - Market St, US) · 57 days ago · $250,500–$335,900

Kubernetes, AWS, CI/CD, Docker, Prometheus, Grafana, Python, PostgreSQL, Terraform, Ansible, GitOps, CDN integration, media streaming technologies, content delivery strategies