Manager, Site Reliability Engineering

Okta Inc

Quick summary

Work type: Hybrid
Location: San Francisco, CA
Salary: $204,000–$281,000 / yr
Posted: 37 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles (midpoint): about $192k
This role (midpoint): about $242k
Chart range: $130k to $297k, with a shaded band marking where most similar roles pay.

This role pays more than 87% of similar roles. Most pay $163,165–$219,875 — the shaded band above. At the midpoint, this role pays about $242k versus about $192k for comparable roles.

Based on 239 similar postings.

Employer

About Okta Inc

Okta, Inc. is an American identity and access management company based in San Francisco. It provides cloud software that helps companies manage and secure user authentication into applications, and helps developers build identity controls into applications, websites, web services, and devices.

Okta Inc currently has 156 open roles on FindRole.

Listed pay typically runs $186,000–$253,000 across 156 roles with salary data.

At a glance

TL;DR · Manager, Site Reliability Engineering

As the Manager of Infrastructure Platform and Shared Services within Okta's Site Reliability Engineering group in San Francisco, you will lead a team responsible for maintaining high availability and reliability across Okta's Identity-as-a-Service (IDaaS) platform. Day to day, you will drive microservice adoption, strengthen DevOps practices, and develop robust self-healing patterns that support the platform's scalability. You will also mentor engineers, manage service expectations, and ensure compliance with industry best practices. The role requires expertise in cloud-native architectures, Kubernetes, Terraform for infrastructure as code, and CI/CD pipelines, along with hands-on experience in software development and observability tools such as Grafana and Splunk. Okta authenticates millions of users daily on AWS, so the role centers on reliable, efficient, and cost-effective infrastructure at scale.

Skills

AWS, Kubernetes, Terraform, CI/CD, Grafana, Splunk, APM, Agile, DevOps

What you'll do

Manage a team of SREs supporting various workloads and teams for the IDaaS platform.
Drive the microservice journey and DevOps maturity to improve workload reliability.
Develop powerful tooling, intuitive self-service capabilities, and robust self-healing patterns.
Lead and mentor high-performing engineers and managers across multiple domains.
Improve SDLC processes for Cloud infrastructure as code and CI/CD pipelines.
Maintain deep knowledge of industry best practices, trends, and technologies.

What we're looking for

3+ years of experience in technical leadership and people management
Extensive experience using Agile and DevOps methodologies for large-scale infrastructure
Strong expertise in cloud-native architectures, Kubernetes, Terraform, and CI/CD pipelines
Deep hands-on experience with software development, PaaS, and automation
Experience running large-scale infrastructure platforms supporting SaaS/Cloud services on AWS
Effective verbal and written communication skills and strong interpersonal abilities
Computer science degree or equivalent technical experience

Similar roles

Senior Manager, Site Reliability Engineering

Okta Inc

DC · 31 days ago · $207,000–$284,900

AWS, Kubernetes, Terraform, CI/CD, Grafana, Splunk, APM, Agile, DevOps, Python, Go, Docker, Prometheus

Hybrid

Senior Manager, Site Reliability Engineering

Oracle

Reston, VA +2 · 50 days ago

Kubernetes, Docker, CI/CD, AWS, Python, PostgreSQL, Prometheus, Grafana, Terraform, Git, Ansible, Nginx, Linux, RESTful APIs, JSON, YAML, SLO, SLA, DevOps, Scalability

Director of Site Reliability Engineering

JPMorgan Chase

Palo Alto, CA · 8 days ago · $204,250–$285,000

AI, Python, Kubernetes, Terraform, CI/CD, PostgreSQL, Docker, Prometheus, Grafana, AWS, GitOps

Principal Site Reliability Engineering Manager

Microsoft

71 days ago · $142,800–$274,800

Azure, Kubernetes, Docker, CI/CD, Prometheus, Grafana, Python, Go, PostgreSQL, Terraform, AWS, GitOps, SLOs, SLIs, Observability, Metrics to Logs Tracing, Blameless Post-Incident Reviews, Self-Healing Systems, Safe Rollouts, Automated Remediation

Senior Manager of Site Reliability Engineering

JPMorgan Chase

Jersey City, NJ · 11 days ago · $171,000–$260,000

Python, Java, Spring Boot, .NET, Jenkins, GitLab, Terraform, Kubernetes, Docker, ECS, CI/CD, AI, Data Fluency, Post-mortem Analysis, Blameless Culture, Security Controls, Resiliency Practices

Senior Manager of Site Reliability Engineering

JPMorgan Chase

Houston, TX · 11 days ago

Python, Jenkins, GitLab, Terraform, Kubernetes, Docker, ECS, CI/CD, Post-mortem, AI, Traceability, Auditability, Resiliency, Security, Site Reliability Engineering
