Staff Site Reliability Engineer - Observability | Okta

Bellevue, WA · Chicago, IL · New York, NY · San Francisco, CA · Washington, DC
Posted 85 days ago · $194,000–$267,000 / year

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $120k to $283k. Similar roles sit at about $184k and this role at about $230k, with a shaded band marking where most similar roles pay.]

This role pays more than 85% of similar roles. Most pay $155,500–$211,725 — the shaded band above. At the midpoint, this role pays about $230k versus about $184k for comparable roles.

Based on 239 similar postings.

Employer

About Okta Inc

Okta, Inc. is an American identity and access management company based in San Francisco. It provides cloud software that helps companies manage and secure user authentication into applications, and that lets developers build identity controls into applications, websites, web services, and devices.

Okta Inc currently has 145 open roles on FindRole.

Listed pay typically runs $194,000–$267,000 across 145 roles with salary data.

At a glance

TL;DR

As a Site Reliability Engineer specializing in Observability for Google Cloud, you will join our dedicated team to enhance and scale our Observability ecosystem on GCP. Your daily tasks include designing and automating scalable observability infrastructure using Terraform, optimizing data collection and processing for Splunk and Grafana services, participating in incident response rotations, and eliminating manual tasks through automation. You must have extensive experience with Google Kubernetes Engine (GKE) and expertise in creating actionable dashboards with Splunk or Grafana. Additionally, you should possess strong coding skills in Python or Go, a deep understanding of distributed systems, and a data-driven approach to problem-solving. Familiarity with OpenTelemetry and Grafana Loki is beneficial, as is experience managing observability tools on AWS.

Skills

Google Cloud · Terraform · Go · Python · Ruby · Splunk · Grafana · Kubernetes · Linux · TCP/IP · DNS · Load Balancing · OpenTelemetry · Grafana Loki · AWS

What you'll do

Design, build, and maintain scalable observability infrastructure using Terraform.
Optimize collection, processing, and storage of Observability data in GCP.
Participate in on-call rotations and lead post-incident reviews for continuous improvement.
Automate deployment and scaling of observability agents and collectors to reduce manual effort.
Create intuitive Splunk or Grafana dashboards correlating data from multiple sources.

What we're looking for

5+ years of experience scaling and managing observability in Google Cloud Platform.
Expertise in creating intuitive, actionable Splunk or Grafana dashboards correlating data from multiple sources.
At least 3 years of SRE, DevOps, or Systems Engineering experience focusing on high-availability systems.
Strong coding skills in Python or Go for building internal tools and automating workflows.
Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and Kubernetes/GKE orchestration.
Experience with Terraform to design, build, and maintain scalable observability infrastructure.

Similar roles

Staff Site Reliability Engineer - Observability | Okta

Okta Inc

Bellevue, WA · 98 days ago · $194,000–$267,000

Splunk · Terraform · Go · Python · Ruby · SPL · Kubernetes · AWS · GCP · Linux · TCP/IP · DNS · OpenTelemetry · Docker · CI/CD

Principal Site Reliability Engineer, Infrastructure Observability

T. Rowe Price

Owings Mills, MD · 77 days ago · $159,000–$272,000

AWS · Python · PostgreSQL · CI/CD · Prometheus · Grafana · Terraform · Ansible · New Relic · SolarWinds DPA · Elastic Stack · Splunk · DevOps · SRE · Chaos Engineering · SQL Server · Node.js · .Net Core · Java · Go

Hybrid

Site Reliability Engineer

Autodesk

Atlanta, GA · 14 days ago · $117,000–$209,330

AWS · Kubernetes · Terraform · Python · Linux · Bash · Docker · CI/CD · Jenkins · Git · CloudWatch · Splunk · Dynatrace · New Relic · Grafana · PostgreSQL · MySQL · MSSQL · EC2 · ECS · EKS · Lambda · ELB · S3 · IAM · VPC · DynamoDB · RDS

Staff Site Reliability Engineer, Core IDaaS w/ active TS/SCI | Okta

Okta Inc

DC · 128 days ago · $188,000–$258,500

AWS · Terraform · Helm · Go · Kubernetes · FedRAMP · Impact Level 6 (IL6) · Snowflake · Redshift · Databricks · CI/CD · Security Clearance · Site Reliability Engineering

Staff Site Reliability Engineer - Kubernetes | Okta

Okta Inc

Bellevue, WA · 73 days ago · $194,000–$267,000

Kubernetes · AWS · Helm · Terraform · CI/CD · Istio · Prometheus · Grafana · CloudWatch · Python · Bash · Go · Jenkins · GitLab · Ansible · Spinnaker · Docker · RDS · S3 · VPC · IAM · EC2

Principal Site Reliability Engineer - Observability and Telemetry Platform

Nvidia

Remote (Santa Clara, CA) · 12 days ago · $248,000–$396,750

Kubernetes · Python · Go · Docker · Grafana · OpenTelemetry · Prometheus · Linux · Networking · Containers · CI/CD · Terraform · AWS · Azure · Google Cloud Platform · PostgreSQL · MySQL · Ansible · SaltStack · Bash · Git · Jenkins

Remote
