Senior Site Reliability Engineer

Anduril Industries

Quick summary

Work type: On-site
Location: Costa Mesa, CA
Salary: $166,000–$220,000 / yr
Posted: today

Market check

Salary context

Above market

How this pay compares to similar roles

[Chart: pay scale from $120k to $231k, with a shaded band where most similar roles pay. Similar roles: $166k. This role: $193k.]

This role pays more than 78% of similar roles. Most pay $139,100–$193,000 — the shaded band above. At the midpoint, this role pays about $193k versus about $166k for comparable roles.

Based on 239 similar postings.

Employer

About Anduril Industries

Anduril Industries is a defense technology company that builds advanced hardware and software systems for national security, including autonomous drones, surveillance systems, and the Lattice AI command platform.

Anduril Industries currently has 1,882 open roles on FindRole.

Listed pay typically runs $146,000–$194,000 across 1,696 roles with salary data.

At a glance

TL;DR · Senior Site Reliability Engineer

As a Senior Site Reliability Engineer on the Mission Autonomy team, you will play a pivotal role in ensuring the reliability and performance of our autonomous systems, which are critical for national security. Your responsibilities include managing specialized infrastructure, designing fault-tolerant systems, identifying and resolving performance bottlenecks, implementing comprehensive monitoring solutions, automating operational tasks, and collaborating with security teams to integrate best practices. You will work closely with development teams to bridge the gap between DevOps and operations, ensuring seamless integration of software delivery tools and supporting broader Anduril systems. The role requires expertise in Linux, containerization technologies like Docker and Kubernetes, automation tools such as Ansible and Terraform, and a solid understanding of networking fundamentals. Experience in defense or aerospace industries is preferred, along with familiarity with cloud platforms and secure coding practices.

What you'll do

  • Manage and expand specialized on-site infrastructure for developer servers and HITL systems.
  • Design, implement, and maintain highly available fault-tolerant autonomous systems.
  • Identify and eliminate performance bottlenecks to ensure low-latency operations.
  • Develop comprehensive monitoring solutions for system health and behavior insights.
  • Automate operational tasks from provisioning to testing and recovery processes.
  • Integrate security best practices into operational processes and infrastructure.

What we're looking for

  • 5+ years of experience in Site Reliability Engineering or a similar role focused on security for mission-critical applications.
  • Strong proficiency in modern programming languages such as Python and Go, with deep expertise in Linux operating systems.
  • Experience with automation tools like Ansible, Puppet, and Terraform, along with containerization technologies (Docker) and orchestration platforms (Kubernetes).
  • Solid understanding of networking fundamentals including TCP/IP, DNS, HTTP, and load balancing.
  • Excellent analytical, problem-solving, and debugging skills for complex system issues.
  • Active U.S. security clearance required.
  • Strong communication skills and ability to work effectively in cross-functional teams.

More like this

Similar roles

Senior Site Reliability Engineer

CoStar Group

Arlington, VA · 17 days ago
AWS, Kubernetes, Docker, Terraform, CloudFormation, Python, Java, C#, NodeJS, Bash, PCI compliance, REST API, Microservices, CDN, PostgreSQL, MySQL, Azure, Google Cloud, CI/CD
Hybrid

Senior Site Reliability Engineer

Adobe

San Jose · 57 days ago · $208,300–$301,600
AWS, Kubernetes, Terraform, Python, Go, CI/CD, Infrastructure as Code, Docker, PostgreSQL, Security hardening, AI-enabled platforms, Cross-team leadership, Developer experience optimization

Senior Site Reliability Engineer

Carta

San Francisco, California · 61 days ago · $181,688–$213,750
AWS, Terraform, Python, Kubernetes, Docker, Postgres, Prometheus, Grafana, CI/CD, gRPC, Ansible, ELK Stack, Datadog, GraphQL
Hybrid

Senior Site Reliability Engineer

Oracle

Reston, Virginia · 27 days ago
Oracle Linux, Ansible, Terraform, Python, Bash, Prometheus, Grafana, Kubernetes, CI/CD, Git, Active Directory, LDAP, Kerberos, GlusterFS, PostgreSQL, Docker, AWS, Azure, Google Cloud Platform, Nginx, Apache HTTP Server