Site Reliability Engineer

SpaceX

Quick summary

Work type: On-site
Location: Hawthorne, CA
Salary: $145,000–$175,000 / yr
Posted: 7 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Pay comparison chart: this role $160k, similar roles $181k, on a scale from $133k to $229k; the shaded band marks where most similar roles pay]

About 66% of similar roles pay more than this one. Most pay $146,900–$214,975 (the shaded band above). At the midpoint, this role pays about $160k versus about $181k for comparable roles.

Based on 240 similar postings.

Employer

About SpaceX

SpaceX designs, manufactures, and launches advanced rockets and spacecraft with the mission of enabling humans to become a multi-planetary species. It operates the Falcon 9, Falcon Heavy, and Starship launch vehicles, as well as the Starlink satellite internet constellation.

SpaceX currently has 641 open roles on FindRole.

Listed pay typically runs $130,000–$160,000 across 476 roles with salary data.

At a glance

TL;DR · Site Reliability Engineer

As a Site Reliability Engineer on the Classified IT Systems Engineering team, you will design and maintain scalable systems that support a growing volume of data products. Your day-to-day tasks include building, maintaining, and scaling on-premises hardware systems that host GPU-accelerated machine learning workloads. You will use Kubernetes, Linux, Python, and virtualization technologies such as hypervisors to ensure high performance and reliability. The role requires a deep understanding of system administration, site reliability engineering, and DevOps practices, along with the ability to learn new tools and frameworks quickly. This position demands an active Top Secret clearance or higher due to the sensitive nature of the work, which involves pushing the boundaries in inferential model benchmarks and managing large-scale server environments.

Skills

Kubernetes
Linux
Python
DevOps
Site Reliability Engineering
Virtualization
Hypervisor technologies
Performance optimization techniques

What you'll do

Build and maintain on-premises hardware systems for GPU-accelerated machine learning workloads.
Design scalable systems to support growing data product volumes.
Automate management of dozens or hundreds of servers in the infrastructure.
Apply performance-tuning techniques to identify and remove system bottlenecks.
Collaborate on the development, testing, and operational support lifecycle of systems.

What we're looking for

Bachelor’s degree in computer science, information systems/IT, or engineering; or equivalent experience.
At least 1 year of experience with Kubernetes and Linux operating systems.
Experience building, maintaining, and scaling on-premises hardware systems for GPU-accelerated machine learning workloads.
Active Top Secret clearance is highly desired.
Knowledge of performance bottlenecks and techniques to improve system performance.