Member of Technical Staff, High Performance Computing Engineer - MAI SuperIntelligence Team | Microsoft Careers

Microsoft

Quick summary

Work type: Hybrid
Location: Mountain View, CA
Salary: $139,900–$274,800 / yr
Posted: 113 days ago
Closes: Aug 10, 2026
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

Chart: midpoint for similar roles $193k; midpoint for this role $207k; axis from $124k to $291k, with a shaded band marking where most similar roles pay.

This role pays more than 68% of similar roles. Most pay $165,000–$221,725 — the shaded band above. At the midpoint, this role pays about $207k versus about $193k for comparable roles.

Based on 240 similar postings.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services including Windows, Office 365, Azure cloud platform, Xbox gaming, and Surface devices. Industry: Software & Cloud Computing

Microsoft currently has 571 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 522 roles with salary data.

At a glance

TL;DR · Member of Technical Staff, High Performance Computing Engineer - MAI SuperIntelligence Team | Microsoft Careers

Microsoft AI seeks a Member of Technical Staff, High Performance Computing Engineer, to join the team that scales the infrastructure used to train advanced models like Copilot. The role involves designing, operating, and maintaining large-scale HPC environments, managing schedulers such as SLURM and Kubernetes, and ensuring efficient job scheduling on massive clusters. Engineers develop automation tools in Bash or Python, collaborate with researchers to support complex workloads, and troubleshoot issues independently in a fast-paced environment. The ideal candidate has extensive experience deploying high-performance clusters on-premises or in the cloud, working with frameworks like SLURM and Kubernetes, and building scalable services on public clouds such as Azure, AWS, or GCP.

Skills

Kubernetes SLURM Python Bash AWS Azure GCP NVIDIA InfiniBand Ray LLM training clusters NVIDIA H100/GB200

What you'll do

Design and maintain large-scale HPC environments for training advanced AI models.
Ensure reliable job scheduling by deploying and configuring HPC schedulers at scale.
Serve as a technical expert for core HPC domains like GPU compute or high-performance storage.
Develop automation tools using Bash and Python to enhance cluster reliability and efficiency.
Troubleshoot complex issues related to cluster usage and performance tuning of massive clusters.

What we're looking for

Bachelor’s degree in computer science or a related field and 4+ years of experience deploying or operating high-performance clusters.
Extensive hands-on experience with high-scale training clusters, including SLURM, Kubernetes, Ray, and NVIDIA InfiniBand networking.
Proven track record of building scalable services on public cloud infrastructure such as Azure, AWS, or GCP.
Experience designing, maintaining, and troubleshooting large-scale HPC environments in production settings.
Strong automation skills with Bash and/or Python for improving cluster reliability and operational efficiency.

Similar roles

Member of Technical Staff, Compute Orchestration & Scheduling - MAI Superintelligence Team | Microsoft Careers

Microsoft

US · 177 days ago · $139,900–$274,800

Kubernetes Ray Python CI/CD Docker Prometheus Grafana AWS Azure Git PostgreSQL MLOps RHEL Ubuntu

Hybrid

Member of Technical Staff, Capacity & Efficiency Infrastructure - MAI Superintelligence Team | Microsoft Careers

Microsoft

Mountain View, CA · 75 days ago · $119,800–$234,700

Python C++ CUDA PyTorch JAX NCCL InfiniBand NVLink Distributed_training_parallelism GPU_architectures Profiling_and_benchmarking Telemetry_systems High_performance_computing Large_scale_AI_infrastructure

Hybrid

Member of Technical Staff, Software Co-Design AI HPC Systems - MAI Superintelligence Team | Microsoft Careers

Microsoft

US · 111 days ago · $139,900–$274,800

Python C/C++ CUDA Distributed Systems HPC ML Systems Runtimes Compilers Performance Modeling Benchmarking Systems Analysis Hardware-Silicon Co-Design AI Accelerators GPU Architectures NCCL MPI RDMA InfiniBand CI/CD

Hybrid

Member of Technical Staff - Data Research Engineer - MAI Superintelligence Team | Microsoft Careers

Microsoft

US · 177 days ago · $119,800–$234,700

Python Pandas NumPy Spark Ray Apache_Beam SQL CI/CD Git Jupyter_Notebook TensorFlow PyTorch PostgreSQL MongoDB Docker Kubernetes AWS Google_Cloud_Platform Azure GitHub

Hybrid

Member of Technical Staff, Site Reliability Engineer (HPC) - MAI SuperIntelligence Team | Microsoft Careers

Microsoft

Mountain View, CA · 106 days ago · $139,900–$274,800

Kubernetes Docker CI/CD AWS Azure GCP Terraform Python Go Bash Grafana Datadog OpenTelemetry Networking Storage GPU High-Performance Computing(HPC) Capacity Planning Cost Optimization

Hybrid

Member of Technical Staff - Software Engineer (SuperIntelligence team) | Microsoft Careers

Microsoft

US · 146 days ago · $119,800–$234,700

Python Kubernetes Azure Terraform Helm Prometheus Grafana OpenTelemetry Docker CI/CD Airflow Argo
