Staff MLOps Engineer
Quick summary
- Work type: On-site
- Location: San Diego, CA
- Posted: 56 days ago
Market check
Salary context
How this pay compares to similar roles
This listing doesn't post a salary. Most similar roles pay $152,075–$208,800.
Based on 239 similar postings.
Employer
About Qualcomm
Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.
Qualcomm currently has 660 open roles on FindRole.
Listed pay typically runs $154,000–$231,000 across 429 roles with salary data.
Most-posted roles
- Careers (221)
- GPU Software Engineer (3)
- Sr Wireless Systems Engineer (3)
- Datacenter Software Program Manager (2)
- Embedded NPU Software Engineer, Senior (2)
At a glance
TL;DR · Staff MLOps Engineer
As a Staff MLOps Engineer, you will architect, deploy, and optimize an ML platform that supports model training on NVIDIA DGX clusters and Kubernetes, both on premises and on AWS. Responsibilities include designing scalable infrastructure, working with data scientists and engineers to integrate their workflows, and implementing CI/CD pipelines with ArgoCD and Argo Workflows. You will also maintain the Prometheus and Grafana monitoring stack, manage AWS services such as EKS and S3, and keep the platform performing well through logging and troubleshooting. Ideal candidates have a strong background in MLOps, Kubernetes, and GPU clusters, plus expertise in Python, Go, TensorFlow, and PyTorch. Experience with AWS services, CI/CD pipelines, and ML-specific data storage is highly valued.
What you'll do
- Architect, develop, and maintain the ML platform for model training and inference.
- Design scalable infrastructure solutions for NVIDIA clusters, both on premises and on AWS.
- Optimize platform performance by enhancing GPU resource utilization and data ingestion processes.
- Implement CI/CD pipelines using ArgoCD and Argo Workflows for automated model deployment.
- Maintain the monitoring stack with Prometheus and Grafana to ensure platform health and performance.
- Manage AWS services including EKS, EC2, VPC, IAM, S3, and EFS to support ML infrastructure.
What we're looking for
- Proven experience as an MLOps Engineer or in a similar role, with a focus on large-scale ML and GPU clusters.
- Strong expertise in Kubernetes, Helm, ArgoCD, Argo Workflows, Prometheus, and Grafana.
- Proficiency in Python and Go, with experience in TensorFlow and PyTorch.
- In-depth understanding of distributed computing, parallel computing, and GPU acceleration techniques.
- Solid experience with AWS services including EKS, EC2, VPC, IAM, S3, and EFS.