Careers

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA
Posted: 56 days ago
Nearby: 99+ roles within 25 mi

Market check

Salary context

How this pay compares to similar roles

[Chart: pay for similar roles on a scale from $118k to $247k, with similar roles marked at $180k.]

This listing doesn't post a salary. Most similar roles pay $152,075–$208,800.

Based on 239 similar postings.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 660 open roles on FindRole.

Listed pay typically runs $154,000–$231,000 across 429 roles with salary data.

At a glance

TL;DR

As a Staff MLOps Engineer, you will architect, deploy, and optimize an ML platform that supports model training on NVIDIA DGX clusters and Kubernetes, both on premises and on AWS. You will design scalable infrastructure, work with data scientists and engineers to integrate their workflows, and implement CI/CD pipelines with ArgoCD and Argo Workflows. You will also maintain the Prometheus and Grafana monitoring stack, manage AWS services such as EKS and S3, and keep the platform performing well through logging and troubleshooting. Ideal candidates have a strong background in MLOps, Kubernetes, and GPU clusters, along with expertise in Python, Go, TensorFlow, and PyTorch. Experience with AWS services, CI/CD pipelines, and ML-specific data storage is highly valued.

Skills

Kubernetes, AWS, EKS, Python, Go, Prometheus, Grafana, CI/CD, ArgoCD, Docker, Helm, NVIDIA DGX, TensorFlow, PyTorch, VPC, IAM, S3, EFS, CloudWatch

What you'll do

Architect, develop, and maintain the ML platform for model training and inference.
Design scalable infrastructure solutions for NVIDIA clusters, both on premises and on AWS.
Optimize platform performance by enhancing GPU resource utilization and data ingestion processes.
Implement CI/CD pipelines using ArgoCD and Argo Workflows for automated model deployment.
Maintain the monitoring stack with Prometheus and Grafana to ensure platform health and performance.
Manage AWS services including EKS, EC2, VPC, IAM, S3, and EFS to support ML infrastructure.

What we're looking for

Proven experience as an MLOps Engineer or in a similar role, focusing on large-scale ML and GPU clusters.
Strong expertise in Kubernetes, Helm, ArgoCD, Argo Workflows, Prometheus, and Grafana.
Proficient programming skills in Python and Go, with experience in TensorFlow and PyTorch.
In-depth understanding of distributed computing, parallel computing, and GPU acceleration techniques.
Solid experience with AWS services including EKS, EC2, VPC, IAM, S3, and EFS.

Similar roles

Qualcomm

US 28 days ago

Python, PyTorch, TensorFlow, Keras, AWS, S3, Glue, EMR, Docker, Kubernetes, Kafka, RabbitMQ, Spark, Databricks, Delta Lake, Iceberg, Hudi, SQL, Postgres, Prometheus, Grafana, Datadog, Splunk, CI/CD, MLOps

Qualcomm

US 56 days ago

C Assembly RTOS OS Kernel Zephyr eCos uC/OS FreeRTOS ARM v8 Simulators FPGA Emulation Python

Qualcomm

San Diego, CA +2 74 days ago

Computer Architecture, Memory Systems, RAS, ECC, Encryption, DRAM, LPDDR, HBM, DDRx, GDDR, PIM, Processing Near Memory, 3DIC, Chiplet Architectures, Interconnects, Die-to-Die Protocols, Data Center Requirements, Quantitative Analysis Tools, High-Level Calculators, Spreadsheets, Profiling Tools, Functional Simulators, Performance Simulators

Qualcomm

San Diego, CA +1 63 days ago

EAR, ITAR, OFAC, CCATS, SAP GTS, Amber Road, AI, machine learning, cloud computing, IaaS, PaaS, SaaS, ERP systems, encryption, semiconductor design, data centers, fabrication processes, wireless communications, high performance computing

Qualcomm

San Diego, CA 51 days ago

Python, Django, Celery, RabbitMQ, PostgreSQL, Git, Linux, Shell scripting, SQL, Redux, RTK, Docker, Kubernetes, Apache Airflow, Prefect, React, SPDX, CycloneDX, FOSSID, ScanCode-toolkit

Qualcomm

San Diego, CA +1 22 days ago

Apigee, Kubernetes, OAuth2, JWT, mTLS, OpenAPI, Swagger, AWS, GCP, GitHub Actions, REST, CI/CD, YAML, Helm, Python, JavaScript, Terraform, PostgreSQL, Redis, MongoDB, Docker, Git, Jenkins
