Staff Machine Learning Engineer – Model Optimization & Quantization

Qualcomm

Actively hiring
San Diego, CA · Santa Clara, CA · Posted 82 days ago · $158,400–$237,600 / year

At a glance

AI generated

TL;DR

As a Staff Software Engineer on Qualcomm Technologies' AI Hub team, you will develop tools for optimizing and deploying machine learning models on edge devices, with a focus on the AIMET open-source library. Day-to-day work includes designing quantization algorithms, implementing advanced techniques such as weight-only and sub-4-bit quantization, and integrating workflows with PyTorch and ONNX. You will also build developer-friendly APIs and tooling for optimizing models on Qualcomm hardware, and maintain automated pipelines and evaluation harnesses for large-scale deployment via AI Hub. The role requires expertise in Python and deep learning frameworks, a solid understanding of neural network architectures, and experience with model quantization techniques and ARM processors. You will collaborate across teams to drive technical alignment and influence the AIMET product roadmap and cross-business-unit strategy.

Skills

Python · PyTorch · ONNX · AIMET · Git · CI/CD · Prometheus · Grafana · C++ · ARM · Kubernetes · Docker · AWS · TensorFlow · PostgreSQL

What you'll do

  • Design and maintain quantization algorithms within AIMET for edge devices.
  • Implement advanced quantization techniques for large language models and generative AI.
  • Develop tooling to analyze and debug model accuracy issues caused by quantization.
  • Integrate AIMET workflows with PyTorch and ONNX frameworks.
  • Create APIs and developer tools to make AIMET accessible for external users.
  • Quantize and validate a variety of neural network architectures for deployment.
  • Develop automated pipelines for scaling model onboarding across AI Hub's catalog.

What we're looking for

  • Expertise in developing and maintaining model optimization workflows within AIMET.
  • Proficiency in Python with hands-on experience in PyTorch, ONNX, and TensorFlow.
  • Solid understanding of neural network architectures including CNNs, Transformers, LLMs.
  • Experience with advanced quantization techniques such as PTQ, QAT, weight-only quantization.
  • Familiarity with hardware constraints for model deployment on edge devices.
  • Strong written and verbal communication skills to engage with external developers.
  • Proficiency in Git and software development best practices.

Market check

Salary context

This $158,400–$237,600 range sits above 45% of similar postings on FindRole.

Peer median band

$160,600–$260,000

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$176,000–$246,150

Middle half of comparable postings.

Based on 239 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 595 open roles on FindRole.

Listed pay typically runs $148,300–$222,500 across 540 roles with salary data.

More like this

Similar roles

Staff Machine Learning Engineer

Intuit

Mountain View, CA · 49 days ago · $202,500–$274,000
Python Scikit-learn NLTK NumPy Pandas TensorFlow Keras R Spark SQL Git AWS GCP CI/CD

Staff Machine Learning Engineer

Arm Holdings

Austin, TX · 49 days ago · $249,900–$338,100
Python TensorFlow PyTorch GPU ARM ML Model Optimization Deep Learning Computer Architecture CI/CD
Hybrid

Staff Machine Learning Engineer

Intuit

Mountain View, California · 45 days ago · $197,000–$266,500
Python Scikit-learn NLTK NumPy Pandas TensorFlow Keras R Spark SQL Git AWS GCP CUDA cuDNN