Sr Software Engineer, AI Tools – On-Device Generative AI Model Optimization

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA; Raleigh, NC
Salary: $140,800–$211,200 / yr
Posted: 3 days ago
Closes: Dec 12, 2026

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Similar roles (midpoint): $199k
This role (midpoint): $176k
Chart range: $129k–$254k, with a shaded band marking where most similar roles pay

Roughly 66% of similar roles pay more than this one. Most pay $162,562–$236,037, the shaded band above. At the midpoint, this role pays about $176k versus about $199k for comparable roles.

Based on 240 similar postings.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 757 open roles on FindRole.

Listed pay typically runs $151,900–$229,800 across 444 roles with salary data.

At a glance

TL;DR · Sr Software Engineer, AI Tools – On-Device Generative AI Model Optimization

As a Senior Software Engineer, AI Tools, at Qualcomm Technologies in San Diego or Raleigh, you will reauthor generative AI architectures so they execute efficiently on Qualcomm hardware, integrate inference acceleration techniques, and enable custom model deployments for OEMs. You will work closely with compiler teams and quantization engineers so that architectural decisions respect performance constraints, and you will contribute to a multi-stage model preparation pipeline that gives developers actionable diagnostics. The ideal candidate has 4+ years of experience in ML systems or a related field, is proficient in Python, and has deep knowledge of generative AI architectures and PyTorch internals. Familiarity with Hugging Face Transformers and on-device runtimes is preferred. The work involves complex optimization problems that affect model accuracy and developer experience across multiple technology verticals.

What you'll do

  • Reauthor generative AI architectures for efficient execution on Qualcomm AI hardware.
  • Translate hardware constraints into model-level transformations that preserve accuracy.
  • Integrate inference acceleration techniques into the model preparation pipeline.
  • Develop reauthoring strategies for custom OEM models and customer-specific use cases.
  • Contribute reauthoring stages to a multi-stage model preparation pipeline.
  • Build developer-facing diagnostics providing clear, actionable feedback on model performance.
  • Partner with compiler teams to decide between graph-level optimizations and model reauthoring.

What we're looking for

  • Bachelor's degree in Computer Science/Engineering with 4+ years of ML engineering experience or equivalent.
  • Proficient in Python for large, typed codebases and experienced in ML systems optimization.
  • Strong understanding of generative AI architectures across LLMs and multimodal models.
  • Experience optimizing inference for edge deployments with measurable performance improvements.
  • Familiarity with PyTorch internals and the Hugging Face Transformers ecosystem.
  • Knowledge of on-device runtimes and SoC-level constraints, including QAIRT/QNN or similar tools.
  • Excellent written and verbal communication skills to collaborate across various technical teams.

More like this

Similar roles

Generative AI Engineer, AVP

Citi

Remote (Jacksonville, Florida) · 63 days ago · $96,960–$145,440
Python, FastAPI, Docker, Kubernetes, OpenShift, Pandas, NumPy, Scikit-learn, Hugging Face Transformers, LangChain, LlamaIndex, PostgreSQL, pgvector, Pinecone, Chroma, PyTorch, TensorFlow, Prometheus, Grafana, MLOps, CI/CD

Senior Software Engineer, Generative AI Research

Nvidia

Santa Clara, CA · 15 days ago · $184,000–$287,500
Python, C++, Go, Rust, Kubernetes, Docker, Terraform, AWS, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, Git, Linux, NVIDIA deep learning infrastructure, open-source contributions

Sr. Software Engineer - Applied AI

GEICO

Remote (Palo Alto, CA) · 63 days ago · $80,000–$215,000
Python, LangChain, Hugging Face, OpenAI, Kubernetes, CI/CD, Docker, Prometheus, Grafana, PostgreSQL, Redis, Apache Kafka, Spring AI, LangGraph, LangSmith, LlamaIndex, Anthropic APIs, Vector databases, Knowledge graphs, Java, Spring ecosystem

Applied AI Engineer

Booz Allen Hamilton

Fort Belvoir, VA (+1 more location) · 34 days ago · $99,000–$225,000
Python, FastAPI, Flask, Streamlit, Gradio, React, TypeScript, Kubernetes, CI/CD, Prometheus, Grafana, MLOps, Docker, PostgreSQL, AWS, Azure, Google Cloud Platform

Applied AI Engineer

Ramp

Remote (New York City, New York, US) · 157 days ago · $155,000–$339,500
Python, JavaScript, Node.js, Django, Flask, React, PostgreSQL, MongoDB, AWS, GCP, Kubernetes, Terraform, CI/CD, GitOps