Sr Software Engineer, AI Tools – On-Device Generative AI Model Optimization

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA; Raleigh, NC
Salary: $140,800–$211,200 / yr
Posted: 3 days ago
Closes: Dec 12, 2026
Nearby: 99+ roles within 25 mi

Market check

Salary context

Competitive pay

How this pay compares to similar roles

[Chart: pay scale from $129k to $254k, marking this role at about $176k and similar roles at about $199k, with a shaded band where most similar roles pay.]

This role pays less than 66% of similar roles. Most pay $162,562–$236,037 — the shaded band above. At the midpoint, this role pays about $176k versus about $199k for comparable roles.

Based on 240 similar postings.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 757 open roles on FindRole.

Listed pay typically runs $151,900–$229,800 across 444 roles with salary data.

At a glance

TL;DR · Sr Software Engineer, AI Tools – On-Device Generative AI Model Optimization

As a Machine Learning Engineer at Qualcomm Technologies in San Diego or Raleigh, you will join a dynamic team focused on advancing AI technologies for next-generation applications. Your role involves reauthoring generative AI architectures to optimize execution on Qualcomm hardware, integrating inference acceleration techniques, and enabling custom model deployments for OEMs. You’ll collaborate closely with compiler teams and quantization engineers to ensure architectural decisions align with performance constraints, while also contributing to a multi-stage model preparation pipeline that provides actionable diagnostics. The ideal candidate has 4+ years of experience in ML systems or related fields, proficiency in Python, and deep knowledge of generative AI architectures and PyTorch internals. Familiarity with tools like HuggingFace transformers and on-device runtimes is preferred, as you will work on complex optimization challenges that impact model accuracy and developer experience across various technology verticals.

Skills

Python, PyTorch, HuggingFace transformers, ONNX Runtime, GitHub Copilot, CUDA, LLMs, multimodal models, inference optimization, memory-efficient attention, decode acceleration, quantization engineering, SoC constraints, NPU/DSP execution, CI/CD

What you'll do

Reauthor generative AI architectures for efficient execution on Qualcomm AI hardware.
Translate hardware constraints into model-level transformations that preserve accuracy.
Integrate inference acceleration techniques into the model preparation pipeline.
Develop reauthoring strategies for custom OEM models and customer-specific use cases.
Contribute reauthoring stages to a multi-stage model preparation pipeline.
Build developer-facing diagnostics providing clear, actionable feedback on model performance.
Partner with compiler teams to decide between graph-level optimizations and model reauthoring.

What we're looking for

Bachelor's degree in Computer Science/Engineering with 4+ years of ML engineering experience or equivalent.
Proficient in Python for large, typed codebases and experienced in ML systems optimization.
Strong understanding of generative AI architectures across LLMs and multimodal models.
Experience optimizing inference for edge deployments with measurable performance improvements.
Familiarity with PyTorch internals and HuggingFace transformers ecosystem.
Knowledge of on-device runtimes and SoC-level constraints, including QAIRT/QNN or similar tools.
Excellent written and verbal communication skills to collaborate across various technical teams.

Similar roles

Generative AI Engineer, AVP

Citi

Remote (Jacksonville, Florida) · 63 days ago · $96,960–$145,440

Python, FastAPI, Docker, Kubernetes, OpenShift, Pandas, NumPy, Scikit-learn, Hugging Face Transformers, LangChain, LlamaIndex, PostgreSQL, pgvector, Pinecone, Chroma, PyTorch, TensorFlow, Prometheus, Grafana, MLOps, CI/CD

Senior Software Engineer, Generative AI Research

Nvidia

Santa Clara, CA · 15 days ago · $184,000–$287,500

Python, C++, Go, Rust, Kubernetes, Docker, Terraform, AWS, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, Git, Linux, NVIDIA Deep Learning Infrastructure, Open Source Contributions

Sr. Software Engineer - Applied AI

GEICO

Remote (Palo Alto, CA) · 63 days ago · $80,000–$215,000

Python, LangChain, HuggingFace, OpenAI, Kubernetes, CI/CD, Docker, Prometheus, Grafana, PostgreSQL, Redis, Apache Kafka, Spring AI, LangGraph, LangSmith, LlamaIndex, Anthropic APIs, Vector databases, Knowledge graphs, Java Spring ecosystem
