Principal SW Engineer - LLM Serving (Cloud AI)

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA
Salary: $200,800–$301,200 / yr
Posted: 107 days ago
Closes: Aug 17, 2026

Market check

Salary context

Above market

How this pay compares to similar roles

Similar roles: $204k
This role: $251k
Chart range: $145k to $318k (shaded band marks where most similar roles pay)

This role pays more than 82% of similar roles. Most pay $162,000–$246,150 — the shaded band above. At the midpoint, this role pays about $251k versus about $204k for comparable roles.

Based on 239 similar postings.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 621 open roles on FindRole.

Listed pay typically runs $148,300–$224,400 across 562 roles with salary data.

At a glance

TL;DR · Principal SW Engineer - LLM Serving (Cloud AI)

The Qualcomm Cloud AI team is hiring an experienced engineer to develop software for inference acceleration. The role spans the entire product lifecycle, from R&D to commercial deployment, and requires strong communication and cross-functional collaboration skills. The ideal candidate has a track record of delivering large-scale commercial software projects and experience with frameworks like vLLM. Key responsibilities include designing, compiling, and optimizing neural networks for multicore systems, as well as performance modeling of SoC architectures. Proficiency in PyTorch, C++, and Python and an understanding of multi-core processor architecture are essential, along with expertise in LLMs, multi-modal models, and reasoning models. The position also demands deep knowledge of machine learning accelerators and neural network operators, and suits engineers who want to advance AI inference technology at scale.

What you'll do

  • Plan and manage the delivery of large commercial software projects.
  • Execute, analyze, and optimize neural networks for performance.
  • Develop high-performance software for multicore systems using C++ and Python.
  • Analyze performance of software/hardware solutions on multi-core architectures.
  • Write efficient code for machine learning accelerators and related software.

What we're looking for

  • Proven ability to plan, manage, and deliver large commercial software projects.
  • Strong development skills in PyTorch and experience with frameworks like vLLM.
  • Experience executing, analyzing, and optimizing neural networks on multi-core systems.
  • Understanding of multi-core processor architecture and SoC fundamentals (NoCs, caches).
  • Excellent communication skills for cross-functional team interaction.
  • Background in machine learning accelerators and related software optimization.
  • Strong performance analysis skills for software/hardware solutions on multi-core architectures.

More like this

Similar roles

AI Performance Engineer (Cloud AI Engineering), Sr | Staff | Sr. Staff

Qualcomm

San Diego, CA · 57 days ago · $178,400–$267,600
PyTorch, ONNX, Python, Transformer architectures, Attention mechanisms, Sharding strategies, Parallelism techniques, Computer architecture, ML accelerators, Distributed systems, Linear algebra, Math libraries, Machine learning compilers, torch.compile, torchDynamo

Principal Data Engineer, LLM/AI Platforms (Remote)

CrowdStrike

Remote (USA TX Remote, US) · 7 days ago · $195,000–$290,000
Python, AWS, GCP, Kubernetes, Docker, Spark, MLflow, Sagemaker, Vertex AI, LangChain, LlamaIndex, Snowflake, BigQuery, Airflow, Kafka, Pulsar, MLOps, CI/CD
Remote

AI Platform Principal Engineer (Google Cloud Platform)

The Hartford

Hartford, CT · 71 days ago · $168,400–$252,600
Python, Terraform, Google Cloud Platform, BigQuery, Cloud Functions, AI Platform, API Gateway, GKE, Docker, CI/CD, Agile, NoSQL, ETL, Chatbots, HAI, IR, Vector Embedding, Hybrid Search, LLM Orchestration, Langchain, Foundational NLP, Deep Learning
Hybrid