Principal SW Engineer - LLM Serving (Cloud AI)

Qualcomm

Quick summary

Work type: On-site
Location: San Diego, CA
Salary: $200,800–$301,200 / yr
Posted: 107 days ago
Closes: Aug 17, 2026
Nearby: 99+ roles within 25 mi

Market check

Salary context

Above market

How this pay compares to similar roles

Chart: pay scale from $145k to $318k, with similar roles at about $204k and this role at about $251k.

This role pays more than 82% of similar roles, most of which pay $162,000–$246,150. At the midpoint, this role pays about $251k versus about $204k for comparable roles.

Based on 239 similar postings.

Employer

About Qualcomm

Qualcomm is a leading American semiconductor and telecommunications company based in San Diego, CA.

Qualcomm currently has 621 open roles on FindRole.

Listed pay typically runs $148,300–$224,400 across 562 roles with salary data.


At a glance

TL;DR · Principal SW Engineer - LLM Serving (Cloud AI)


The Qualcomm Cloud AI team is hiring an experienced engineer to develop software for inference acceleration. The role spans the full product lifecycle, from R&D to commercial deployment, and requires strong communication and cross-functional collaboration skills. The ideal candidate has a track record of delivering large-scale commercial software projects and experience with frameworks like vLLM. Key responsibilities include designing, compiling, and optimizing neural networks for multicore systems, as well as performance modeling of SoC architectures. Proficiency in PyTorch, C++, and Python and an understanding of multi-core processor architecture are essential, along with expertise in LLMs, multi-modal models, and reasoning models. The position also demands deep knowledge of machine learning accelerators and neural network operators, and it suits engineers who want to advance AI inference technology at scale.

Skills

PyTorch, Python, C++, LLMs, Multi-modal models, Reasoning models, Neural networks, High performance software, Multicore systems, Performance analysis, Multi-core architecture, SoC architectures, Performance modeling, Machine learning accelerators, Neural network operators, Linear algebra, Math libraries

What you'll do

Plan and manage the delivery of large commercial software projects.
Execute, analyze, and optimize neural networks for performance.
Develop high-performance software for multicore systems using C++ and Python.
Analyze performance of software/hardware solutions on multi-core architectures.
Write efficient code for machine learning accelerators and related software.

What we're looking for

Proven ability to plan, manage, and deliver large commercial software projects.
Strong development skills in PyTorch and experience with frameworks like vLLM.
Experience executing, analyzing, and optimizing neural networks on multi-core systems.
Understanding of multi-core processor architecture and SoC fundamentals (NoCs, caches).
Excellent communication skills for cross-functional team interaction.
Background in machine learning accelerators and related software optimization.
Strong performance analysis skills for software/hardware solutions on multi-core architectures.

Similar roles

LLM Serving Engineer (Cloud AI Engineering), Senior / Staff Engineer

Qualcomm

San Diego, CA · 57 days ago · $158,400–$237,600

Triton Inference Server, vLLM, SGLang, PyTorch, Python, Kubernetes, Docker, CI/CD, Prometheus, Grafana, PostgreSQL, Redis, OpenAI, Hugging Face, AWS, Google Cloud Platform, Azure, Git, Jenkins, GitHub, Slack


AI Performance Engineer (Cloud AI Engineering), Sr | Staff | Sr. Staff

Qualcomm

San Diego, CA · 57 days ago · $178,400–$267,600

PyTorch, ONNX, Python, Transformer architectures, Attention mechanisms, Sharding strategies, Parallelism techniques, Computer architecture, ML accelerators, Distributed systems, Linear algebra, Math libraries, Machine learning compilers, torch.compile, TorchDynamo


Senior Lead AI Engineer (LLM Gateway, FM Hosting)

Capital One Financial

McLean, VA · 23 days ago · $229,900–$262,400

Python, TensorFlow, PyTorch, Docker, Kubernetes, AWS, CI/CD, Git, PostgreSQL, Redis, Scikit-learn, Flask, RESTful APIs, Nginx, Prometheus, Grafana


Principal Data Engineer, LLM/AI Platforms (Remote)

CrowdStrike

Remote (TX, US) · 7 days ago · $195,000–$290,000

Python, AWS, GCP, Kubernetes, Docker, Spark, MLflow, SageMaker, Vertex AI, LangChain, LlamaIndex, Snowflake, BigQuery, Airflow, Kafka, Pulsar, MLOps, CI/CD

Remote


AI Platform Principal Engineer (Google Cloud Platform)

The Hartford

Hartford, CT · 71 days ago · $168,400–$252,600

Python, Terraform, Google Cloud Platform, BigQuery, Cloud Functions, AI Platform, API Gateway, GKE, Docker, CI/CD, Agile, NoSQL, ETL, Chatbots, HAI, IR, Vector Embedding, Hybrid Search, LLM Orchestration, LangChain, Foundational NLP, Deep Learning

Hybrid


Senior Lead AI Engineer (LLM Customization and Finetuning)

Capital One Financial

Cambridge, MA · 122 days ago · $229,900–$262,400

Python, TensorFlow, PyTorch, Kubernetes, Docker, AWS, Azure, GCP, CI/CD, Git, PostgreSQL, MongoDB, Redis, Scikit-learn, Pandas, NumPy, Jupyter, Swagger, GraphQL
