Spark Product Manager

IBM

Quick summary

Work type: On-site
Location: Austin, TX
Posted: 41 days ago

Market check

Salary context

How this pay compares to similar roles

Similar roles: $191k, on a scale from $147k to $235k.

This listing doesn't post a salary. Most similar roles pay $160,400–$222,000.

Based on 239 similar postings.

Employer

About IBM

IBM is a US-based global technology company providing hybrid cloud, AI, consulting, enterprise software, and IT infrastructure products and services.

IBM currently has 743 open roles on FindRole.

Listed pay typically runs $1,000,000–$1,000,000 across 8 roles with salary data.


At a glance

TL;DR · Spark Product Manager

We are seeking a Senior Product Manager to lead critical initiatives within our Apache Spark engine team in Austin, TX. This role drives the roadmap for key components such as the Catalyst optimizer and Structured Streaming, and focuses on lakehouse integrations with Delta Lake, Iceberg, and Hudi. The ideal candidate will work closely with platform engineers, data engineering teams, and ML infrastructure teams to improve cost efficiency through shuffle optimization and dynamic resource allocation. Proficiency in JVM performance tuning, knowledge of Spark internals including the DAG scheduler and Catalyst optimizer, and experience with open-source contributions are essential. Candidates should have a strong background in SQL-based platforms, lakehouse architecture, and observability for distributed systems, as well as familiarity with ML feature engineering pipelines.

What you'll do

  • Drive the roadmap for Apache Spark engine improvements including Catalyst optimizer and dynamic partition pruning.
  • Define integration strategies for Spark with Delta Lake, Iceberg, and Hudi for read/write optimization.
  • Lead investments in Structured Streaming features like watermarking and stateful processing at scale.
  • Partner with infrastructure teams to optimize shuffle operations and reduce query costs.
  • Ensure cross-functional alignment among data engineers, SREs, and ML teams on Spark engine investments.

What we're looking for

  • Extensive experience driving an Apache Spark engine roadmap and optimizations.
  • Deep understanding of lakehouse integrations including Delta Lake, Iceberg, and Hudi.
  • Expertise in Structured Streaming, stateful processing, and latency/cost tradeoffs.
  • Familiarity with JVM performance tuning and native acceleration techniques.
  • Working knowledge of Spark internals: DAG scheduler, Catalyst optimizer, Tungsten execution.
  • Proven track record contributing to Apache Spark or major Spark distributions.
  • Experience defining observability strategies for distributed job platforms.

More like this

Similar roles

Spark Product Manager

IBM

San Jose, CA 41 days ago
Apache Spark, Delta Lake, Iceberg, Hudi, SQL, Kafka, DAG scheduler, Catalyst optimizer, Tungsten execution, JVM performance, GC pressure, Observability, CI/CD, Open Source Contributor, Spark Connect, ML Feature Engineering

Spark Data Engineer, Senior

Booz Allen Hamilton

Chantilly, VA 17 days ago $77,600–$176,000
Spark PySpark Java Spark ETL AWS Kafka Docker Kubernetes Cassandra PostgreSQL

Product Manager

Broadcom

Plano, TX 143 days ago $104,100–$166,500
mainframe Product Management Agile Methodology CI/CD Artificial Intelligence Machine Learning UX Design Sales Initiatives Customer Relationship Management Software Licensing B2B Business Model Go-To-Market Plan Project Management Stakeholder Management

Product Manager

Q2

Austin, Texas 131 days ago
Excel Pendo CI/CD Kubernetes AWS Terraform Python PostgreSQL Docker Prometheus Grafana GitLab Jira Confluence
Hybrid

Product Manager

Q2

Austin, Texas 1 day ago
Agile Scrum Kanban Jira Trello Confluence Python SQL JavaScript React Node.js AWS Azure Google Cloud Platform Docker Kubernetes CI/CD PostgreSQL MongoDB Git GitHub Swagger RESTful APIs
Hybrid

Product Manager

nCino

US - North Carolina - HQ 17 days ago $81,600–$138,700
AI, Machine Learning, Agile, CI/CD, Python, SQL, PostgreSQL, AWS, Kubernetes, Docker, Terraform, Prometheus, Grafana