Data Engineer (Databricks), Assistant Vice President

State Street

Actively hiring
Quincy, MA · Boston, MA · Princeton, NJ · Clifton, NJ · Posted 14 days ago · $110,000–$177,500 / year

At a glance

AI generated

TL;DR

As a Data Engineer on State Street's Corporate Functions Technology team, you will design and build a modern Legal Data Lakehouse platform on AWS and Databricks, developing scalable data pipelines and enabling trusted, governed data capabilities for legal operations, compliance analytics, reporting, and AI/ML use cases. You will work with PySpark, Python, Spark SQL, and Delta Lake to develop ETL/ELT workflows, manage Databricks Jobs, Workflows, and Notebooks, and integrate data from various sources, including SQL Server and Oracle. You will also collaborate with Legal, Security, Compliance, and Enterprise Data teams to deliver robust solutions that adhere to regulatory requirements and internal security standards. This role requires hands-on experience with Databricks, AWS data platforms, and enterprise data engineering practices, along with strong analytical skills and the ability to work effectively in a fast-paced environment.

Skills

Databricks, AWS, PySpark, Python, Delta Lake, CI/CD, SQL, Power Platform, APIs, Docker, Hadoop, S3, Glue, Lambda, IAM, KMS, Unity Catalog, Power BI, Power Apps, Terraform, PostgreSQL, Oracle, JSON, Parquet

What you'll do

  • Design and build scalable data pipelines using PySpark, Python, and Spark SQL.
  • Develop and optimize ETL/ELT workflows on Databricks with Delta Lake.
  • Implement Lakehouse architecture (Bronze/Silver/Gold layers) for enterprise data platforms.
  • Enable data governance frameworks using Databricks Unity Catalog and AWS controls.
  • Maintain clear documentation including architecture, data flows, and runbooks.

What we're looking for

  • 8+ years of experience in Data Engineering or data platform development.
  • Strong hands-on expertise with Databricks and Apache Spark.
  • Proficiency in PySpark, Python, and SQL.
  • Experience with AWS data platform services, including S3, Glue, Lambda, and IAM.
  • Solid understanding of distributed data processing, ETL/ELT frameworks, and data modeling techniques.
  • Hands-on experience building end-to-end data pipelines using Databricks and AWS.
  • Familiarity with Delta Lake and lakehouse architecture.

Market check

Salary context

This $110,000–$177,500 range sits above 34% of similar postings on FindRole.

Peer median band

$120,000–$202,500

Median floor and ceiling across peers.

Typical midpoint (25–75%)

$132,300–$193,207

Middle half of comparable postings.

Based on 240 comparable postings.

* 240 is the maximum number of comparable postings sampled.

Employer

About State Street

State Street Corporation is one of the world's largest custodian banks and asset managers, providing investment servicing, investment management, and investment research to institutional investors.

Industry: Financial Services & Asset Custody

State Street currently has 133 open roles on FindRole.

Listed pay typically runs $110,000–$180,000 across 131 roles with salary data.

More like this

Similar roles

Databricks Tech Lead - Vice President

Citi

Remote (3800 Citigroup Center Drive Building F Tampa, US) 11 days ago $113,840–$170,760
AWS, Databricks, Python, SQL, Terraform, CloudFormation, CI/CD, S3, Glue, Athena, SQS, Lambda, Delta Lake, Spark SQL
Remote

Senior Data Engineer - Vice President

Citi

Remote (6400 Las Colinas Blvd Irving, US) 18 days ago $125,760–$188,640
Python, PySpark, Databricks, Snowflake, Starburst, Trino, Apache Iceberg, AWS, Agile, Kubernetes, Docker, CI/CD, Prometheus, Grafana
Remote

Sr. Data Engineer - Assistant Vice President

Citi

Remote (6460 Las Colinas Blvd Irving, US) 10 days ago
Hadoop, Spark, Kafka, Hive, Parquet, Avro, Python, Scala, Java, Databricks, Microservices, AI, ML, Deep Learning, NLP, SQL, Docker, Kubernetes, Data Mesh, Starburst
Remote

Sr. Data Engineer - Assistant Vice President

Citi

Remote (6460 Las Colinas Blvd Irving, US) 10 days ago
Hadoop, Spark, Kafka, Hive, Python, Scala, Java, Databricks, ETL, ELT, Microservices, AI, ML, Deep Learning, NLP, SQL, Docker, Kubernetes, AWS, Azure, GCP, Data Mesh, Starburst
Remote