Data Solutions Engineer

Citi

Remote

Quick summary

Work type: Remote
Location: Irving, Texas
Salary: $107,120–$160,680 / yr
Posted: 21 days ago

Market check

Salary context

Below market

How this pay compares to similar roles

[Chart: pay scale from $94k to $230k, marking this role at $134k and similar roles at $177k, with a shaded band where most similar roles pay.]

This role pays less than 76% of similar roles. Most pay $138,375–$215,587 — the shaded band above. At the midpoint, this role pays about $134k versus about $177k for comparable roles.

Based on 240 similar postings.
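
The midpoint figures above can be reproduced with a short calculation. This is a minimal sketch that assumes "midpoint" means the simple average of the low and high ends of a range; the listing does not say how the comparison figure is actually derived, and the function and variable names are illustrative only.

```python
def midpoint(low: float, high: float) -> float:
    """Simple average of the two ends of a pay range (assumed definition)."""
    return (low + high) / 2


# Listed range for this role, in dollars per year.
role_mid = midpoint(107_120, 160_680)

# Band where most similar roles pay, per the salary context above.
band_mid = midpoint(138_375, 215_587)

# Difference between the two midpoints, in dollars per year.
gap = band_mid - role_mid

print(f"This role: about ${role_mid / 1000:.0f}k")
print(f"Similar roles: about ${band_mid / 1000:.0f}k")
print(f"Gap at the midpoint: ${gap:,.0f}")
```

Under that assumption, both rounded values agree with the $134k and $177k quoted in the comparison.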

Employer

About Citi

Citi is one of the world’s most trusted financial institutions, proudly serving millions of customers across the United States.

Citi currently has 391 open roles on FindRole.

Listed pay typically runs $125,760–$188,640 across 361 roles with salary data.

At a glance

TL;DR · Data Solutions Engineer

As an intermediate-level Data Solutions Engineer on the Technology team, you will design and develop Big Data solutions, partnering with domain experts to build robust pipelines in Hadoop or Snowflake environments. Day to day, you will deliver data-as-a-service frameworks, migrate legacy workloads to cloud platforms, and work with stakeholders to document requirements and specifications. You will also mentor team members on Big Data and Cloud technology stacks and ensure that all processes follow consistent patterns and coding standards. The ideal candidate has 5+ years of experience with Hadoop and Big Data technologies, proficiency in Python, PySpark, Scala, and SAS, and familiarity with containerization tools such as Docker and Kubernetes. Knowledge of Agile methodologies and strong development skills are also essential, since the work includes optimizing applications for performance and evaluating new IT developments to enhance current systems.

What you'll do

  • Design and develop Big Data solutions for the Data Engineering team.
  • Partner with domain experts to develop robust Big Data pipelines in Hadoop or Snowflake environments.
  • Lead migration of legacy workloads to cloud platforms like Google Cloud or AWS.
  • Engage stakeholders to document requirements and data flow specifications accurately.
  • Optimize the performance of Big Data applications on both Hadoop and non-Hadoop platforms.
  • Convert SAS-based pipelines to modern alternatives such as PySpark and Scala.

What we're looking for

  • 5+ years of experience with Hadoop and Big Data technologies.
  • Proficiency in Python, PySpark, Scala, and fundamental machine learning libraries.
  • Experience developing data solutions on Google Cloud or AWS platforms.
  • Expertise in the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Impala, Kafka, Kudu, and Solr.
  • Strong development skills with a system-level understanding of distributed storage and compute.

More like this

Similar roles

Data Engineer

Equifax

Georgia 5 days ago
Python SQL Google Cloud AWS Snowflake Agile JIRA SNOW Terraform CI/CD Docker Prometheus Grafana
Hybrid

Data Engineer

Booz Allen Hamilton

Chantilly, VA 39 days ago $77,600–$176,000
AWS RDS Aurora NiFi Python PostgreSQL Kafka SQL Kubernetes Helm ArgoCD Grafana Prometheus Elasticsearch

Data Engineer

Apple Inc

Cupertino, CA 11 days ago $126,800–$220,900
Apache_Kafka Spark Flink Kubernetes Docker Java Python SQL Trino MCP CI/CD Generative_AI RAG Machine_Learning Dimensional_Modeling AWS GCP Azure PostgreSQL Hadoop Terraform Git Jenkins

Data Engineer

Dow

Remote (Midland, MI) 5 days ago
PySpark Azure Databricks Azure DevOps Git Terraform Delta Live Tables Great Expectations Power BI Tableau SQL Server CosmosDB Neo4j CI/CD Airflow Azure Data Factory
Remote

Data Engineer

Booz Allen Hamilton

Honolulu, HI 13 days ago $77,600–$176,000
Palantir Foundry Pipeline Builder TypeScript Python Git CI/CD Agile SharePoint Dataverse Microsoft Power Platform DevSecOps software lifecycle methodologies

Data Engineer

Booz Allen Hamilton

San Diego, CA 3 days ago $61,900–$141,000
Python AWS Git SQL Terraform Apache Spark AWS EMR Redshift SageMaker Databricks Linux CI/CD CloudFormation