Data Engineer

CVS Health

Remote

Quick summary

Work type: Remote
Location: New York, NY
Posted: 6 days ago
Closes: Jul 6, 2026

Market check

Salary context

How this pay compares to similar roles

Salary range chart: similar roles marked at $163k, on a scale from $110k to $210k.

This listing doesn't post a salary. Most similar roles pay $126,800–$199,050.

Based on 240 similar postings.

Employer

About CVS Health

CVS Health is a leading American healthcare company operating retail pharmacies, pharmacy benefit management services, and a health insurance segment through Aetna, one of the nation's largest health insurers.

Industry: Healthcare & Pharmacy

CVS Health currently has 407 open roles on FindRole.

Listed pay typically runs $118,450–$284,280 across 133 roles with salary data.

At a glance

TL;DR · Data Engineer

As a Data Engineer at Caremark, LLC, a CVS Health company, in New York, NY, you will join the Data Science team to develop and manage large-scale data structures and pipelines, focusing on ETL workflows that support business applications. Your daily tasks include writing efficient ETL processes, designing database systems, and developing tools for real-time and offline analytics. You will collaborate with the Data Science team to integrate algorithms into automated processes, test and maintain systems, and troubleshoot issues. Using Google Cloud Platform (GCP) and AWS, you will build robust data pipelines in Python or Java, create data marts and models, and ensure adherence to data quality standards. Drawing on expertise in machine learning, statistical analysis, and NLP tools like Scikit-Learn and PyTorch, you will contribute to large-scale projects and advise on new technologies to optimize solutions for complex business problems.

What you'll do

  • Develop large-scale data structures and pipelines to organize and standardize data for business insights.
  • Write ETL processes and design database systems to improve existing analytic capabilities.
  • Collaborate with the Data Science team to integrate algorithms into automated processes.
  • Test, maintain, and troubleshoot data systems using Google Cloud Platform tools.
  • Build robust data pipelines and dynamic systems using Python or Java programming skills.
  • Create data marts and models to support internal customers' needs and standards.

What we're looking for

  • Master’s degree in Computer Science or related field with relevant coursework
  • Experience with Java, Python, and cloud platforms like AWS and GCP
  • Proficiency in machine learning, statistical analysis, and predictive modeling
  • Knowledge of NLP tools such as Scikit-Learn, SpaCy, PyTorch, and Spark NLP
  • Expertise in developing large-scale data pipelines and ETL processes
  • Ability to collaborate with Data Science teams on automated processes
  • Strong quantitative analysis skills including clustering and regression techniques

More like this

Similar roles

Data Engineer

CVS Health

Remote (Irving, TX) · 6 days ago
Python, Java, Google Cloud Platform, Hibernate, Oracle, JIRA, Rally, Confluence, ETL, CI/CD, SQL, Docker, Kubernetes, PostgreSQL, AWS, Terraform

Data Engineer

CVS Health

Remote (Hartford, CT) · 6 days ago
Python, Java, R, Spark, PySpark, Scala, MySQL, NoSQL, PowerBI, Tableau, NLP, Scikit-learn, Spacy, PyTorch, Spark NLP, Vertex-AI, GCP, AWS, Azure

Data Engineer

CVS Health

Remote (New York, NY) · 6 days ago
AWS, Azure, GCP, Python, Java, R, Spark, PySpark, Scala, SAS, SQL, Hadoop, HDFS, CI/CD, Jenkins, Git, Machine learning, Statistical analysis, Predictive modeling, NLP, Scikit-Learn, Spacy, PyTorch, Spark NLP, ETL, Data warehousing, Big Data, Distributed computing

Machine Learning Engineer

CVS Health

Remote (Irving, TX) · 6 days ago
Python, Java, Node.js, NLP, Scikit-Learn, Spacy, PyTorch, Spark NLP, Machine learning, Statistical analysis, Predictive modeling, Feature engineering, Distributed model training, Supervised learning, Unsupervised learning, Clustering, Regression, Pattern recognition, CI/CD

Data Scientist

CVS Health

Remote (New York, NY) · 6 days ago
Python, R, SQL, Google Cloud Platform, CI/CD, Scikit-Learn, SpaCy, PyTorch, Spark NLP, MySQL, ETL, Machine Learning, NLP, Predictive Modeling, Statistical Analysis, Feature Engineering, Distributed Model Training, Relational Database Concepts

Data Engineer

Booz Allen Hamilton

Albuquerque, NM · 41 days ago · $61,900–$141,000
Python, PostgreSQL, AWS, Docker, Kubernetes, Terraform, CI/CD, RESTful APIs, PySpark, CloudFormation, CDK, Data quality management framework, Log monitoring and alerting, Data validation, Secure access control