Data Research Engineer | Microsoft Careers

Microsoft

Closes today · Hybrid

Quick summary

Work type
Hybrid
Location
Salary
$119,800–$234,700 / yr
Posted
179 days ago
Closes
Jun 6, 2026 (soon)

Market check

Salary context

Competitive pay

How this pay compares to similar roles

Chart: pay scale from $106k to $248k, with a shaded band marking where most similar roles pay. Similar roles: $188k. This role: $177k.

This role pays less than 52% of similar roles. Most similar roles pay $160,000–$215,212, the shaded band above. At the midpoint, this role pays about $177k, versus about $188k for comparable roles.

Based on 240 similar postings.

Employer

About Microsoft

Microsoft Corporation is a global technology leader producing software, hardware, and cloud services, including Windows, Office 365, the Azure cloud platform, Xbox gaming, and Surface devices.

Industry: Software & Cloud Computing

Microsoft currently has 310 open roles on FindRole.

Listed pay typically runs $119,800–$234,700 across 285 roles with salary data.

At a glance

TL;DR · Data Research Engineer | Microsoft Careers

Join our Multimodal team as a Data Research Engineer, where you will design and curate high-quality datasets for advanced AI models across vision, language, audio, and more. Your day-to-day involves developing innovative data collection strategies, improving dataset quality, analyzing multimodal data to assess model behaviors, and ensuring ethical standards are met. You’ll create scalable pipelines using Python and libraries like Pandas and NumPy, analyze large-scale datasets, build tools for auditing and visualization, and collaborate with safety teams to uphold responsible AI practices. Ideal candidates have a strong background in AI, Computer Science, or related fields, along with extensive experience in data analysis and engineering, proficiency in statistics, and familiarity with frameworks such as Spark and Apache Beam.

What you'll do

  • Develop novel data collection strategies for high-quality datasets.
  • Maintain scalable multimodal data pipelines for ingestion and preprocessing.
  • Analyze real-world datasets to assess quality and identify improvement areas.
  • Build tools for dataset auditing, visualization, and versioning.
  • Ensure datasets meet ethical and responsible AI standards.

What we're looking for

  • Bachelor's degree in a relevant field and 4+ years of experience with Python and data libraries.
  • Master's degree in a relevant field and 8+ years of experience with Python and data libraries, or equivalent experience.
  • At least 2 years of experience in data analysis or engineering with large-scale datasets.
  • Proficiency in statistics and exploratory data analysis methods.
  • Familiarity with data processing frameworks like Spark, Ray, or Apache Beam.

More like this

Similar roles

Data Engineer, Staff

Qualcomm

San Diego, CA · 25 days ago · $132,000–$198,000
Databricks, AWS, Python, Delta Lake, Unity Catalog, SQL, NoSQL, CI/CD, Kafka, Spark, Hadoop, Terraform, Git, Jenkins, Prometheus, Grafana, Snowflake, Redshift, PostgreSQL, MongoDB, Kubernetes, Docker, Airflow, Vault, Fivetran HVR, Data Lineage Tools, AI/ML Platforms

Infrastructure Data & Analytics | Microsoft Careers

Microsoft

California · 116 days ago · $142,800–$274,800
Python, SQL, Distributed data processing frameworks, ETL orchestration, Data warehousing, Self-service dashboards, API design, Cloud services, Data quality control, Data governance, Metric standardization, CI/CD, Kubernetes, Terraform, Prometheus, Grafana
Hybrid

| Microsoft Careers

Microsoft

US · 53 days ago · $142,800–$274,800
Python, Java, Spark, SQL, Apache Hadoop, Kafka, NoSQL, Azure, AWS, GCP, CI/CD, Docker, Kubernetes, Terraform, PostgreSQL, MSSQL
Hybrid

Software Co-Design AI HPC Systems | Microsoft Careers

Microsoft

US · 113 days ago · $142,800–$274,800
Python, C/C++, CUDA, Distributed Systems, HPC, ML Systems, Runtimes, Compilers, Performance Modeling, Benchmarking, Systems Analysis, Hardware-Silicon Co-Design, AI Accelerators, GPU Architectures, NCCL, MPI, RDMA, InfiniBand, CI/CD
Hybrid

| Microsoft Careers

Microsoft

Mountain View, CA · 141 days ago · $165,600–$296,400
Spark, Ray, Vector databases, Data pipelines, Python, CI/CD, AWS, Kubernetes, Docker, PostgreSQL, MySQL, MongoDB, Git, Jenkins, Prometheus, Grafana