Synthetic Data Generation and User Simulation PhD Research Intern — Fall 2026
At a glance
AI-generated TL;DR
As a PhD-level research intern, you will investigate advanced techniques in generative modeling and synthetic data generation to improve the training of large language models (LLMs). Your daily work includes building high-fidelity synthetic data by calibrating simulated users against real behavioral signatures, procedurally generating probe scenarios, and synthesizing trajectories guided by verification. You will also collaborate with other experts to integrate these methods into production pipelines and validate their impact on downstream model performance. Essential skills for this role include proficiency in Python, expertise in deep learning frameworks such as PyTorch, and experience with HuggingFace and vLLM. Ideal candidates have a background in generative modeling, synthetic data generation, or LLM post-training techniques, along with research contributions to top-tier AI conferences.
What you'll do
- Research new techniques in generative modeling and synthetic data generation for LLM training.
- Craft high-fidelity synthetic user simulations calibrated against real behavioral signatures.
- Develop methods for procedural generation of probe coverage and for trajectory synthesis guided by verification.
- Conduct experiments to validate that synthetic data improves downstream model performance metrics.
- Integrate novel methods into production training pipelines in collaboration with engineering teams.
What we're looking for
- PhD candidate in Computer Science, Machine Learning, Computational Linguistics, or a related field, with a specialization in deep learning.
- Research experience in generative modeling, synthetic data generation, LLM post-training, reward modeling, and interactive simulation.
- Proficiency in Python and in deep learning frameworks such as PyTorch and HuggingFace.
- Published research at top-tier AI, ML, or NLP conferences.
- Experience training and evaluating large language models on real-world tasks.
- Background in user simulation, behavioral modeling grounded in real population data, and multilingual/low-resource evaluation.
Employer
About Nvidia
Nvidia is a leading designer of graphics processing units (GPUs) and system-on-chip units, powering gaming, professional visualization, data centers, and artificial intelligence workloads.
Industry: Semiconductors & AI Computing