Yaqi Duan

Welcome to my homepage! I am an Assistant Professor in the Department of Technology, Operations, and Statistics at the Stern School of Business, New York University. My primary research interests lie in machine learning, particularly the statistical aspects of reinforcement learning.

I received my Ph.D. from the Department of Operations Research and Financial Engineering at Princeton University in 2022. From 2022 to 2023, I was a postdoctoral researcher at the Laboratory for Information & Decision Systems at the Massachusetts Institute of Technology, working with Professor Martin J. Wainwright. Prior to my doctoral studies, I received a B.S. in Mathematics from Peking University.

📧 yaqi.duan [At] stern [Dot] nyu [Dot] edu

📬 KMC 8-54, 44 West 4th Street, New York, NY 10012

News

Feb 2025 New paper PILAF: Optimal human preference sampling for reward modeling posted on arXiv!
We introduce PILAF, a simple yet effective algorithm for data collection in RLHF, showing its efficiency both theoretically and empirically.
Dec 2024 New paper Localized exploration in contextual dynamic pricing achieves dimension-free regret posted on arXiv.
Dec 2024 Talk at the RL Theory Seminar.
Oct 2024 Talk at the Department of Statistics, Rutgers University.
Sep 2024 Paper Taming “data-hungry” reinforcement learning? Stability in continuous state-action spaces accepted by NeurIPS 2024.
Sep 2024 Talk at the S. S. Wilks Memorial Seminar in Statistics, Princeton University.
Aug 2024 I am honored to receive my first NSF grant. Grateful for this opportunity!
May 2024 Paper Optimal policy evaluation using kernel-based temporal difference methods accepted by the Annals of Statistics.
Feb 2024 Talk at the Math & Data (MaD) Seminar, New York University.
Jan 2024 New paper Taming “data-hungry” reinforcement learning? Stability in continuous state-action spaces posted on arXiv.
Dec 2023 Paper Adaptive and robust multi-task learning accepted by the Annals of Statistics.
Aug 2023 I’ve joined NYU Stern School of Business as an Assistant Professor in the Department of Technology, Operations, and Statistics. Thrilled to embark on this new academic journey!
Nov 2022 New paper Policy evaluation from a single path: Multi-step methods, mixing and mis-specification posted on arXiv.
Oct 2022 I am honored to receive the 2023 IMS Lawrence D. Brown Ph.D. Student Award.

Selected publications

  1. PILAF: Optimal human preference sampling for reward modeling
    Feng, Yunzhen, Kwiatkowski, Ariel†, Zheng, Kunhao†, Duan, Yaqi*, and Kempe, Julia*
    arXiv Preprint 2025
  2. Localized exploration in contextual dynamic pricing achieves dimension-free regret
    Chai, Jinhang, Duan, Yaqi, Fan, Jianqing, and Wang, Kaizheng (α-β)
    arXiv Preprint 2024
  3. Taming “data-hungry” reinforcement learning? Stability in continuous state-action spaces
    Duan, Yaqi, and Wainwright, Martin J.
    Advances in Neural Information Processing Systems (NeurIPS) 2024
  4. Optimal policy evaluation using kernel-based temporal difference methods
    Duan, Yaqi, Wang, Mengdi, and Wainwright, Martin J.
    The Annals of Statistics 2024
  5. Adaptive and robust multi-task learning
    Duan, Yaqi, and Wang, Kaizheng (α-β)
    The Annals of Statistics 2023
  6. Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
    Duan, Yaqi, and Wainwright, Martin J.
    arXiv Preprint 2022
  7. Minimax-optimal off-policy evaluation with linear function approximation
    Duan, Yaqi, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2020
  8. Adaptive low-nonnegative-rank approximation for state aggregation of Markov chains
    Duan, Yaqi, Wang, Mengdi, Wen, Zaiwen, and Yuan, Yaxiang
    SIAM Journal on Matrix Analysis and Applications 2020