Publications

2024

  1. Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
    Duan, Yaqi, and Wainwright, Martin J.
    arXiv Preprint 2024

2023

  1. Adaptive and robust multi-task learning
    Duan, Yaqi, and Wang, Kaizheng
    Annals of Statistics 2023
  2. A finite-sample analysis of multi-step temporal difference estimates
    Duan, Yaqi, and Wainwright, Martin J.
    Learning for Dynamics and Control Conference (L4DC) 2023
  3. Learning good state and action representations for Markov decision process via tensor decomposition
    Ni, Chengzhuo, Duan, Yaqi, Dahleh, Munther, Wang, Mengdi, and Zhang, Anru R.
    Journal of Machine Learning Research 2023

2022

  1. Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
    Duan, Yaqi, and Wainwright, Martin J.
    arXiv Preprint 2022
  2. Near-optimal offline reinforcement learning with linear representation: leveraging variance information with pessimism
    Yin, Ming, Duan, Yaqi, Wang, Mengdi, and Wang, Yu-Xiang
    International Conference on Learning Representations (ICLR) 2022

2021

  1. Optimal policy evaluation using kernel-based temporal difference methods
    Duan, Yaqi, Wang, Mengdi, and Wainwright, Martin J.
    arXiv Preprint 2021
  2. Risk bounds and Rademacher complexity in batch reinforcement learning
    Duan, Yaqi, Jin, Chi, and Li, Zhiyuan
    International Conference on Machine Learning (ICML) 2021
  3. Bootstrapping fitted Q-evaluation for off-policy inference
    Hao, Botao, Ji, Xiang, Duan, Yaqi, Lu, Hao, Szepesvári, Csaba, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2021
  4. Sparse feature selection makes batch reinforcement learning more sample efficient
    Hao, Botao, Duan, Yaqi, Lattimore, Tor, Szepesvári, Csaba, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2021
  5. Learning good state and action representations via tensor decomposition
    Ni, Chengzhuo, Zhang, Anru, Duan, Yaqi, and Wang, Mengdi
    IEEE International Symposium on Information Theory (ISIT) 2021

2020

  1. Minimax-optimal off-policy evaluation with linear function approximation
    Duan, Yaqi, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2020
  2. Adaptive low-nonnegative-rank approximation for state aggregation of Markov chains
    Duan, Yaqi, Wang, Mengdi, Wen, Zaiwen, and Yuan, Yaxiang
    SIAM Journal on Matrix Analysis and Applications 2020

2019

  1. State aggregation learning from Markov transition data
    Duan, Yaqi, Ke, Zheng (Tracy), and Wang, Mengdi
    Advances in Neural Information Processing Systems (NeurIPS) 2019
  2. Learning low-dimensional state embeddings and metastable clusters from time series data
    Sun, Yifan, Duan, Yaqi, Gong, Hao, and Wang, Mengdi
    Advances in Neural Information Processing Systems (NeurIPS) 2019