Publications

Reinforcement learning

continuous state-action space

  1. Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
    Duan, Yaqi, and Wainwright, Martin J.
    arXiv Preprint 2024

policy evaluation

  1. Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
    Duan, Yaqi, and Wainwright, Martin J.
    arXiv Preprint 2022
  2. Optimal policy evaluation using kernel-based temporal difference methods
    Duan, Yaqi, Wang, Mengdi, and Wainwright, Martin J.
    arXiv Preprint 2021
  3. Minimax-optimal off-policy evaluation with linear function approximation
    Duan, Yaqi, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2020
  4. A finite-sample analysis of multi-step temporal difference estimates
    Duan, Yaqi, and Wainwright, Martin J.
    Learning for Dynamics and Control Conference (L4DC) 2023
  5. Bootstrapping fitted Q-evaluation for off-policy inference
    Hao, Botao, Ji, Xiang, Duan, Yaqi, Lu, Hao, Szepesvári, Csaba, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2021

offline reinforcement learning

  1. Risk bounds and Rademacher complexity in batch reinforcement learning
    Duan, Yaqi, Jin, Chi, and Li, Zhiyuan
    International Conference on Machine Learning (ICML) 2021
  2. Near-optimal offline reinforcement learning with linear representation: leveraging variance information with pessimism
    Yin, Ming, Duan, Yaqi, Wang, Mengdi, and Wang, Yu-Xiang
    International Conference on Learning Representations (ICLR) 2022
  3. Sparse feature selection makes batch reinforcement learning more sample efficient
    Hao, Botao, Duan, Yaqi, Lattimore, Tor, Szepesvári, Csaba, and Wang, Mengdi
    International Conference on Machine Learning (ICML) 2021

Multi-task learning

multi-task learning

  1. Adaptive and robust multi-task learning
    Duan, Yaqi, and Wang, Kaizheng
    Annals of Statistics 2023

Dimensionality reduction

state aggregation

  1. State aggregation learning from Markov transition data
    Duan, Yaqi, Ke, Zheng (Tracy), and Wang, Mengdi
    Advances in Neural Information Processing Systems (NeurIPS) 2019
  2. Adaptive low-nonnegative-rank approximation for state aggregation of Markov chains
    Duan, Yaqi, Wang, Mengdi, Wen, Zaiwen, and Yuan, Yaxiang
    SIAM Journal on Matrix Analysis and Applications 2020

state embedding

  1. Learning low-dimensional state embeddings and metastable clusters from time series data
    Sun, Yifan, Duan, Yaqi, Gong, Hao, and Wang, Mengdi
    Advances in Neural Information Processing Systems (NeurIPS) 2019
  2. Learning good state and action representations for Markov decision process via tensor decomposition
    Ni, Chengzhuo, Duan, Yaqi, Dahleh, Munther, Wang, Mengdi, and Zhang, Anru R.
    Journal of Machine Learning Research 2023
  3. Learning good state and action representations via tensor decomposition
    Ni, Chengzhuo, Zhang, Anru, Duan, Yaqi, and Wang, Mengdi
    IEEE International Symposium on Information Theory (ISIT) 2021