Paper "Optimal policy evaluation using kernel-based temporal difference methods" accepted by the Annals of Statistics.