webcal

Description

Date and time : 9th November 2018, 4.00 - 5.00 pm,
Venue: Ramanujan Hall
Title: Concentration Bounds for Stochastic Approximation with Applications
to Reinforcement Learning
Speaker: Gugan Thoppe
Affiliation: Duke University, Durham, USA

Abstract: Stochastic Approximation (SA) refers to iterative algorithms that
can be used to find optimal points or zeros of a function, given only its
noisy estimates. In this talk, I will review our recent advances in
techniques for analysing SA methods. This talk has four major parts. In the
first part, we will see a motivating application of SA to network
tomography and, alongside, discuss the convergence of a novel stochastic
Kaczmarz method. In the second part, we shall see a novel analysis approach
for non-linear SA methods in the neighbourhood of an isolated solution. The
main tools here include the Alekseev formula, which helps exactly compare
the solutions of a non-linear ODE to that of its perturbation, and a novel
concentration inequality for a sum of martingale differences. In the third
part, we will extend the previous tool to the two timescale but linear SA
setting. Here, I will also present our ongoing work to obtain tight
convergence rates in this setup. In parallel, we will also see how these
results can be applied to gradient Temporal Difference (TD) methods such as
GTD(0), GTD2, and TDC that are used in reinforcement learning. For the
analyses in the second and third parts to hold, the initial step size must
be chosen sufficiently small, depending on unknown problem-dependent
parameters; or, alternatively, one must use projections. In the fourth
part, we shall discuss a trick to obviate this in context of the one
timescale, linear TD(0) method. We strongly believe that this trick is
generalizable. We also provide here a novel expectation bound. We shall end
with some future directions.

Gugan Thoppe, Duke University, Durham, USA