This thesis surveys adaptive stochastic gradient descent methods for large-scale optimization, covering their background, theoretical properties, and practical performance.