Web pages tagged with «generalization»
Published Aug. 10, 2023 21:23
In this project, we address the fundamental question of "How much should we overparameterize a NN?", with a focus on generalization and common practices in DL such as SGD, nonsmooth activations, and implicit/explicit regularization. For smooth activations and gradient descent, we established the current best scaling of the number of parameters for fully-trained shallow NNs under standard initialization schemes [1].