International Journal of Engineering and Information Systems (IJEAIS)
  Year: 2022 | Volume: 6 | Issue: 11 | Page No.: 22-30
A Directional Gradient Acceleration Method In Mini-Batch Gradient Descent
Sungjae Ahn

Abstract:
The time needed to train a large-scale neural network on a large-scale dataset is immense. Many studies have attempted to address this problem by using larger batches, but bigger batch sizes decrease the average value of the gradient, resulting in slower training. Larger learning rates are therefore used to counter this side effect, but finding the optimal learning rate also consumes a large amount of time. To assist the training of the model, this paper proposes Direction Accelerated Stochastic Gradient Descent (DA-SGD), a method that adds the average of the previous and present gradients to the present gradient when the two gradients point in the same direction. The experiment compared the accuracy of vanilla SGD and DA-SGD on three different datasets, and the results show that DA-SGD yields higher accuracy for batch sizes larger than 32. Further research comparing DA-SGD with momentum optimizers, which resemble the proposed method in that they also take previous gradients into account when updating the weights, is needed.
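
Based only on the description in the abstract, the update rule could be sketched as follows. The element-wise sign comparison, the function name da_sgd_step, and the learning-rate parameter lr are assumptions made for illustration; the paper's exact formulation is not given here.

```python
import numpy as np

def da_sgd_step(w, grad, prev_grad, lr=0.01):
    """One hypothetical DA-SGD update, sketched from the abstract.

    Where the present and previous gradients share the same sign
    (assumed to be checked element-wise), their average is added to
    the present gradient; elsewhere the present gradient is unchanged.
    """
    same_direction = np.sign(grad) == np.sign(prev_grad)
    accelerated = np.where(same_direction,
                           grad + (grad + prev_grad) / 2.0,
                           grad)
    # Vanilla SGD step using the (possibly accelerated) gradient.
    return w - lr * accelerated

# Usage sketch: keep the previous mini-batch gradient between steps.
w = np.zeros(10)
prev_grad = np.zeros_like(w)
for _ in range(100):
    grad = np.random.randn(10)          # stand-in for a mini-batch gradient
    w = da_sgd_step(w, grad, prev_grad)
    prev_grad = grad
```

In this reading, the method behaves like plain SGD when consecutive gradients disagree in sign, and takes a larger step when they agree, which is what distinguishes it from momentum-style optimizers that always accumulate past gradients.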