0 votes
asked in 05 - Mathématiques, gradients everywhere !!!
Hi

Which parameters actually influence the error in the gradient descent method?

What is the influence of overfitting, underfitting, starting values, and hyperparameter tuning?

What happens if I have a gradient descent method, I change the parameters a bit, and my error is suddenly smaller? What could have caused this?

I could not find any information on what influences the error. Thank you.

1 Answer

0 votes
by Vétéran du GPU 🐋 (48.7k points)
selected by
 
Best answer
The learning rate is by far the most important hyperparameter in gradient descent. Modern variants also include additional parameters such as momentum, but this depends on your optimizer.
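To illustrate, here is a minimal sketch of gradient descent with a learning rate and optional momentum, on a hypothetical toy objective f(w) = (w - 3)² (the function, names, and values are illustrative, not from the question):

```python
# Minimal gradient descent on the toy objective f(w) = (w - 3)^2.
# `lr` (learning rate) and `beta` (momentum) are the hyperparameters
# discussed above; beta=0.0 gives plain gradient descent.
def descend(lr, beta=0.0, steps=100, w=0.0):
    velocity = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)            # gradient of (w - 3)^2
        velocity = beta * velocity + grad
        w -= lr * velocity            # parameter update
    return w

print(descend(lr=0.1))   # converges close to the minimum at w = 3
print(descend(lr=1.1))   # too large a learning rate: the iterates diverge
```

With lr=0.1 the iterates contract toward the minimum; with lr=1.1 each step overshoots and the error grows, which is why the learning rate dominates the final error.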

The idea of gradient descent is to find a set of parameters with a lower error than the current one, so finding a smaller error is the expected result.

Hyperparameter optimization is a thing, but it is quite expensive, so we try to test only relevant values in order not to waste compute time.
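As a sketch of what such a search looks like, here is a hypothetical grid search over a few learning rates on the same kind of toy objective f(w) = (w - 3)² (the candidate values and helper are assumptions for illustration):

```python
# Hypothetical hyperparameter search: run gradient descent on the toy
# objective f(w) = (w - 3)^2 for each candidate learning rate and keep
# the one yielding the lowest final error.
def final_error(lr, steps=50, w=0.0):
    for _ in range(steps):
        w -= lr * 2 * (w - 3)        # one gradient step
    return (w - 3) ** 2              # squared error after training

candidates = [0.001, 0.01, 0.1, 0.5]
best_lr = min(candidates, key=final_error)
print(best_lr)
```

Each candidate costs a full training run, which is why the grid is kept small and restricted to plausible values.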
by
Thank you for your explanations. But I think it may not answer my question.

I optimized parameters with a gradient descent algorithm, but somebody achieved an even smaller error by quickly choosing parameters. What could be the reason for this?
...