0 votes
asked in 05 - Mathématiques, gradients everywhere !!!
Hi

Which parameters actually influence the error in the gradient descent method?

What is the influence of overfitting, underfitting, starting values, and hyperparameter tuning?

What happens if I have a gradient descent method, I change the parameters a bit, and my error is suddenly smaller? What could have caused this?

I could not find any information on what influences the error. Thank you.

1 Answer

0 votes
by Vétéran du GPU 🐋 (48.7k points)
selected by
 
Best answer
The learning rate is by far the most important hyperparameter in gradient descent. Modern variants also include additional parameters such as momentum, but this depends on your optimizer.
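To illustrate, here is a minimal sketch of gradient descent with a learning rate and optional momentum, on a hypothetical toy objective f(w) = (w - 3)² (the function, names, and values are illustrative, not from the question):

```python
# Minimal gradient descent on the toy objective f(w) = (w - 3)^2.
# `lr` (learning rate) and `beta` (momentum) are the hyperparameters
# discussed above; beta=0.0 gives plain gradient descent.
def descend(lr, beta=0.0, steps=100, w=0.0):
    velocity = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)            # gradient of (w - 3)^2
        velocity = beta * velocity + grad
        w -= lr * velocity            # parameter update
    return w

print(descend(lr=0.1))   # converges close to the minimum at w = 3
print(descend(lr=1.1))   # too large a learning rate: the iterates diverge
```

With lr=0.1 the iterates contract toward the minimum; with lr=1.1 each step overshoots and the error grows, which is why the learning rate dominates the final error.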

The idea of gradient descent is to find a set of parameters with a lower error than the current one, so finding a smaller error is the expected result.

Hyperparameter optimization is a thing, but it is quite expensive, so we try to test only relevant values in order not to waste compute time.
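As a sketch of what such a search looks like, here is a hypothetical grid search over a few learning rates on the same kind of toy objective f(w) = (w - 3)² (the candidate values and helper are assumptions for illustration):

```python
# Hypothetical hyperparameter search: run gradient descent on the toy
# objective f(w) = (w - 3)^2 for each candidate learning rate and keep
# the one yielding the lowest final error.
def final_error(lr, steps=50, w=0.0):
    for _ in range(steps):
        w -= lr * 2 * (w - 3)        # one gradient step
    return (w - 3) ** 2              # squared error after training

candidates = [0.001, 0.01, 0.1, 0.5]
best_lr = min(candidates, key=final_error)
print(best_lr)
```

Each candidate costs a full training run, which is why the grid is kept small and restricted to plausible values.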
by
Thank you for your explanations. But I think it may not answer my question.

I optimized parameters with a gradient descent algorithm, but somebody achieved an even smaller error by quickly choosing parameters. What could be the reason for this?
...