Gradient descent
Gradient descent is an iterative optimization method for minimizing differentiable functions. Given an objective function f: R^n → R and an initial parameter vector θ_0, the algorithm repeats two steps: compute the gradient g_t = ∇f(θ_t), then update θ_{t+1} = θ_t − α_t g_t, where α_t is the step size, also called the learning rate. The gradient points in the direction of steepest increase, so a sufficiently small step along the negative gradient decreases f. Convergence depends on the learning rate and on the properties of f: with a fixed, suitably small α, gradient descent converges to a global minimum on smooth convex problems, while in non-convex settings it may instead settle at a local minimum or a saddle point.
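As a minimal illustration of the update rule, the following sketch runs the iteration on a simple quadratic; the objective f(θ) = ||θ − c||^2, the step size, and the iteration count are assumptions chosen for the example, not part of any standard library.

import numpy as np

# Minimal sketch of the update theta_{t+1} = theta_t - alpha * grad_f(theta_t).
# The quadratic objective below is an illustrative assumption.
def gradient_descent(grad_f, theta0, alpha=0.1, num_steps=100):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_steps):
        theta = theta - alpha * grad_f(theta)  # step against the gradient
    return theta

c = np.array([3.0, -1.0])
grad_f = lambda theta: 2.0 * (theta - c)       # gradient of ||theta - c||^2
print(gradient_descent(grad_f, theta0=np.zeros(2)))  # approaches c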
Variants include batch gradient descent, which uses the full dataset to compute the gradient; stochastic gradient descent (SGD), which estimates it from a single randomly chosen example; and mini-batch gradient descent, which averages over a small random subset, trading gradient accuracy for cheaper iterations, as sketched below.
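The following sketch shows the mini-batch stochastic variant on a least-squares objective f(w) = (1/2m) ||Xw − y||^2; the synthetic data, batch size, and learning rate are illustrative assumptions.

import numpy as np

# Sketch of mini-batch SGD for least squares; hyperparameters are illustrative.
def sgd_least_squares(X, y, alpha=0.01, batch_size=16, num_steps=500, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    w = np.zeros(n)
    for _ in range(num_steps):
        idx = rng.choice(m, size=batch_size, replace=False)  # random mini-batch
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / batch_size             # noisy gradient estimate
        w -= alpha * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
print(sgd_least_squares(X, y))  # approaches w_true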
Practical improvements include momentum, Nesterov acceleration, and adaptive learning-rate methods such as AdaGrad, RMSprop, and Adam.
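As an example of one such improvement, the sketch below adds classical momentum to the basic update: a velocity term v accumulates an exponentially weighted sum of past gradients, which damps the zig-zagging that plain gradient descent exhibits across narrow valleys. The quadratic objective and the values of alpha and beta are assumptions chosen to illustrate the idea.

import numpy as np

# Sketch of gradient descent with classical momentum:
# v_{t+1} = beta * v_t + grad_f(theta_t), theta_{t+1} = theta_t - alpha * v_{t+1}.
# The ill-conditioned quadratic below is an illustrative assumption.
def momentum_descent(grad_f, theta0, alpha=0.05, beta=0.9, num_steps=200):
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for _ in range(num_steps):
        v = beta * v + grad_f(theta)   # accumulate an exponentially weighted velocity
        theta = theta - alpha * v      # step along the accumulated direction
    return theta

A = np.diag([1.0, 25.0])               # curvature differs strongly across coordinates
grad_f = lambda theta: A @ theta
print(momentum_descent(grad_f, theta0=np.array([5.0, 5.0])))  # approaches the origin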