DistGrv
DistGrv is a class of optimization methods for distributed training that employ gradient variance reduction to accelerate convergence on finite-sum objectives common in machine learning. It extends classic variance-reduced techniques such as SVRG and SAGA to multi-node environments, combining local stochastic gradients with a periodically refreshed reference gradient computed from data distributed across workers.
Conceptually, each worker maintains a local parameter copy and a snapshot of its gradient; at regular intervals the workers aggregate these snapshots into a shared reference gradient (for example, via an all-reduce), which each worker then uses to correct its local stochastic gradients until the next refresh.
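A minimal single-process sketch of this update rule, in the style of distributed SVRG, is given below. It simulates workers by partitioning the data into shards; the function and parameter names (distgrv_sketch, refresh_interval, and so on) are illustrative assumptions rather than part of any published DistGrv interface.

    import numpy as np

    def full_gradient(w, X, y):
        """Average logistic-regression gradient over all rows of X (labels y in {0, 1})."""
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        return X.T @ (preds - y) / len(y)

    def sample_gradient(w, x_i, y_i):
        """Logistic-regression gradient on a single sample."""
        pred = 1.0 / (1.0 + np.exp(-x_i @ w))
        return x_i * (pred - y_i)

    def distgrv_sketch(X, y, n_workers=4, refresh_interval=100,
                       n_rounds=10, lr=0.1, seed=0):
        """Illustrative SVRG-style loop; not an official DistGrv implementation."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        shards = np.array_split(rng.permutation(n), n_workers)  # one index shard per simulated worker
        w = np.zeros(d)
        for _ in range(n_rounds):
            # Refresh phase: each worker evaluates the gradient of its shard at the
            # snapshot; their average serves as the reference gradient mu
            # (assuming roughly equal shard sizes).
            w_snap = w.copy()
            mu = np.mean([full_gradient(w_snap, X[s], y[s]) for s in shards], axis=0)
            # Local phase: variance-reduced steps using g_i(w) - g_i(w_snap) + mu.
            for _ in range(refresh_interval):
                shard = shards[rng.integers(n_workers)]
                i = rng.choice(shard)
                g = sample_gradient(w, X[i], y[i]) - sample_gradient(w_snap, X[i], y[i]) + mu
                w -= lr * g
        return w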
DistGrv is applicable to problems with smooth finite-sum objectives, including convex models such as generalized linear models and, with weaker guarantees, some non-convex models such as neural networks.
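As one concrete instance of the smooth finite-sum form f(w) = (1/n) * sum_i f_i(w) that such methods target, the short sketch below writes out a ridge-regression objective; the function name finite_sum_loss and the regularization value are arbitrary choices for illustration.

    import numpy as np

    def finite_sum_loss(w, X, y, reg=1e-3):
        """f(w) = (1/n) * sum_i [ 0.5 * (x_i . w - y_i)^2 ] + (reg / 2) * ||w||^2,
        i.e. ridge regression as one concrete smooth finite-sum objective."""
        residuals = X @ w - y
        return 0.5 * np.mean(residuals ** 2) + 0.5 * reg * (w @ w)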
Practical considerations include choosing the update cadence for the reference gradient, balancing computation and communication, and handling stragglers or stale snapshots when workers progress at different rates.
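The cadence choice can be reasoned about with a rough cost model, sketched below under the simplifying assumption that each refresh costs one full pass over the data plus one communication round, while each local step costs one sample gradient; the function rounds_and_comm and the example numbers are hypothetical.

    def rounds_and_comm(n_samples, n_steps_total, refresh_interval):
        """Number of reference refreshes (communication rounds) and the fraction of
        total gradient evaluations spent on them, assuming one full pass per refresh."""
        refreshes = n_steps_total // refresh_interval
        full_pass_work = refreshes * n_samples
        local_work = n_steps_total
        return refreshes, full_pass_work / (full_pass_work + local_work)

    # For example, 1_000_000 samples, 100_000 local steps, and refresh_interval=10_000
    # gives 10 communication rounds but roughly 99% of gradient evaluations in the
    # full passes, suggesting the interval should grow with the dataset size.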