Dask
Dask is an open-source Python library for scaling analytics from a single computer to a cluster. It provides parallel computing and a set of high-level collections whose APIs resemble the familiar NumPy and pandas interfaces, so that large or out-of-core computations need few changes to existing code.
The core components of Dask include Dask arrays, Dask dataframes, and Dask bags, which mirror NumPy arrays, pandas DataFrames, and Python lists or iterators, respectively.
Dask's distributed scheduler is built around a central scheduler process that coordinates multiple workers, together with a diagnostic dashboard for monitoring computations.
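A small local sketch of the scheduler, workers, and dashboard is given here, assuming the separate distributed package is installed. The function square and the worker counts are hypothetical choices for illustration. Setting processes=False keeps the scheduler and workers inside the current process, which is convenient for experimentation but not how a real cluster would be deployed.

```python
from dask.distributed import Client


def square(n):
    """Hypothetical task used only for this illustration."""
    return n * n


# Start an in-process scheduler with two single-threaded workers.
# dashboard_address=":0" asks for a free port for the dashboard.
client = Client(
    processes=False,
    n_workers=2,
    threads_per_worker=1,
    dashboard_address=":0",
)
try:
    # The scheduler assigns each task to a worker and tracks the results.
    futures = client.map(square, range(5))
    total = client.submit(sum, futures).result()
    # URL of the dashboard served by the scheduler.
    dashboard_url = client.dashboard_link
finally:
    client.close()
```

On a real cluster, the same Client class would typically be pointed at the address of an already running scheduler instead of starting one locally.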
Dask integrates with the broader Python data ecosystem, working seamlessly with NumPy, pandas, and scikit-learn. It
Originating as an open-source project led by Matthew Rocklin, Dask is maintained by a community of contributors