volumeml
volumeml is a Python library for managing and manipulating large datasets, particularly those stored in object storage such as Amazon S3 or Google Cloud Storage. It aims to simplify common machine learning workflows on data that may not fit in memory. The library provides an interface for reading, writing, and applying basic transformations to data spread across multiple files, hiding the details of parallel processing and data partitioning.
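To make the "data that may not fit into memory" idea concrete, here is a minimal sketch of streaming over a dataset split across several CSV files. It uses only the Python standard library and is not volumeml's API; the helper names `iter_rows` and `column_mean` are hypothetical and chosen for illustration only.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator


def iter_rows(paths: Iterable[Path]) -> Iterator[dict]:
    """Yield rows one at a time from several CSV partitions (hypothetical helper)."""
    for path in paths:
        with open(path, newline="") as handle:
            # DictReader reads lazily, so only one row is held in memory at a time.
            yield from csv.DictReader(handle)


def column_mean(rows: Iterable[dict], column: str) -> float:
    """Compute the mean of a numeric column in a single streaming pass."""
    total = 0.0
    count = 0
    for row in rows:
        total += float(row[column])
        count += 1
    if count == 0:
        raise ValueError("no rows to average")
    return total / count
```

Because `iter_rows` is a generator, a call such as `column_mean(iter_rows(paths), "x")` never materializes the full dataset. A library of this kind would presumably layer remote storage access and parallelism on top of the same pattern, but that part is an assumption and is not shown here.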
Key features of volumeml include its ability to handle data in various formats, such as CSV, Parquet,
volumeml also supports lazy evaluation, meaning that operations are not executed until their results are explicitly requested.
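The general idea of lazy evaluation can be sketched in a few lines of plain Python. This is an illustration of the technique under simple assumptions, not volumeml's actual interface; the class `LazyPipeline` and its methods `map`, `filter`, and `collect` are hypothetical names.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional, Tuple

Step = Tuple[str, Callable[[Any], Any]]


class LazyPipeline:
    """Records transformations and applies them only when collect() is called."""

    def __init__(self, source: Iterable[Any], steps: Optional[List[Step]] = None):
        self._source = source
        self._steps: List[Step] = list(steps or [])

    def map(self, fn: Callable[[Any], Any]) -> "LazyPipeline":
        # Only record the step; no element is touched here.
        return LazyPipeline(self._source, self._steps + [("map", fn)])

    def filter(self, predicate: Callable[[Any], bool]) -> "LazyPipeline":
        # Only record the step; no element is touched here.
        return LazyPipeline(self._source, self._steps + [("filter", predicate)])

    def _iterate(self) -> Iterator[Any]:
        # Chain the recorded steps using the built-in lazy map/filter iterators.
        stream: Iterator[Any] = iter(self._source)
        for kind, fn in self._steps:
            stream = map(fn, stream) if kind == "map" else filter(fn, stream)
        return stream

    def collect(self) -> List[Any]:
        # The work happens here, when the result is explicitly requested.
        return list(self._iterate())


# Building the pipeline performs no computation.
pipeline = LazyPipeline(range(5)).map(lambda x: x * 2).filter(lambda v: v > 2)
# Requesting the result triggers execution.
result = pipeline.collect()
```

Deferring execution this way lets a library inspect the whole chain of operations before running it, which is typically what makes it possible to fuse steps or skip data that a later filter would discard.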