Polyppit
Polyppit is an open-source software framework for building and executing data processing pipelines. It enables scientists and engineers to compose modular tasks, track data lineage, and run the same analyses on anything from a laptop to a cloud cluster. The framework emphasizes reproducibility, portability, and collaborative development through a plug-in architecture that supports diverse data formats and compute backends.
Polyppit originated from a collaboration between computational biology and data-engineering researchers. The project began in 2019
Architecture: The core consists of an execution engine, a plugin registry, and a data model for polyplets,
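The relationship between a plugin registry and an execution engine can be illustrated with a small, generic sketch. It does not use Polyppit's actual API, which is not shown here: the names REGISTRY, register, Record, and run_pipeline are hypothetical and chosen only for illustration. The sketch assumes that tasks are plain functions registered under a string name, and that lineage is recorded as the ordered list of task names applied to a value.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical names throughout: none of this is Polyppit's real API.
REGISTRY: Dict[str, Callable[[Any], Any]] = {}


def register(name: str):
    """Decorator that adds a task function to the registry under `name`."""
    def decorator(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
        REGISTRY[name] = func
        return func
    return decorator


@dataclass
class Record:
    """A value together with the ordered task names that produced it."""
    value: Any
    lineage: List[str] = field(default_factory=list)


def run_pipeline(steps: List[str], initial: Any) -> Record:
    """Toy execution engine: look up each step by name and apply it in order."""
    record = Record(initial)
    for name in steps:
        task = REGISTRY[name]  # raises KeyError if the task is not registered
        record = Record(task(record.value), record.lineage + [name])
    return record


@register("strip")
def strip_whitespace(lines: List[str]) -> List[str]:
    return [line.strip() for line in lines]


@register("drop_empty")
def drop_empty(lines: List[str]) -> List[str]:
    return [line for line in lines if line]


result = run_pipeline(["strip", "drop_empty"], ["  a ", "", " b"])
print(result.value)    # cleaned lines
print(result.lineage)  # names of the tasks applied, in order
```

Looking tasks up by name rather than importing them directly is what lets third-party plug-ins add steps without changes to the engine; a real framework would add scheduling, error handling, and persistent lineage storage on top of this idea.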
Usage: Polyppit has been applied in genomics, environmental modeling, and large-scale data analytics. It supports common
Reception and governance: The project is maintained by an open community with a merit-based governance model