TorchServe
TorchServe is an open-source model serving framework for PyTorch that simplifies deploying trained models for production-scale inference. It supports multi-model serving, model versioning, and dynamic batching, and it scales with minimal additional code. TorchServe was originally developed by AWS in collaboration with Meta (Facebook). It is part of the PyTorch ecosystem and is maintained by the PyTorch community with contributions from industry and research teams. TorchServe is released under the Apache 2.0 license.
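To make the idea of dynamic batching concrete, the following is a minimal, self-contained simulation of the general technique. It is not TorchServe's implementation. The function name `group_into_batches` and its parameters `max_batch_size` and `max_delay_ms` are assumptions chosen for illustration. A batch is closed either when it is full or when a new request arrives after the waiting window has expired.

```python
from typing import List, Sequence, Tuple


def group_into_batches(
    arrivals: Sequence[Tuple[int, str]],
    max_batch_size: int,
    max_delay_ms: int,
) -> List[List[str]]:
    """Group (arrival_time_ms, request_id) pairs, sorted by time, into batches.

    Illustrative only: a batch closes when it holds max_batch_size requests,
    or when the next request arrives more than max_delay_ms after the first
    request of the current batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    window_start = 0

    for arrival_ms, request_id in arrivals:
        # The waiting window expired before this request arrived: flush.
        if current and arrival_ms - window_start > max_delay_ms:
            batches.append(current)
            current = []
        # The first request of a new batch opens a new waiting window.
        if not current:
            window_start = arrival_ms
        current.append(request_id)
        # The batch is full: flush immediately.
        if len(current) == max_batch_size:
            batches.append(current)
            current = []

    # Flush whatever is left once the input is exhausted.
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    demo = [(0, "a"), (10, "b"), (20, "c"), (200, "d"), (210, "e")]
    print(group_into_batches(demo, max_batch_size=2, max_delay_ms=50))
```

The trade-off in this sketch is the usual one: a larger batch size improves accelerator utilization, while a shorter delay bounds how long an individual request waits for companions.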
Core features include a model-archiver tool to package models into MAR files that bundle the serialized model,
TorchServe can be deployed in containers or on Kubernetes, which eases integration into existing ML pipelines.
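The packaging step mentioned above can be sketched as a small helper that assembles a `torch-model-archiver` command line. This is a hedged illustration: the helper name `build_archiver_command`, the model name, and all file paths are hypothetical, and the code only builds the argument list without running it. The flags shown (`--model-name`, `--version`, `--serialized-file`, `--handler`, `--extra-files`, `--export-path`) are those of the archiver CLI.

```python
from typing import List, Optional, Sequence


def build_archiver_command(
    model_name: str,
    version: str,
    serialized_file: str,
    handler: str,
    extra_files: Optional[Sequence[str]] = None,
    export_path: str = "model_store",
) -> List[str]:
    """Return the argument list for a torch-model-archiver invocation.

    Nothing is executed here; the caller decides whether to run it.
    """
    cmd = [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--export-path", export_path,
    ]
    if extra_files:
        # The CLI expects extra files as one comma-separated argument.
        cmd += ["--extra-files", ",".join(extra_files)]
    return cmd


if __name__ == "__main__":
    # Hypothetical model name and paths, for illustration only.
    command = build_archiver_command(
        model_name="example_classifier",
        version="1.0",
        serialized_file="weights/model.pt",
        handler="image_classifier",
        extra_files=["index_to_name.json"],
    )
    print(" ".join(command))
```

If the archiver is installed, the returned list could be passed to `subprocess.run(command, check=True)` to produce the MAR file in the export directory.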