Microbatching
Microbatching refers to the practice of dividing a stream of data or a workload into small batches, each much smaller than a conventional large batch, and processing each batch as a discrete unit. It aims to balance the low latency of per-record real-time processing with the throughput benefits of batch computing. Micro-batches are typically fixed in size or duration, and the system processes them in a staged pipeline.
In streaming systems, micro-batching is used to transform an unbounded data stream into a sequence of finite batches, each of which can be processed with ordinary batch semantics. Spark Streaming popularized this model: incoming records are buffered for a fixed interval (for example, one second) and each interval's buffer is then processed as a small batch job.
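The core idea of chopping an unbounded stream into finite batches can be sketched as follows. This is a minimal illustration using size-based batching only; the function name and interface are hypothetical, not drawn from any particular streaming framework.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def microbatch(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield fixed-size micro-batches from a (possibly unbounded) stream.

    The final batch may be smaller if the stream ends mid-batch.
    """
    it = iter(stream)
    while True:
        # Pull at most batch_size items; islice stops early if the stream ends.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Example: group a stream of 10 events into micro-batches of 4.
batches = list(microbatch(range(10), 4))
# batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Real systems typically close a batch on whichever comes first, a size limit or a time interval, so that a slow stream cannot stall a partially filled batch indefinitely.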
In machine learning and deployment, micro-batching serves two related roles. For inference serving, requests that arrive within a short window are grouped into a single batch so the model can exploit hardware parallelism, amortizing per-call overhead across many requests. In training, gradient accumulation processes several micro-batches sequentially and sums their gradients before applying an update, simulating a larger effective batch size than fits in memory.
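The inference-serving role can be sketched as a batcher that closes a batch when it reaches a size limit or when a timeout expires, whichever comes first. This is an illustrative sketch, not a production server loop; the function name, parameters, and the use of a thread-safe queue are assumptions.

```python
import queue
import time
from typing import List

def collect_microbatch(requests: "queue.Queue[str]",
                       max_size: int,
                       max_wait_s: float) -> List[str]:
    """Collect pending requests into one micro-batch.

    Blocks until the first request arrives, then closes the batch when it
    reaches max_size or when max_wait_s has elapsed since that first
    request, whichever comes first.
    """
    batch: List[str] = []
    deadline = None
    while len(batch) < max_size:
        # Block indefinitely for the first item; afterwards, wait only
        # until the batching window expires.
        timeout = None if deadline is None else max(0.0, deadline - time.monotonic())
        try:
            item = requests.get(timeout=timeout)
        except queue.Empty:
            break  # window expired; ship whatever we have
        if deadline is None:
            deadline = time.monotonic() + max_wait_s
        batch.append(item)
    return batch

# Example: three queued requests with max_size=2 yield batches of 2, then 1.
q: "queue.Queue[str]" = queue.Queue()
for r in ["a", "b", "c"]:
    q.put(r)
first = collect_microbatch(q, max_size=2, max_wait_s=0.05)
second = collect_microbatch(q, max_size=2, max_wait_s=0.05)
# first == ["a", "b"]; second == ["c"]
```

The size limit bounds memory and tail latency for the batch, while the timeout bounds how long an early request waits for stragglers; tuning this pair is the central knob in micro-batched serving.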
Advantages include lower latency than large-batch processing, better resource utilization than per-record processing, and compatibility with existing batch-oriented pipelines. The main cost is that end-to-end latency can never fall below the batch interval, so true record-at-a-time streaming remains preferable when millisecond-level latency is required.