Subpartitioning
Subpartitioning is a database design technique that extends partitioning by dividing each partition into smaller, more manageable subpartitions. In a two-level scheme, data is first partitioned using a partition key (the top level), and each partition is further subdivided according to a subpartition key (the second level). Subpartitioning can use various schemes, such as range, list, or hash, and many systems allow the subpartitioning method to be the same as or different from the top-level partitioning method.
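As a rough illustration of the two-level scheme, the following Python sketch routes a row to a range partition and then to a hash subpartition. It is a simulation under stated assumptions, not the behaviour of any particular database: the column names (`order_date`, `customer_id`), the partition names, the boundaries, the subpartition count, and the use of CRC32 as the hash are all hypothetical choices made for the example.

```python
import bisect
import zlib
from datetime import date

# Hypothetical exclusive upper bounds for range partitions on order_date.
# Anything below the first bound falls into the first partition.
RANGE_BOUNDS = [date(2024, 1, 1), date(2025, 1, 1), date(2026, 1, 1)]
PARTITION_NAMES = ["p2023", "p2024", "p2025"]
NUM_SUBPARTITIONS = 4  # assumed number of hash subpartitions per partition


def locate(order_date, customer_id):
    """Return (partition name, subpartition number) for one row."""
    # Top level: pick the first partition whose exclusive upper bound
    # is strictly greater than the partition key.
    idx = bisect.bisect_right(RANGE_BOUNDS, order_date)
    if idx >= len(PARTITION_NAMES):
        raise ValueError(f"no partition accepts {order_date}")
    # Second level: a stable hash of the subpartition key, modulo the count.
    # CRC32 is used only because it is deterministic across runs; real
    # systems use their own internal hash functions.
    sub = zlib.crc32(customer_id.encode("utf-8")) % NUM_SUBPARTITIONS
    return PARTITION_NAMES[idx], sub
```

Calling `locate(date(2024, 1, 1), "c42")` selects partition `p2024`, because the upper bound of `p2023` is exclusive, and a subpartition number between 0 and 3 that depends only on the customer identifier.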
- A partition key selects the primary partitions, such as date ranges or regional groups.
- Within each partition, a subpartition key determines the subpartitions, distributing data more finely across storage units.
- Queries that reference the partition and subpartition keys can benefit from partition pruning, reducing I/O by skipping partitions and subpartitions that cannot contain matching rows.
- Subpartitions can be managed individually for maintenance tasks like archiving, dropping, or backing up.
- Subpartitioning can improve query performance on large partitioned tables by reducing the amount of data scanned.
- It allows better load balancing and parallelism for I/O and processing, since subpartitions can be processed independently.
- It enables more granular maintenance, allowing operations on specific subpartitions without affecting the entire partition.
- Subpartitioning is common in large data warehouses, time-series workloads, and multi-tenant environments where data can be evenly distributed across subpartitions.
- Subpartitioning adds design and maintenance complexity, requiring careful selection of partition and subpartition keys.
- Some queries may not benefit if predicates do not align with partitioning keys.
- Indexes can be local to subpartitions, which affects both query performance and storage requirements.
- Not all database systems support subpartitioning, or support it for all table types and storage engines.
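The effect of pruning, and of predicates that do not align with the keys, can be sketched with a small Python simulation. This is a minimal model under assumptions, not how any specific optimizer works: the partition layout, the subpartition count, the `prune` function, and the CRC32 hash are hypothetical and chosen only for illustration.

```python
import zlib
from datetime import date

# Hypothetical layout: each range partition covers [low, high) on a date key.
PARTITIONS = {
    "p2023": (date(2023, 1, 1), date(2024, 1, 1)),
    "p2024": (date(2024, 1, 1), date(2025, 1, 1)),
    "p2025": (date(2025, 1, 1), date(2026, 1, 1)),
}
NUM_SUBPARTITIONS = 4  # assumed number of hash subpartitions per partition


def prune(date_from=None, date_to=None, customer_id=None):
    """Return the (partition, subpartition) pairs a query must scan.

    date_from is inclusive, date_to is exclusive; None means unbounded.
    """
    # Partition-level pruning: keep partitions whose range overlaps
    # the date predicate.
    parts = [
        name
        for name, (low, high) in PARTITIONS.items()
        if (date_to is None or low < date_to)
        and (date_from is None or high > date_from)
    ]
    # Subpartition-level pruning: an equality predicate on the hash key
    # pins one subpartition; without it, all subpartitions stay candidates.
    if customer_id is None:
        subs = list(range(NUM_SUBPARTITIONS))
    else:
        subs = [zlib.crc32(customer_id.encode("utf-8")) % NUM_SUBPARTITIONS]
    return [(p, s) for p in parts for s in subs]
```

In this model a query with no predicate on either key scans all 12 subpartitions, a one-month date range narrows the scan to the 4 subpartitions of a single partition, and adding an equality predicate on the customer identifier narrows it to 1.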
Examples of use include top-level range partitions by date with subpartitions by hash of a user or