Back to Curriculum

MySQL Partitioning

📚 Lesson 14 of 15 ⏱️ 40 min

MySQL Partitioning

40 min

MySQL partitioning divides large tables into smaller, more manageable pieces called partitions. Each partition is stored separately but can be queried as if it were a single table. Partitioning improves query performance by enabling partition pruning (skipping irrelevant partitions), makes maintenance operations faster, and helps manage very large tables. Understanding partitioning enables you to manage very large datasets effectively.

Range partitioning divides data based on ranges of values, commonly used for time-series data partitioned by date ranges. For example, you might partition a sales table by year, with each partition containing one year's data. Range partitioning is ideal for data with natural ordering and when queries typically filter by the partition key. MySQL automatically prunes partitions that don't contain relevant data, significantly improving query performance.

List partitioning divides data based on specific values, useful when you have discrete categories. For example, you might partition a customer table by region, with partitions for 'North', 'South', 'East', and 'West'. List partitioning is ideal when data naturally falls into distinct categories and queries filter by these categories. Each partition contains rows matching specific values in the partition key.

Hash partitioning distributes data across partitions using a hash function on the partition key. Hash partitioning ensures relatively even distribution of data across partitions, making it useful when you don't have natural ranges or categories. Hash partitioning is less common than range or list partitioning but is useful for distributing load evenly across partitions. The number of partitions is specified when creating the partitioned table.

Key partitioning is similar to hash partitioning but uses MySQL's internal hashing function and can partition by one or more columns. Key partitioning is simpler than hash partitioning as it doesn't require a hash function. Understanding key partitioning enables flexible partitioning strategies.

Partitioning considerations include choosing appropriate partition keys (columns used in WHERE clauses), determining partition size (too many small partitions or too few large partitions both have drawbacks), and managing partition maintenance (adding new partitions, removing old ones). Partitioning adds complexity to schema design and maintenance, so it should be used when tables are large enough to benefit. Understanding when and how to partition enables you to manage very large datasets effectively.

Key Concepts

  • Partitioning divides large tables into smaller, manageable partitions.
  • Range partitioning divides data by value ranges (commonly dates).
  • List partitioning divides data by specific discrete values.
  • Hash and key partitioning distribute data evenly across partitions.
  • Partition pruning automatically skips irrelevant partitions in queries.

Learning Objectives

Master

  • Creating range, list, and hash partitioned tables
  • Understanding partition pruning and performance benefits
  • Managing partitions (adding, removing, maintaining)
  • Choosing appropriate partition keys and strategies

Develop

  • Understanding large-scale database management
  • Designing partitioned schemas for performance
  • Optimizing queries on partitioned tables

Tips

  • Partition by columns frequently used in WHERE clauses for partition pruning.
  • Use range partitioning for time-series data: PARTITION BY RANGE (YEAR(date_column)).
  • Create partitions in advance for time-series data to avoid errors.
  • Use ALTER TABLE ... ADD PARTITION to add new partitions.

Common Pitfalls

  • Creating too many partitions, causing planning overhead.
  • Not partitioning by query filter columns, missing partition pruning benefits.
  • Forgetting to create partitions, causing INSERT failures.
  • Not maintaining partitions, accumulating old data unnecessarily.

Summary

  • Partitioning improves performance and manageability of large tables.
  • Range, list, and hash partitioning suit different use cases.
  • Partition pruning automatically optimizes queries on partitioned tables.
  • Proper partition management is essential for partitioned table success.

Exercise

Create and manage partitioned tables for better performance.

-- Create partitioned table by date range
CREATE TABLE orders_partitioned (
    id INT AUTO_INCREMENT,
    user_id INT,
    total_amount DECIMAL(10,2),
    order_date DATE,
    status VARCHAR(20),
    PRIMARY KEY (id, order_date)
)
PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Insert data into partitioned table
INSERT INTO orders_partitioned (user_id, total_amount, order_date, status) VALUES
(1, 100.00, '2023-01-15', 'completed'),
(2, 200.00, '2024-06-20', 'pending'),
(3, 150.00, '2022-12-10', 'completed');

-- Query specific partition
SELECT * FROM orders_partitioned PARTITION(p2023);

-- Show partition information
SELECT 
    partition_name,
    partition_ordinal_position,
    table_rows
FROM information_schema.partitions
WHERE table_name = 'orders_partitioned';

-- Add new partition
ALTER TABLE orders_partitioned 
ADD PARTITION (PARTITION p2025 VALUES LESS THAN (2026));

-- Remove partition
ALTER TABLE orders_partitioned DROP PARTITION p2022;

-- Reorganize partitions
ALTER TABLE orders_partitioned 
REORGANIZE PARTITION p_future INTO (
    PARTITION p2025 VALUES LESS THAN (2026),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

Exercise Tips

  • Use DEFAULT partition for values that don't match any partition.
  • Monitor partition sizes: check information_schema.partitions for partition information.
  • Use partition pruning: queries with partition key in WHERE automatically skip irrelevant partitions.
  • Plan partition maintenance: add/remove partitions based on data growth patterns.

Code Editor

Output