Imagine you need to run one or more queries on a very large table containing the sales history of an e-commerce site.
If your hardware resources are limited, scanning the whole table could take several minutes, which in turn can mean long table locks and wasted time for the users.
Since version 5.1, MySQL has supported partitioning, a mechanism that lets you split a table's data according to how it is accessed.
There are several partitioning types. The most popular mechanisms are:
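– RANGE: each row is assigned to a partition according to the range its column value falls into
– HASH: each row is assigned to a partition according to a hash computed from a column value, which spreads rows evenly across a fixed number of partitions
– LIST: works like RANGE, but each partition is defined by an explicit list of values, which do not have to be adjacent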
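As a rough illustration of the HASH and LIST types, here is a minimal sketch. The tables sales_by_user and sales_by_region, the region_id column and the partition names are hypothetical and exist only for this example.

```sql
-- HASH: MySQL computes MOD(user, 4) to pick the partition,
-- so rows are spread over four partitions created automatically.
CREATE TABLE sales_by_user (
  id INT NOT NULL,
  user BIGINT NOT NULL,
  total FLOAT NOT NULL DEFAULT 0
) ENGINE=InnoDB
PARTITION BY HASH(user)
PARTITIONS 4;

-- LIST: each partition names the exact values it accepts;
-- the values do not have to be adjacent.
CREATE TABLE sales_by_region (
  id INT NOT NULL,
  region_id INT NOT NULL,
  total FLOAT NOT NULL DEFAULT 0
) ENGINE=InnoDB
PARTITION BY LIST(region_id) (
  PARTITION p_north VALUES IN (1, 2, 5),
  PARTITION p_south VALUES IN (3, 4, 8)
);
```

With LIST partitioning, inserting a row whose region_id is not in any list is rejected, so the lists need to cover every value you expect.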
A table can be partitioned in two ways:
– when the table is created
CREATE TABLE sales (
id int NOT NULL,
order_date DATETIME NOT NULL,
user bigint NOT NULL,
total float NOT NULL DEFAULT '0',
receipt_id bigint NULL
) ENGINE=InnoDB PARTITION BY RANGE(YEAR(order_date)) (
PARTITION p_history VALUES LESS THAN (2014),
PARTITION p_data VALUES LESS THAN MAXVALUE
);
– when the structure of an existing table is modified
ALTER TABLE sales PARTITION BY RANGE(YEAR(order_date)) (
PARTITION p_history VALUES LESS THAN (2014),
PARTITION p_data VALUES LESS THAN MAXVALUE
);
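The same ALTER TABLE mechanism can also adjust the partitions later, for example to give a finished year its own partition. The following is only a sketch based on the sales table above; the partition name p_2014 and the boundary value are assumptions chosen for illustration.

```sql
-- Split the catch-all partition so that 2014 gets its own partition.
-- p_2014 is an illustrative name; p_data keeps collecting later years.
ALTER TABLE sales REORGANIZE PARTITION p_data INTO (
  PARTITION p_2014 VALUES LESS THAN (2015),
  PARTITION p_data VALUES LESS THAN MAXVALUE
);
```

Only the rows in p_data are moved by this statement; p_history is left untouched.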
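An important tool for debugging queries against a partitioned table is the following command, placed in front of the query you want to inspect:
EXPLAIN PARTITIONS SELECT * FROM sales WHERE order_date >= '2014-01-01';
The command output shows which partitions MySQL will read to execute the specified query. From MySQL 5.7 onwards, plain EXPLAIN already includes this information, and the PARTITIONS keyword was removed in MySQL 8.0.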
Partitioning is a very effective tool, but it may bring no benefit if you do not choose the right partitioning strategy: partitioning a sales table by creation date and then filtering the data by user may be useless, whatever the partition boundaries, or may even make things worse.
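The difference can be checked on the sales table defined earlier. This is a sketch only: the user id 42 and the date range are made-up values, and the exact output depends on your server version and data.

```sql
-- Filter on the partitioning column: the optimizer can skip p_history,
-- so the partitions column should list only p_data.
EXPLAIN PARTITIONS
SELECT SUM(total) FROM sales
WHERE order_date >= '2014-01-01' AND order_date < '2015-01-01';

-- Filter on a column that is not part of the partitioning expression:
-- no partition can be excluded, so both p_history and p_data are read.
EXPLAIN PARTITIONS
SELECT SUM(total) FROM sales
WHERE user = 42;
```

On MySQL 5.7 and later, drop the PARTITIONS keyword and use plain EXPLAIN. If most of your queries look like the second one, partitioning by user (for example with HASH) is likely a better fit than partitioning by date.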