MySQL officially defines an index as: an Index is a data structure that helps MySQL retrieve data efficiently.

As we know, database queries are one of the most important functions of a database. We all want data to be queried as quickly as possible, so the designers of database systems optimize from the perspective of query algorithms. The most basic query algorithm is, of course, sequential search (linear
search). This O(n) complexity algorithm is obviously poor when the amount of data is large. Fortunately, developments in computer science have provided many much better search algorithms, such as binary search (binary search) and binary tree search (binary tree
search). A slight analysis shows that each search algorithm can only be applied to specific data structures. For example, binary search requires the data being searched to be ordered, while binary tree search can only be applied to a binary search tree. However, the organizational structure of the data itself cannot possibly satisfy all kinds of data structures completely (for example, in theory it is impossible to organize two columns in order at the same time). Therefore, in addition to the data itself, the database system also maintains data structures that satisfy specific search algorithms. These data structures reference (point to) the data in some way, making it possible to implement advanced search algorithms on them. This kind of data structure is an index.
Let’s look at an example:

Figure 1.png

Figure 1 shows one possible indexing method. On the left is a data table with two columns and seven records in total. The far-left side shows the physical addresses of the data records (note that records that are logically adjacent are not necessarily physically adjacent on disk). To speed up searches on Col2, a binary search tree like the one shown on the right can be maintained. Each node contains an index key value and a pointer to the physical address of the corresponding data record, making it possible to use binary search to retrieve the corresponding data with O(log2n) complexity.
Although this is a genuine index, in practice database systems almost never use binary search trees or their evolved variant, the red-black tree, to implement indexes. The reason will be explained below.

B-Tree and B+Tree

At present, most database systems and file systems use B-Tree or its variant B+Tree as their index structure. In the next section of this article, we will discuss why B-Tree and B+Tree are so widely used for indexing in conjunction with storage principles and computer access principles. This section will first describe them purely from the perspective of data structures.

B-Tree

It is a multiway search tree (not a binary tree):
1. Any non-leaf node is defined to have at most M children; and M>2;
2. The root node has [2, M] children;
3. Except for the root node, non-leaf nodes have [M/2, M] children;
4. Each node stores at least M/2-1 (rounded up) and at most M-1 keywords; (at least 2 keywords)
5. For a non-leaf node, the number of keywords = the number of pointers to children – 1;
6. The keywords of a non-leaf node are: K[1], K[2], …, K[M-1]; and K[i] < K[i+1];
7. The pointers of a non-leaf node are: P[1], P[2], …, P[M]; where P[1] points to the
subtree whose keywords are less than K[1], P[M] points to the subtree whose keywords are greater than K[M-1], and the other P[i] point to subtrees whose keywords fall within (K[i-1], K[i]);
8. All leaf nodes are located on the same level;
9. Each k corresponds to one piece of data.
For example: (M=3) it is equivalent to a 2–3 tree. A 2–3 tree is such a tree:
each of its nodes either has 2 children and 1 data element, or 3 children and 2 data elements. Leaf nodes have no children and contain 1 or 2 data elements.

B-tree search starts from the root node and performs a binary search on the ordered sequence of keywords within the node. If a match is found, it ends; otherwise, it moves to the child node corresponding to the range containing the query keyword. This is repeated until the corresponding child pointer is null or a leaf node has been reached. The pseudocode for the search algorithm on a B-Tree is as follows:

BTree_Search(node, key) { if(node == null) return null; foreach(node.key) { if(node.key[i] == key) return node.data[i]; if(node.key[i] > key) return BTree_Search(point[i]->node); } return BTree_Search(point[i+1]->node); } data = BTree_Search(root, my_key);

Characteristics of a B-tree:
1. The set of keywords is distributed throughout the entire tree;
2. Any keyword appears in one and only one node;
3. A search may end at a non-leaf node;
4. Its search performance is equivalent to performing one binary search over the complete set of keywords;
5. Automatic level control;
Self-balancing of the B-tree:
Each internal node in a B-tree contains a certain number of keys. Typically, the number of keys is chosen to be between d and 2d. In practice, the keys occupy most of the space in a node. The factor of 2 ensures that nodes can be split or merged. If an internal node has 2d keys, then the process of adding one more key to this node will split the 2d keys into two nodes with d keys each, and add this key to the parent node. Each split node must contain the minimum number of keys. Similarly, if an internal node and its neighbor both have d keys, then one key will be deleted by merging it with its neighbor. Deleting this key will cause this node to have d-1 keys; merging with the neighbor then adds d keys, plus one key moved from the parent node of the neighboring node. The result is a completely full set of 2d keys.
The construction process of a B-tree:
Below is the sequential insertion into a B-tree of
6 10 4 14 5 11 15 3 2 12 1 7 8 8 6 3 6 21 5 15 15 6 32 23 45 65 7 8
6 5 4

btreebuild.gif

B+Tree

There are many variants of the B-Tree, among which the most common is the B+Tree. For example, MySQL widely uses the B+Tree to implement its index structure.

Compared with the B-Tree, the B+Tree has the following differences:
1. The number of subtree pointers in a non-leaf node is the same as the number of keywords;
2. The subtree pointer P[i] of a non-leaf node points to the subtree whose keyword values belong to [K[i], K[i+1]) (the B-tree uses an open interval);
3. A linked pointer is added for all leaf nodes;
4. All keywords appear in the leaf nodes;
5. Internal nodes do not store data, only keys
For example: (M=3)

Paste_Image.png

The search process of a B+ tree is basically the same as that of a B-tree. The difference is that a B+ tree only hits when it reaches a leaf node (a B-tree can hit at a non-leaf node), and its performance is also equivalent to performing a binary search over the full set of keys;

Characteristics of B+:
1. All keys appear in the linked list of leaf nodes (dense index), and the keys in the list are exactly in order;
2. It is impossible to hit at a non-leaf node;
3. Non-leaf nodes are equivalent to indexes of leaf nodes (sparse indexes), while leaf nodes are equivalent to the data layer that stores (key) data;
4. It is more suitable for file indexing systems;

The construction process of a B+ tree:
Below, the following are inserted into the B+ tree in sequence:
6 10 4 14 5 11 15 3 2 12 1 7 8 8 6 3 6 21 5 15 15 6 32 23 45 65 7 8
6 5 4

Bplustreebuild.gif

Physical storage of indexes

Generally speaking, an index itself is also quite large and cannot be stored entirely in memory, so indexes are often stored on disk in the form of index files. In that case, index lookups will incur disk I/O overhead. Compared with memory access, the cost of I/O access is several orders of magnitude higher, so the most important criterion for evaluating whether a data structure is good as an index is the asymptotic complexity of the number of disk I/O operations during the lookup process. In other words, the structural organization of an index should minimize the number of disk I/O accesses during lookup.

B-tree

Paste_Image.png

Suppose each disk block can hold exactly one B-tree node (exactly two file names). Then one BTNODE node represents one disk block, and the subtree pointers are the addresses where other disk blocks are stored.

Next, let’s simulate the process of finding file 29:
1. Use the root node pointer to find disk block 1, the root disk block of the file directory, and load its information into memory. [1 disk I/O operation]
2. At this point, there are two file names, 17 and 35, in memory, along with three pieces of data storing the addresses of other disk pages. According to the algorithm, we find that 17<29<35, so we locate pointer p2.
3. Based on pointer p2, we locate disk block 3 and load its information into memory. [2 disk I/O operations]
4. At this point, there are two file names, 26 and 30, in memory, along with three pieces of data storing the addresses of other disk pages. According to the algorithm, we find that 26<29<30, so we locate pointer p2.
5. Based on pointer p2, we locate disk block 8 and load its information into memory. [3 disk I/O operations]
6. At this point, there are two file names, 28 and 29, in memory. According to the algorithm, we find file name 29 and determine the disk address of the file in memory.
From the process above, we can see that 3 disk I/O operations and 3 in-memory search operations are required. As for searching file names in memory, since it is an ordered table structure, binary search can be used to improve efficiency. As for I/O operations, they are the decisive factor affecting the overall search efficiency of a B-tree.

Of course, if we use the disk storage structure of a balanced binary tree for searching, it would take 4 disk accesses, up to 5 at most. Also, the more files there are, the fewer disk I/O operations a B-tree will require compared with a balanced binary tree, and the higher its efficiency will be.

B+tree

Paste_Image.png

Advantages of B+tree:
1) B+-tree has lower disk read/write cost
B+-tree internal nodes do not contain pointers to the specific information of keywords. Therefore, their internal nodes are smaller than those of a B
tree. If all the keywords of the same internal node are stored in the same disk block, then that disk block can hold more keywords. More of the keywords to be searched can be loaded into memory at one time. Relatively speaking, the number of I/O reads and writes is also reduced.
For example, suppose one disk block can hold 16 bytes, a keyword takes 2 bytes, and a pointer to the specific information of a keyword takes 2 bytes. The internal node of a 9-order B-tree (a node can contain at most 8 keywords) requires 2 disk blocks, while the internal node of a B+
tree requires only 1 disk block. When an internal node needs to be loaded into memory, a B-tree takes one more disk block lookup time than a B+
tree (on disk, this is the time for the platter to rotate).
2) B+-tree has more stable query efficiency
Since non-terminal nodes are not the nodes that ultimately point to the file contents, but only indexes of keywords in the leaf nodes, any keyword search must follow a path from the root node to a leaf node. All keyword queries have the same path length, resulting in roughly the same query efficiency for every piece of data.

The index storage mechanisms of two MySQL storage engines

MyISAM index implementation

The MyISAM engine uses a B+Tree as its index structure, and the data field of the leaf nodes stores the addresses of data records. The following diagram illustrates how MyISAM indexes work:

Paste_Image.png

Suppose the table has three columns, and we use Col1 as the primary key. Then the figure above is a schematic diagram of the primary index (Primary
key) of a MyISAM table. As can be seen, the MyISAM index file only stores the addresses of data records. In MyISAM, there is no structural difference between the primary index and the secondary index (Secondary
key); the only difference is that the primary index requires the key to be unique, while the secondary index key may contain duplicates. If we create a secondary index on Col2, the structure of that index is shown below:

Paste_Image.png

This is also a B+Tree, and its data field stores the addresses of data records. Therefore, the index retrieval algorithm in MyISAM first searches the index using the B+Tree search algorithm. If the specified key exists, it retrieves the value in its data field, and then uses that value as an address to read the corresponding data record.
MyISAM’s indexing method is also called “non-clustered.”

InnoDB index implementation

Although InnoDB also uses a B+Tree as its index structure, its specific implementation is completely different from that of MyISAM.

The first major difference is that the InnoDB data file itself is an index file. As mentioned above, MyISAM index files and data files are separate, and the index file stores only the addresses of data records. In InnoDB, however, the table data file itself is an index structure organized as a B+Tree, and the data field of this tree’s leaf nodes stores the complete data records. The key of this index is the table’s primary key, so the InnoDB table data file itself is the primary index.

Paste_Image.png

The figure above is a schematic of the InnoDB primary index (which is also the data file). You can see that the leaf nodes contain complete data records. This kind of index is called a clustered index. Because InnoDB’s data file itself must be clustered by the primary key, InnoDB requires that a table have a primary key (MyISAM does not). If one is not explicitly specified, the MySQL system will automatically choose a column that can uniquely identify a data record as the primary key. If no such column exists, MySQL will automatically generate a hidden field for the InnoDB table as the primary key. This field is 6 bytes long and of the long integer type.

The second difference from a MyISAM index is that the data field of an InnoDB secondary index stores the value of the corresponding record’s primary key rather than its address. In other words, all secondary indexes in InnoDB reference the primary key as the data field. For example, a secondary index defined on Col3:

Paste_Image.png

Here, the ASCII codes of English characters are used as the comparison criterion. This implementation of the clustered index makes searches by primary key highly efficient, but searches using a secondary index require two index lookups: first, the secondary index is searched to obtain the primary key, and then the primary key is used to search the primary index to retrieve the record.

Understanding how indexes are implemented in different storage engines is very helpful for correctly using and optimizing indexes. For example, after understanding InnoDB’s index implementation, it is easy to see why using overly long fields as the primary key is not recommended: because all secondary indexes reference the primary index, an overly long primary index will make the secondary indexes excessively large. Another example is that using a non-monotonic field as the primary key is not a good idea in InnoDB, because the InnoDB data file itself is a B+Tree. A non-monotonic primary key will cause the data file to split and rebalance frequently during new record insertion in order to maintain the properties of the B+Tree, which is very inefficient. Using an auto-increment field as the primary key is therefore a very good choice.

Author: 小灰灰besty
Link: http://www.jianshu.com/p/1775b4ff123a
Source: Jianshu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.

MySQL Index Data Structures Explained

B-Tree and B+Tree

B-Tree

B+Tree

Physical storage of indexes

B-tree

B+tree

The index storage mechanisms of two MySQL storage engines

MyISAM index implementation

InnoDB index implementation

Leave a Comment Cancel Reply

RedGate VPN

告别卡顿