高性能数据库树结构

来源:互联网 发布:windows 发送 消息 编辑:程序博客网 时间:2024/05/01 08:22
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

+-------------+----------------------+-----+-----+
| category_id | name | lft | rgt |
+-------------+----------------------+-----+-----+
| 1 | ELECTRONICS | 1 | 20 |
| 2 | TELEVISIONS | 2 | 9 |
| 3 | TUBE | 3 | 4 |
| 4 | LCD | 5 | 6 |
| 5 | PLASMA | 7 | 8 |
| 6 | PORTABLE ELECTRONICS | 10 | 19 |
| 7 | MP3 PLAYERS | 11 | 14 |
| 8 | FLASH | 12 | 13 |
| 9 | CD PLAYERS | 15 | 16 |
| 10 | 2 WAY RADIOS | 17 | 18 |
+-------------+----------------------+-----+-----+

We use lft and rgt because left and right are reserved words in MySQL, see http://dev.mysql.com/doc/mysql/en/reserved-words.html for the full list of reserved words.

So how do we determine left and right values? We start numbering atthe leftmost side of the outer node and continue to the right:

This design can be applied to a typical tree as well:

When working with a tree, we work from left to right, one layer at atime, descending to each node's children before assigning a right-handnumber and moving on to the right. This approach is called the modifiedpreorder tree traversal algorithm.

Retrieving a Full Tree

We can retrieve the full tree through the use of a self-join that links parents with nodes on the basis that a node's lft value will always appear between its parent's lft and rgt values:

SELECT node.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = 'ELECTRONICS'
ORDER BY node.lft;


+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+----------------------+

Unlike our previous examples with the adjacency list model, thisquery will work regardless of the depth of the tree. We do not concernourselves with the rgt value of the node in our BETWEEN clause becausethe rgt value will always fall within the same parent as the lft values.

Finding all the Leaf Nodes

Finding all leaf nodes in the nested set model even simpler than theLEFT JOIN method used in the adjacency list model. If you look at thenested_category table, you may notice that the lft and rgt values forleaf nodes are consecutive numbers. To find the leaf nodes, we look fornodes where rgt = lft + 1:

SELECT name
FROM nested_category
WHERE rgt = lft + 1;


+--------------+
| name |
+--------------+
| TUBE |
| LCD |
| PLASMA |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+--------------+

Retrieving a Single Path

With the nested set model, we can retrieve a single path without having multiple self-joins:

SELECT parent.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'FLASH'
ORDER BY parent.lft;

+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
+----------------------+

Finding the Depth of the Nodes

We have already looked at how to show the entire tree, but what ifwe want to also show the depth of each node in the tree, to betteridentify how each node fits in the hierarchy? This can be done byadding a COUNT function and a GROUP BY clause to our existing query forshowing the entire tree:

SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;

+----------------------+-------+
| name | depth |
+----------------------+-------+
| ELECTRONICS | 0 |
| TELEVISIONS | 1 |
| TUBE | 2 |
| LCD | 2 |
| PLASMA | 2 |
| PORTABLE ELECTRONICS | 1 |
| MP3 PLAYERS | 2 |
| FLASH | 3 |
| CD PLAYERS | 2 |
| 2 WAY RADIOS | 2 |
+----------------------+-------+

We can use the depth value to indent our category names with the CONCAT and REPEAT string functions:

SELECT CONCAT( REPEAT(' ', COUNT(parent.name) - 1), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;

+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+-----------------------+

Of course, in a client-side application you will be more likely touse the depth value directly to display your hierarchy. Web developerscould loop through the tree, adding <li></li> and<ul></ul> tags as the depth number increases and decreases.

Depth of a Sub-Tree

When we need depth information for a sub-tree, we cannot limiteither the node or parent tables in our self-join because it willcorrupt our results. Instead, we add a third self-join, along with asub-query to determine the depth that will be the new starting pointfor our sub-tree:

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
(
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name
ORDER BY node.lft
)AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
AND sub_parent.name = sub_tree.name
GROUP BY node.name
ORDER BY node.lft;


+----------------------+-------+
| name | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS | 0 |
| MP3 PLAYERS | 1 |
| FLASH | 2 |
| CD PLAYERS | 1 |
| 2 WAY RADIOS | 1 |
+----------------------+-------+

This function can be used with any node name, including the root node. The depth values are always relative to the named node.

Find the Immediate Subordinates of a Node

Imagine you are showing a category of electronics products on aretailer web site. When a user clicks on a category, you would want toshow the products of that category, as well as list its immediatesub-categories, but not the entire tree of categories beneath it. Forthis, we need to show the node and its immediate sub-nodes, but nofurther down the tree. For example, when showing the PORTABLEELECTRONICS category, we will want to show MP3 PLAYERS, CD PLAYERS, and2 WAY RADIOS, but not FLASH.

This can be easily accomplished by adding a HAVING clause to our previous query:

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
nested_category AS parent,
nested_category AS sub_parent,
(
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'PORTABLE ELECTRONICS'
GROUP BY node.name
ORDER BY node.lft
)AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;

+----------------------+-------+
| name | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS | 0 |
| MP3 PLAYERS | 1 |
| CD PLAYERS | 1 |
| 2 WAY RADIOS | 1 |
+----------------------+-------+

If you do not wish to show the parent node, change the HAVING depth <= 1 line to HAVING depth = 1.

Aggregate Functions in a Nested Set

Let's add a table of products that we can use to demonstrate aggregate functions with:

CREATE TABLE product(
product_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(40),
category_id INT NOT NULL
);


INSERT INTO product(name, category_id) VALUES('20" TV',3),('36" TV',3),
('Super-LCD 42"',4),('Ultra-Plasma 62"',5),('Value Plasma 38"',5),
('Power-MP3 5gb',7),('Super-Player 1gb',8),('Porta CD',9),('CD To go!',9),
('Family Talk 360',10);

SELECT * FROM product;

+------------+-------------------+-------------+
| product_id | name | category_id |
+------------+-------------------+-------------+
| 1 | 20" TV | 3 |
| 2 | 36" TV | 3 |
| 3 | Super-LCD 42" | 4 |
| 4 | Ultra-Plasma 62" | 5 |
| 5 | Value Plasma 38" | 5 |
| 6 | Power-MP3 128mb | 7 |
| 7 | Super-Shuffle 1gb | 8 |
| 8 | Porta CD | 9 |
| 9 | CD To go! | 9 |
| 10 | Family Talk 360 | 10 |
+------------+-------------------+-------------+

Now let's produce a query that can retrieve our category tree, along with a product count for each category:

SELECT parent.name, COUNT(product.name)
FROM nested_category AS node ,
nested_category AS parent,
product
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.category_id = product.category_id
GROUP BY parent.name
ORDER BY node.lft;


+----------------------+---------------------+
| name | COUNT(product.name) |
+----------------------+---------------------+
| ELECTRONICS | 10 |
| TELEVISIONS | 5 |
| TUBE | 2 |
| LCD | 1 |
| PLASMA | 2 |
| PORTABLE ELECTRONICS | 5 |
| MP3 PLAYERS | 2 |
| FLASH | 1 |
| CD PLAYERS | 2 |
| 2 WAY RADIOS | 1 |
+----------------------+---------------------+

This is our typical whole tree query with a COUNT and GROUP BYadded, along with a reference to the product table and a join betweenthe node and product table in the WHERE clause. As you can see, thereis a count for each category and the count of subcategories isreflected in the parent categories.

Adding New Nodes

Now that we have learned how to query our tree, we should take alook at how to update our tree by adding a new node. Let's look at ournested set diagram again:

If we wanted to add a new node between the TELEVISIONS and PORTABLEELECTRONICS nodes, the new node would have lft and rgt values of 10 and11, and all nodes to its right would have their lft and rgt valuesincreased by two. We would then add the new node with the appropriatelft and rgt values. While this can be done with a stored procedure inMySQL 5, I will assume for the moment that most readers are using 4.1,as it is the latest stable version, and I will isolate my queries witha LOCK TABLES statement instead:

LOCK TABLE nested_category WRITE;


SELECT @myRight := rgt FROM nested_category
WHERE name = 'TELEVISIONS';



UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;

INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);

UNLOCK TABLES;

We can then check our nesting with our indented tree query:

SELECT CONCAT( REPEAT( ' ', (COUNT(parent.name) - 1) ), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;


+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| GAME CONSOLES |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+-----------------------+

If we instead want to add a node as a child of a node that has noexisting children, we need to modify our procedure slightly. Let's adda new FRS node below the 2 WAY RADIOS node:

LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft FROM nested_category

WHERE name = '2 WAY RADIOS';

UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;

INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2);

UNLOCK TABLES;

In this example we expand everything to the right of the left-handnumber of our proud new parent node, then place the node to the rightof the left-hand value. As you can see, our new node is now properlynested:

SELECT CONCAT( REPEAT( ' ', (COUNT(parent.name) - 1) ), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;


+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| GAME CONSOLES |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
| FRS |
+-----------------------+

Deleting Nodes

The last basic task involved in working with nested sets is theremoval of nodes. The course of action you take when deleting a nodedepends on the node's position in the hierarchy; deleting leaf nodes iseasier than deleting nodes with children because we have to handle theorphaned nodes.

When deleting a leaf node, the process if just the opposite ofadding a new node, we delete the node and its width from every node toits right:

LOCK TABLE nested_category WRITE;


SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'GAME CONSOLES';


DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;


UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;

UNLOCK TABLES;

And once again, we execute our indented tree query to confirm that our node has been deleted without corrupting the hierarchy:

SELECT CONCAT( REPEAT( ' ', (COUNT(parent.name) - 1) ), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;


+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
| FRS |
+-----------------------+

This approach works equally well to delete a node and all its children:

LOCK TABLE nested_category WRITE;


SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'MP3 PLAYERS';


DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;


UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;

UNLOCK TABLES;

And once again, we query to see that we have successfully deleted an entire sub-tree:

SELECT CONCAT( REPEAT( ' ', (COUNT(parent.name) - 1) ), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;


+-----------------------+
| name |
+-----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| CD PLAYERS |
| 2 WAY RADIOS |
| FRS |
+-----------------------+

The other scenario we have to deal with is the deletion of a parentnode but not the children. In some cases you may wish to just changethe name to a placeholder until a replacement is presented, such aswhen a supervisor is fired. In other cases, the child nodes should allbe moved up to the level of the deleted parent:

LOCK TABLE nested_category WRITE;


SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'PORTABLE ELECTRONICS';


DELETE FROM nested_category WHERE lft = @myLeft;


UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;

UNLOCK TABLES;

In this case we subtract two from all elements to the right of thenode (since without children it would have a width of two), and onefrom the nodes that are its children (to close the gap created by theloss of the parent's left value). Once again, we can confirm ourelements have been promoted:

SELECT CONCAT( REPEAT( ' ', (COUNT(parent.name) - 1) ), node.name) AS name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;


+---------------+
| name |
+---------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| CD PLAYERS |
| 2 WAY RADIOS |
| FRS |
+---------------+

Other scenarios when deleting nodes would include promoting one ofthe children to the parent position and moving the child nodes under asibling of the parent node, but for the sake of space these scenarioswill not be covered in this article.

原创粉丝点击