Asimov Technologies
J2EE, JEE, PHP
Frank D. Martínez’s Blog
Software & Fun
Welcome to my technical blog. Here you can find my opinions and proposals over many software development things, mainly on open source projects.
Materialized Path + Nested Sets
I Have found four methods used to encode tree structures in relational databases:
- Adjacency List
- Materialized Paths
- Nested Sets
- Nested intervals
Please read this good article first: http://www.dbazine.com/oracle/or-articles/tropashko4
Each one has its advantages and disadvantages, and I have used Adjacency Lists and Materialized paths many times in different kinds of applications.
Now I will try a mixed approach: Materialized Path + Nested Sets.
The basic idea is to use Materialized paths and at the same time use the path as a boundary of nested sets (children). This method is not suitable for all problems but it is enough in many cases. You can Insert and Delete Tree elements with a simple SQL sentence and extract hierarchic information and aggregated values with simple queries.
Lets do it with an example:
Account catalog:
There is an account catalog with a code, name and type(M=Internal node, D=Leaf node) fileds.
The problem:
We want to store account balances of the leaf nodes and then make a query that retrieves the all tree with all aggregated balances calculated by the fly.
The schema:
First we want to redesign the database and separate the account’s data from the tree structure:
Now we can encode the tree structure using a mixed form of materilized path and nested sets in the account_struct table and store balances and account data in the account table.
Sample data:
Now insert some data to play with:
The Idea is simple: left_ is the materialized path and right_ is the materialized path with an endign ‘Z’ character. So we can make queries using left_ between ‘1.1′ and ‘1.1Z’ instead of left_ like ‘1.1%’
It is all. Now we can make some interesting queries:
1. All ancestors of ‘1.1.2.10′
2. All children of ‘1.2′
3. General Balance sheet
4. Balance sheet of ‘1.1′
5. Balance of ‘1.2.1.05′ and its ancestors
I will play with this method in a small accounting (book keeping) application. I will publish any issue if I found any.
Download the SQL sample code:
6 Responses to “Materialized Path + Nested Sets”
Leave a Reply
This site contains personal opinions about many things and does not intend to be normative in any aspect.
Java, J2EE, J2SE, JEE, Are trademarks of Sun Microsystems Inc.















Why would we need `_right` column? We can use between even without it.
Look at queries on:
1. All ancestors of ‘1.1.2.10′
5. Balance of ‘1.2.1.05′ and its ancestors
_left and _right are boundaries of super sets, they are needed for this kind of hierarchical queries.
It could be …BETWEEN s.left_ AND s.left_ || ‘Z’…
No need for one more column.
And is it faster then LIKE anyway? I do not have enough data to run benchmarks yet. Did you run?
Hi rusty,
Thanks for your interest in this blog.
Yes you are right. But with _right column you avoid the string concatenation operation.
I used this model and it is too easy to understand an too easy to explain to others in terms of set boundaries instead of in terms of string operations.
You can implement it using the concatenation if you are more comfortable with it.
Just for curiosity, In what scenario are you using this pattern?
What do you think about using an index on left_||’Z’ instead a real column ?
Thanks for your comment, an index on left_||’Z’ is a very good idea.
Maybe not all databases support it, but if your database support expression indexes this is a really good improvement.
The bad side is its decreased readability.