<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Optimize Hierarchy Queries with a Transitive Closure Table</title>
	<atom:link href="http://kylecordes.com/2008/transitive-closure/feed" rel="self" type="application/rss+xml" />
	<link>http://kylecordes.com/2008/transitive-closure</link>
	<description>Software, Business, and Life</description>
	<lastBuildDate>Tue, 20 Dec 2011 16:22:18 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: IVO GELOV</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-25185</link>
		<dc:creator>IVO GELOV</dc:creator>
		<pubDate>Tue, 16 Sep 2008 16:07:25 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-25185</guid>
		<description>d) get path from node A to node B
      SELECT title,depth FROM helper 
        LEFT JOIN entity ON ancestor_id=entity.id
        WHERE child_id=:B AND depth &lt;= 
        (
          SELECT depth FROM helper WHERE child_id=:B AND ancestor_id=:A
        ) ORDER BY depth desc;
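
A minimal sketch of the path query above, run against an in-memory SQLite database through Python's sqlite3 module instead of PostgreSQL (that swap is an assumption made only so the example is self-contained). It loads one branch of the example tree, builds the HELPER rows by walking up the parent chain, and writes the depth comparison with BETWEEN, which is equivalent here because depth is always at least 1.

```python
import sqlite3

# One branch of the example tree: RUZA (1), TEDY (5), RUSLAN (7), MITKO (8).
PARENT = {1: None, 5: 1, 7: 5, 8: 7}
TITLE = {1: "RUZA", 5: "TEDY", 7: "RUSLAN", 8: "MITKO"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("CREATE TABLE helper (child_id INTEGER, ancestor_id INTEGER, depth INTEGER)")
conn.executemany("INSERT INTO entity VALUES (?, ?)", list(TITLE.items()))

# Fill helper by walking from every node up to its root.
for node in PARENT:
    ancestor, depth = PARENT[node], 1
    while ancestor is not None:
        conn.execute("INSERT INTO helper VALUES (?, ?, ?)", (node, ancestor, depth))
        ancestor, depth = PARENT[ancestor], depth + 1

# Same shape as query d): ancestors of B that are no further away than A.
PATH_SQL = """
SELECT title, depth FROM helper
  LEFT JOIN entity ON ancestor_id = entity.id
  WHERE child_id = :b AND depth BETWEEN 1 AND
    (SELECT depth FROM helper WHERE child_id = :b AND ancestor_id = :a)
  ORDER BY depth DESC
"""
path = conn.execute(PATH_SQL, {"a": 1, "b": 8}).fetchall()
print(path)
```

The result lists the ancestors of B from A downwards; node B itself is not part of the result, because HELPER holds no row that links a node to itself.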

4. Updating ownership (moving a node to a new parent) - this is done with an AFTER UPDATE trigger

CREATE OR REPLACE FUNCTION &quot;proba&quot;.&quot;entity_del&quot; () RETURNS trigger AS
$body$
BEGIN
	-- first remove edges between all old parents of node and its descendants
  DELETE FROM proba.helper WHERE (child_id=old.id OR child_id IN
  	(SELECT child_id FROM proba.helper WHERE ancestor_id = old.id)
    ) AND ancestor_id IN
    (SELECT ancestor_id FROM proba.helper WHERE child_id = old.id);
  -- then add edges for all new parents ...
	IF new.parent_id IS NOT NULL THEN
  	INSERT INTO proba.helper(child_id,ancestor_id,depth) 
    	SELECT new.id,new.parent_id,1 
      UNION ALL
      -- ... to node itself
      SELECT new.id,ancestor_id,depth+1 FROM proba.helper WHERE child_id=new.parent_id
      UNION ALL        
      -- ... and its descendants
      (                         
      	SELECT child_id,new.parent_id,depth+1 FROM proba.helper WHERE ancestor_id=new.id
        UNION ALL
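      	-- depth = (descendant to node) + (node to new parent) + (new parent to its ancestor)
      	SELECT child.child_id,parent.ancestor_id,child.depth+parent.depth+1 FROM
      	(SELECT child_id,depth FROM proba.helper WHERE ancestor_id=new.id) AS child
        CROSS JOIN
        (SELECT ancestor_id,depth FROM proba.helper WHERE child_id=new.parent_id) AS parent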
      );
  END IF;
  RETURN NULL;
END;
$body$
LANGUAGE &#039;plpgsql&#039;
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

CREATE TRIGGER &quot;entity_tr_del&quot; AFTER UPDATE 
ON &quot;proba&quot;.&quot;entity&quot; FOR EACH ROW 
EXECUTE PROCEDURE &quot;proba&quot;.&quot;entity_del&quot;();
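
The delete-then-reinsert idea behind this trigger can be sketched outside the database. The following is only an illustration in plain Python, under the assumption that the HELPER table is modelled as a set of (child_id, ancestor_id, depth) tuples; the names closure_rows and move_subtree are hypothetical. Each re-added row takes its depth as (distance below the moved node) + 1 + (distance above the new parent), so the sketch does not assume that every moved descendant is a direct child.

```python
# Example ENTITY data: id mapped to parent_id (None marks a top-level node).
PARENT = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5, 7: 5, 8: 7, 9: 7,
          10: None, 11: 10, 12: 10, 13: 12, 14: 12}

def closure_rows(parent):
    """Build the full HELPER content by walking every node up to its root."""
    rows = set()
    for node in parent:
        ancestor, depth = parent[node], 1
        while ancestor is not None:
            rows.add((node, ancestor, depth))
            ancestor, depth = parent[ancestor], depth + 1
    return rows

def move_subtree(rows, node, new_parent):
    """Return the HELPER rows after re-parenting node (with its subtree)."""
    # Distance of every subtree member below the moved node (node itself is 0).
    subtree = {node: 0}
    subtree.update({c: d for (c, a, d) in rows if a == node})
    old_ancestors = {a for (c, a, d) in rows if c == node}
    # Step 1: drop every link between the subtree and the old ancestors.
    kept = {(c, a, d) for (c, a, d) in rows
            if not (c in subtree and a in old_ancestors)}
    if new_parent is None:
        return kept
    # Distance from the moved node up to each new ancestor.
    new_ancestors = {new_parent: 1}
    new_ancestors.update({a: d + 1 for (c, a, d) in rows if c == new_parent})
    # Step 2: Cartesian product of subtree members and new ancestors.
    added = {(c, a, down + up)
             for c, down in subtree.items()
             for a, up in new_ancestors.items()}
    return kept | added

before = closure_rows(PARENT)
after = move_subtree(before, 7, 12)   # move RUSLAN (7) under ANTON (12)
```

A simple way to check such a routine is to compare its result with a HELPER table rebuilt from scratch from the updated parent_id values.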


Updating ownership is the most expensive operation, since it involves a CROSS JOIN (Cartesian product).
Maybe someone could run benchmarks comparing the &quot;closure table&quot; with &quot;Nested intervals
using Farey fractions&quot;.</description>
		<content:encoded><![CDATA[<p>d) get path from node A to node B<br />
      SELECT title,depth FROM helper<br />
        LEFT JOIN entity ON ancestor_id=entity.id<br />
        WHERE child_id=:B AND depth &lt;=<br />
        (<br />
          SELECT depth FROM helper WHERE child_id=:B AND ancestor_id=:A<br />
        ) ORDER BY depth desc;</p>
<p>4. Updating ownership (moving a node to a new parent) &#8211; this is done with an AFTER UPDATE trigger</p>
<p>CREATE OR REPLACE FUNCTION "proba"."entity_del" () RETURNS trigger AS<br />
$body$<br />
BEGIN<br />
	-- first remove edges between all old parents of node and its descendants<br />
  DELETE FROM proba.helper WHERE (child_id=old.id OR child_id IN<br />
  	(SELECT child_id FROM proba.helper WHERE ancestor_id = old.id)<br />
    ) AND ancestor_id IN<br />
    (SELECT ancestor_id FROM proba.helper WHERE child_id = old.id);<br />
  -- then add edges for all new parents ...<br />
	IF new.parent_id IS NOT NULL THEN<br />
  	INSERT INTO proba.helper(child_id,ancestor_id,depth)<br />
    	SELECT new.id,new.parent_id,1<br />
      UNION ALL<br />
      -- ... to node itself<br />
      SELECT new.id,ancestor_id,depth+1 FROM proba.helper WHERE child_id=new.parent_id<br />
      UNION ALL<br />
      -- ... and its descendants<br />
      (<br />
      	SELECT child_id,new.parent_id,depth+1 FROM proba.helper WHERE ancestor_id=new.id<br />
        UNION ALL<br />
      	-- depth = (descendant to node) + (node to new parent) + (new parent to its ancestor)<br />
      	SELECT child.child_id,parent.ancestor_id,child.depth+parent.depth+1 FROM<br />
      	(SELECT child_id,depth FROM proba.helper WHERE ancestor_id=new.id) AS child<br />
        CROSS JOIN<br />
        (SELECT ancestor_id,depth FROM proba.helper WHERE child_id=new.parent_id) AS parent<br />
      );<br />
  END IF;<br />
  RETURN NULL;<br />
END;<br />
$body$<br />
LANGUAGE 'plpgsql'<br />
VOLATILE<br />
CALLED ON NULL INPUT<br />
SECURITY INVOKER<br />
COST 100;</p>
<p>CREATE TRIGGER "entity_tr_del" AFTER UPDATE<br />
ON "proba"."entity" FOR EACH ROW<br />
EXECUTE PROCEDURE "proba"."entity_del"();</p>
<p>Updating ownership is the most expensive operation, since it involves a CROSS JOIN (Cartesian product).<br />
Maybe someone could run benchmarks comparing the &#8220;closure table&#8221; with &#8220;Nested intervals<br />
using Farey fractions&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: IVO GELOV</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-25183</link>
		<dc:creator>IVO GELOV</dc:creator>
		<pubDate>Tue, 16 Sep 2008 16:06:21 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-25183</guid>
		<description>I&#039;m sorry for the flood, but it seems that my post is too long for Wordpress ...

   d) get path from node A to node B
      SELECT title,depth FROM helper 
        LEFT JOIN entity ON ancestor_id=entity.id
        WHERE child_id=:B AND depth </description>
		<content:encoded><![CDATA[<p>I&#8217;m sorry for the flood, but it seems that my post is too long for WordPress &#8230;</p>
<p>   d) get path from node A to node B<br />
      SELECT title,depth FROM helper<br />
        LEFT JOIN entity ON ancestor_id=entity.id<br />
        WHERE child_id=:B AND depth</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: IVO GELOV</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-25182</link>
		<dc:creator>IVO GELOV</dc:creator>
		<pubDate>Tue, 16 Sep 2008 16:04:38 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-25182</guid>
		<description>d) get path from node A to node B
      SELECT title,depth FROM helper 
        LEFT JOIN entity ON ancestor_id=entity.id
        WHERE child_id=:B AND depth </description>
		<content:encoded><![CDATA[<p>d) get path from node A to node B<br />
      SELECT title,depth FROM helper<br />
        LEFT JOIN entity ON ancestor_id=entity.id<br />
        WHERE child_id=:B AND depth</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: IVO GELOV</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-25181</link>
		<dc:creator>IVO GELOV</dc:creator>
		<pubDate>Tue, 16 Sep 2008 16:03:07 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-25181</guid>
		<description>Okay, let me explain it in more detail. I will use PostgreSQL for the examples,
hoping that this will not prevent you from understanding them.
First, we have this table definition:

CREATE TABLE &quot;proba&quot;.&quot;entity&quot; (
  &quot;id&quot; INTEGER NOT NULL, 
  &quot;title&quot; TEXT NOT NULL, 
  &quot;parent_id&quot; INTEGER, 
  CONSTRAINT &quot;entity_pkey&quot; PRIMARY KEY(&quot;id&quot;), 
  CONSTRAINT &quot;entity_chk&quot; CHECK (parent_id &lt;&gt; id),
  CONSTRAINT &quot;parent_fk&quot; FOREIGN KEY (&quot;parent_id&quot;)
    REFERENCES &quot;proba&quot;.&quot;entity&quot;(&quot;id&quot;)
    ON DELETE CASCADE
    ON UPDATE CASCADE
    NOT DEFERRABLE
) WITHOUT OIDS;

CREATE INDEX &quot;parent_idx&quot; ON &quot;proba&quot;.&quot;entity&quot;
  USING btree (&quot;parent_id&quot;);

Suppose that it is populated with the following data:

+----+--------+-----------+
&#124; ID &#124; TITLE  &#124; PARENT_ID &#124;
+----+--------+-----------+
&#124;  1 &#124; RUZA   &#124;   NULL    &#124;
&#124;  2 &#124; SVETA  &#124;      1    &#124;
&#124;  3 &#124; SONIA  &#124;      2    &#124;
&#124;  4 &#124; PAUL   &#124;      2    &#124;
&#124;  5 &#124; TEDY   &#124;      1    &#124;
&#124;  6 &#124; MARY   &#124;      5    &#124;
&#124;  7 &#124; RUSLAN &#124;      5    &#124;
&#124;  8 &#124; MITKO  &#124;      7    &#124;
&#124;  9 &#124; PETER  &#124;      7    &#124;
&#124; 10 &#124; MILEN  &#124;   NULL    &#124;
&#124; 11 &#124; ALEX   &#124;     10    &#124;
&#124; 12 &#124; ANTON  &#124;     10    &#124;
&#124; 13 &#124; BOBY   &#124;     12    &#124;
&#124; 14 &#124; NADIA  &#124;     12    &#124;
+----+--------+-----------+

It represents the following tree:

ROOT (NULL)
 &#124;
 +----- MILEN (10)
 &#124;        &#124;
 &#124;        +---- ALEX (11)
 &#124;        &#124;
 &#124;        +---- ANTON (12)
 &#124;                &#124;
 &#124;                +---- BOBY (13)
 &#124;                &#124;
 &#124;                +---- NADIA (14)
 &#124;
 +----- RUZA (1)
         &#124;
         +---- SVETA (2)
         &#124;       &#124;
         &#124;       +---- SONIA (3)
         &#124;       &#124;
         &#124;       +---- PAUL (4)
         &#124;
         +---- TEDY (5)
                 &#124;
                 +---- MARY (6)
                 &#124;
                 +---- RUSLAN (7)
                          &#124;
                          +---- MITKO (8)
                          &#124;
                          +---- PETER (9)

This is a classic &quot;Adjacency List&quot; model (I prefer to call it &quot;Parent-Child&quot; model). 
Now, if we add another column to the table (let&#039;s call it PARENT_ID_2) - we can use it
to store the parents of the PARENT_ID, like this:

+----+--------+-----------+-------------+
&#124; ID &#124; TITLE  &#124; PARENT_ID &#124; PARENT_ID_2 &#124;
+----+--------+-----------+-------------+
&#124;  1 &#124; RUZA   &#124;   NULL    &#124;    NULL     &#124;
&#124;  2 &#124; SVETA  &#124;      1    &#124;    NULL     &#124;
&#124;  3 &#124; SONIA  &#124;      2    &#124;       1     &#124;
&#124;  4 &#124; PAUL   &#124;      2    &#124;       1     &#124;
&#124;  5 &#124; TEDY   &#124;      1    &#124;    NULL     &#124;
&#124;  6 &#124; MARY   &#124;      5    &#124;       1     &#124;
&#124;  7 &#124; RUSLAN &#124;      5    &#124;       1     &#124;
&#124;  8 &#124; MITKO  &#124;      7    &#124;       5     &#124;
&#124;  9 &#124; PETER  &#124;      7    &#124;       5     &#124;
&#124; 10 &#124; MILEN  &#124;   NULL    &#124;    NULL     &#124;
&#124; 11 &#124; ALEX   &#124;     10    &#124;    NULL     &#124;
&#124; 12 &#124; ANTON  &#124;     10    &#124;    NULL     &#124;
&#124; 13 &#124; BOBY   &#124;     12    &#124;      10     &#124;
&#124; 14 &#124; NADIA  &#124;     12    &#124;      10     &#124;
+----+--------+-----------+-------------+

We can continue this with another column to store the parents of PARENT_ID_2, and so on.
The problem here is that we do not know in advance how many levels deep the tree will go.
Of course, we could write code that dynamically ALTERs the table definition to add
columns when needed - but this is bad practice. The first lesson I learned about
SQL tables is:
The columns should remain fixed and never be altered by the code. Instead, when we need
a variable number of columns - transform them into rows.
This transformation is done with the help of an additional table (the same kind of manipulation
is used to maintain 1:1, 1:m and n:m relationships in database theory). 
I have defined this additional table in this way:

CREATE TABLE &quot;proba&quot;.&quot;helper&quot; (
  &quot;child_id&quot; INTEGER NOT NULL, 
  &quot;ancestor_id&quot; INTEGER NOT NULL, 
  &quot;depth&quot; INTEGER NOT NULL, 
  CONSTRAINT &quot;helper_idx&quot; UNIQUE(&quot;ancestor_id&quot;, &quot;child_id&quot;), 
  CONSTRAINT &quot;helper_pkey&quot; PRIMARY KEY(&quot;child_id&quot;, &quot;ancestor_id&quot;), 
  CONSTRAINT &quot;helper_chk&quot; CHECK ((child_id &lt;&gt; ancestor_id) AND (depth &gt; 0)), 
  CONSTRAINT &quot;ancestor_fk&quot; FOREIGN KEY (&quot;ancestor_id&quot;)
    REFERENCES &quot;proba&quot;.&quot;entity&quot;(&quot;id&quot;)
    ON DELETE CASCADE
    ON UPDATE CASCADE
    NOT DEFERRABLE, 
  CONSTRAINT &quot;child_fk&quot; FOREIGN KEY (&quot;child_id&quot;)
    REFERENCES &quot;proba&quot;.&quot;entity&quot;(&quot;id&quot;)
    ON DELETE CASCADE
    ON UPDATE CASCADE
    NOT DEFERRABLE
) WITHOUT OIDS;

CREATE INDEX &quot;ancestor_idx&quot; ON &quot;proba&quot;.&quot;helper&quot;
  USING btree (&quot;ancestor_id&quot;);

CREATE INDEX &quot;child_idx&quot; ON &quot;proba&quot;.&quot;helper&quot;
  USING btree (&quot;child_id&quot;);

For the example data shown above in table ENTITY, the data in table HELPER should be:

+----------+-------------+-------+
&#124; CHILD_ID &#124; ANCESTOR_ID &#124; DEPTH &#124;
+----------+-------------+-------+
&#124;     3    &#124;       1     &#124;   2   &#124;
&#124;     4    &#124;       1     &#124;   2   &#124;
&#124;     6    &#124;       1     &#124;   2   &#124;
&#124;     7    &#124;       1     &#124;   2   &#124;
&#124;     8    &#124;       5     &#124;   2   &#124;
&#124;     9    &#124;       5     &#124;   2   &#124;
&#124;    13    &#124;      10     &#124;   2   &#124;
&#124;    14    &#124;      10     &#124;   2   &#124;
&#124;     8    &#124;       1     &#124;   3   &#124;
&#124;     9    &#124;       1     &#124;   3   &#124;
+----------+-------------+-------+

As we see, table HELPER does not contain NULL values, and lists all indirect descendants 
for each node along with the distance between them (a distance of 1 would be a direct descendant).

After some thought, I decided to include all of each node&#039;s descendants in table HELPER
(not only the indirect ones), so the direct parent-child pairs are stored as well, with depth 1.
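
To make the full table concrete, here is a small sketch in plain Python (not part of the PostgreSQL setup; the function name closure_rows is hypothetical) that derives every HELPER row from the PARENT_ID column of the example data by walking from each node up to its top-level ancestor.

```python
# Example ENTITY data: id mapped to parent_id (None marks a top-level node).
PARENT = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5, 7: 5, 8: 7, 9: 7,
          10: None, 11: 10, 12: 10, 13: 12, 14: 12}

def closure_rows(parent):
    """Return sorted (child_id, ancestor_id, depth) rows, direct parents included."""
    rows = []
    for node in parent:
        ancestor, depth = parent[node], 1
        while ancestor is not None:
            rows.append((node, ancestor, depth))
            ancestor, depth = parent[ancestor], depth + 1
    return sorted(rows)

rows = closure_rows(PARENT)
for row in rows:
    print(row)
```

For the example data this yields the ten indirect rows listed in the table above plus twelve rows with depth 1, one for each node that has a parent.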

The number of rows in table HELPER is calculated by this formula:

  N
 -----
 \
  \      D[i]
  /
 /
 -----
  i=1

N - is the number of nodes in tree (number of rows in table ENTITY)
D[i] - is the number of edges from node &quot;i&quot; to the root
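
As a worked check of this formula on the example data, the sketch below (plain Python, with a hypothetical helper named edges_to_root) computes D[i] for every node and adds the values up. It counts edges only up to the top-level node (RUZA or MILEN), not up to the virtual ROOT (NULL) entry drawn in the tree.

```python
# Example ENTITY data: id mapped to parent_id (None marks a top-level node).
PARENT = {1: None, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5, 7: 5, 8: 7, 9: 7,
          10: None, 11: 10, 12: 10, 13: 12, 14: 12}

def edges_to_root(node, parent):
    """D[i]: number of edges between node i and the top of its tree."""
    edges = 0
    while parent[node] is not None:
        node = parent[node]
        edges += 1
    return edges

D = {node: edges_to_root(node, PARENT) for node in PARENT}
helper_row_count = sum(D.values())
print(helper_row_count)
```

The 14 example nodes contribute 0+1+2+2+1+2+2+3+3+0+1+1+2+2 = 22 rows when direct parents are included.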

As far as I know, the adjacency list model is very cheap for inserting and updating, but far too
expensive for retrieving and deleting. The nested sets model (and also nested intervals) is
the opposite - very cheap for retrieving and deleting, but too expensive for inserting and
updating. So it seems that there are two opposite options - one model for frequent insertions
and rare retrievals, and another model for rare insertions and frequent retrievals (each
at the expense of the other).
I think that with closure tables it is possible to hit two rabbits with a single bullet - to combine
the good parts of both and to shift the poor performance onto a seldom-used action.
What do I mean? I still do not have any benchmarks, and all of the above needs careful
investigation and research, but I hope that with this tree model inserting, deleting and 
retrieving nodes will be cheap operations, and only moving subtrees will be expensive - though
not as expensive as in the other two models.

Here are the SQL queries for all 4 possible tree manipulations:

1. Adding a new node - this is done with an AFTER INSERT trigger

CREATE OR REPLACE FUNCTION &quot;proba&quot;.&quot;entity_ins&quot; () RETURNS trigger AS
$body$
BEGIN
	IF new.parent_id IS NOT NULL THEN
  	INSERT INTO proba.helper(child_id,ancestor_id,depth) 
    	SELECT new.id,new.parent_id,1 UNION ALL
      SELECT new.id,ancestor_id,depth+1 FROM proba.helper WHERE child_id=new.parent_id;
  END IF;
  RETURN NULL;
END;
$body$
LANGUAGE &#039;plpgsql&#039;
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

CREATE TRIGGER &quot;entity_tr_ins&quot; AFTER INSERT 
ON &quot;proba&quot;.&quot;entity&quot; FOR EACH ROW 
EXECUTE PROCEDURE &quot;proba&quot;.&quot;entity_ins&quot;();
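
The same maintenance rule can be tried without a PostgreSQL server. The sketch below is an assumption-laden stand-in: it uses an in-memory SQLite database through Python's sqlite3 module, drops the proba schema, and replaces the IF inside the function body with a WHEN clause on the trigger, because SQLite triggers have no procedural body. The INSERT statement itself follows the trigger above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
  id INTEGER PRIMARY KEY,
  title TEXT NOT NULL,
  parent_id INTEGER REFERENCES entity(id)
);
CREATE TABLE helper (
  child_id INTEGER NOT NULL,
  ancestor_id INTEGER NOT NULL,
  depth INTEGER NOT NULL,
  PRIMARY KEY (child_id, ancestor_id)
);
-- One row for the direct parent, plus one row per ancestor of that parent.
CREATE TRIGGER entity_tr_ins AFTER INSERT ON entity
WHEN new.parent_id IS NOT NULL
BEGIN
  INSERT INTO helper (child_id, ancestor_id, depth)
    SELECT new.id, new.parent_id, 1
    UNION ALL
    SELECT new.id, ancestor_id, depth + 1 FROM helper WHERE child_id = new.parent_id;
END;
""")

# Parents are inserted before their children, as the trigger requires.
conn.executemany(
    "INSERT INTO entity (id, title, parent_id) VALUES (?, ?, ?)",
    [(1, "RUZA", None), (5, "TEDY", 1), (7, "RUSLAN", 5), (8, "MITKO", 7)],
)
mitko = conn.execute(
    "SELECT ancestor_id, depth FROM helper WHERE child_id = 8 ORDER BY depth"
).fetchall()
print(mitko)
```

Each insert touches only as many HELPER rows as the new node has ancestors, which is why adding nodes stays cheap in this model.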

2. Deleting a node and all of its descendants - this is handled by the cascading foreign keys.
   But if you wish, it can be done like this:

   DELETE FROM helper WHERE child_id=:X OR ancestor_id=:X

3. Retrieving information:

   a) get all descendants of node X
      SELECT * FROM entity WHERE id IN (
        SELECT child_id FROM helper WHERE ancestor_id=:X
      ) ORDER BY title;
   b) get only immediate (direct) descendants of node X
      SELECT * FROM entity WHERE id IN (
        SELECT child_id FROM helper WHERE ancestor_id=:X AND depth=1
      ) ORDER BY title;
   c) get all ancestors of node X - from root downwards
      SELECT entity.* FROM helper
        JOIN entity ON entity.id=helper.ancestor_id
        WHERE helper.child_id=:X
        ORDER BY helper.depth DESC;
   d) get path from node A to node B
      SELECT title,depth FROM helper 
        LEFT JOIN entity ON ancestor_id=entity.id
        WHERE child_id=:B AND depth </description>
		<content:encoded><![CDATA[<p>Okay, let me explain it in more detail. I will use PostgreSQL for the examples,<br />
hoping that this will not prevent you from understanding them.<br />
First, we have this table definition:</p>
<p>CREATE TABLE "proba"."entity" (<br />
  "id" INTEGER NOT NULL,<br />
  "title" TEXT NOT NULL,<br />
  "parent_id" INTEGER,<br />
  CONSTRAINT "entity_pkey" PRIMARY KEY("id"),<br />
  CONSTRAINT "entity_chk" CHECK (parent_id &lt;&gt; id),<br />
  CONSTRAINT "parent_fk" FOREIGN KEY ("parent_id")<br />
    REFERENCES "proba"."entity"("id")<br />
    ON DELETE CASCADE<br />
    ON UPDATE CASCADE<br />
    NOT DEFERRABLE<br />
) WITHOUT OIDS;</p>
<p>CREATE INDEX "parent_idx" ON "proba"."entity"<br />
  USING btree ("parent_id");</p>
<p>Suppose that it is populated with the following data:</p>
<p>+----+--------+-----------+<br />
| ID | TITLE  | PARENT_ID |<br />
+----+--------+-----------+<br />
|  1 | RUZA   |   NULL    |<br />
|  2 | SVETA  |      1    |<br />
|  3 | SONIA  |      2    |<br />
|  4 | PAUL   |      2    |<br />
|  5 | TEDY   |      1    |<br />
|  6 | MARY   |      5    |<br />
|  7 | RUSLAN |      5    |<br />
|  8 | MITKO  |      7    |<br />
|  9 | PETER  |      7    |<br />
| 10 | MILEN  |   NULL    |<br />
| 11 | ALEX   |     10    |<br />
| 12 | ANTON  |     10    |<br />
| 13 | BOBY   |     12    |<br />
| 14 | NADIA  |     12    |<br />
+----+--------+-----------+</p>
<p>It represents the following tree:</p>
<p>ROOT (NULL)<br />
 |<br />
 +----- MILEN (10)<br />
 |        |<br />
 |        +---- ALEX (11)<br />
 |        |<br />
 |        +---- ANTON (12)<br />
 |                |<br />
 |                +---- BOBY (13)<br />
 |                |<br />
 |                +---- NADIA (14)<br />
 |<br />
 +----- RUZA (1)<br />
         |<br />
         +---- SVETA (2)<br />
         |       |<br />
         |       +---- SONIA (3)<br />
         |       |<br />
         |       +---- PAUL (4)<br />
         |<br />
         +---- TEDY (5)<br />
                 |<br />
                 +---- MARY (6)<br />
                 |<br />
                 +---- RUSLAN (7)<br />
                          |<br />
                          +---- MITKO (8)<br />
                          |<br />
                          +---- PETER (9)</p>
<p>This is a classic &#8220;Adjacency List&#8221; model (I prefer to call it &#8220;Parent-Child&#8221; model).<br />
Now, if we add another column to the table (let&#8217;s call it PARENT_ID_2) &#8211; we can use it<br />
to store the parents of the PARENT_ID, like this:</p>
<p>+----+--------+-----------+-------------+<br />
| ID | TITLE  | PARENT_ID | PARENT_ID_2 |<br />
+----+--------+-----------+-------------+<br />
|  1 | RUZA   |   NULL    |    NULL     |<br />
|  2 | SVETA  |      1    |    NULL     |<br />
|  3 | SONIA  |      2    |       1     |<br />
|  4 | PAUL   |      2    |       1     |<br />
|  5 | TEDY   |      1    |    NULL     |<br />
|  6 | MARY   |      5    |       1     |<br />
|  7 | RUSLAN |      5    |       1     |<br />
|  8 | MITKO  |      7    |       5     |<br />
|  9 | PETER  |      7    |       5     |<br />
| 10 | MILEN  |   NULL    |    NULL     |<br />
| 11 | ALEX   |     10    |    NULL     |<br />
| 12 | ANTON  |     10    |    NULL     |<br />
| 13 | BOBY   |     12    |      10     |<br />
| 14 | NADIA  |     12    |      10     |<br />
+----+--------+-----------+-------------+</p>
<p>We can continue this with another column to store the parents of PARENT_ID_2, and so on.<br />
The problem here is that we do not know in advance how many levels deep the tree will go.<br />
Of course, we could write code that dynamically ALTERs the table definition to add<br />
columns when needed &#8211; but this is bad practice. The first lesson I learned about<br />
SQL tables is:<br />
The columns should remain fixed and never be altered by the code. Instead, when we need<br />
a variable number of columns &#8211; transform them into rows.<br />
This transformation is done with the help of an additional table (the same kind of manipulation<br />
is used to maintain 1:1, 1:m and n:m relationships in database theory).<br />
I have defined this additional table in this way:</p>
<p>CREATE TABLE "proba"."helper" (<br />
  "child_id" INTEGER NOT NULL,<br />
  "ancestor_id" INTEGER NOT NULL,<br />
  "depth" INTEGER NOT NULL,<br />
  CONSTRAINT "helper_idx" UNIQUE("ancestor_id", "child_id"),<br />
  CONSTRAINT "helper_pkey" PRIMARY KEY("child_id", "ancestor_id"),<br />
  CONSTRAINT "helper_chk" CHECK ((child_id &lt;&gt; ancestor_id) AND (depth &gt; 0)),<br />
  CONSTRAINT "ancestor_fk" FOREIGN KEY ("ancestor_id")<br />
    REFERENCES "proba"."entity"("id")<br />
    ON DELETE CASCADE<br />
    ON UPDATE CASCADE<br />
    NOT DEFERRABLE,<br />
  CONSTRAINT "child_fk" FOREIGN KEY ("child_id")<br />
    REFERENCES "proba"."entity"("id")<br />
    ON DELETE CASCADE<br />
    ON UPDATE CASCADE<br />
    NOT DEFERRABLE<br />
) WITHOUT OIDS;</p>
<p>CREATE INDEX "ancestor_idx" ON "proba"."helper"<br />
  USING btree ("ancestor_id");</p>
<p>CREATE INDEX "child_idx" ON "proba"."helper"<br />
  USING btree ("child_id");</p>
<p>For the example data shown above in table ENTITY, the data in table HELPER should be:</p>
<p>+----------+-------------+-------+<br />
| CHILD_ID | ANCESTOR_ID | DEPTH |<br />
+----------+-------------+-------+<br />
|     3    |       1     |   2   |<br />
|     4    |       1     |   2   |<br />
|     6    |       1     |   2   |<br />
|     7    |       1     |   2   |<br />
|     8    |       5     |   2   |<br />
|     9    |       5     |   2   |<br />
|    13    |      10     |   2   |<br />
|    14    |      10     |   2   |<br />
|     8    |       1     |   3   |<br />
|     9    |       1     |   3   |<br />
+----------+-------------+-------+</p>
<p>As we see, table HELPER does not contain NULL values, and lists all indirect descendants<br />
for each node along with the distance between them (a distance of 1 would be a direct descendant).</p>
<p>After some thought, I decided to include all of each node&#8217;s descendants in table HELPER<br />
(not only the indirect ones), so the direct parent-child pairs are stored as well, with depth 1.</p>
<p>The number of rows in table HELPER is calculated by this formula:</p>
<p>  N<br />
 -----<br />
 \<br />
  \      D[i]<br />
  /<br />
 /<br />
 -----<br />
  i=1</p>
<p>N &#8211; is the number of nodes in tree (number of rows in table ENTITY)<br />
D[i] &#8211; is the number of edges from node &#8220;i&#8221; to the root</p>
<p>As far as I know, the adjacency list model is very cheap for inserting and updating, but far too<br />
expensive for retrieving and deleting. The nested sets model (and also nested intervals) is<br />
the opposite &#8211; very cheap for retrieving and deleting, but too expensive for inserting and<br />
updating. So it seems that there are two opposite options &#8211; one model for frequent insertions<br />
and rare retrievals, and another model for rare insertions and frequent retrievals (each<br />
at the expense of the other).<br />
I think that with closure tables it is possible to hit two rabbits with a single bullet &#8211; to combine<br />
the good parts of both and to shift the poor performance onto a seldom-used action.<br />
What do I mean? I still do not have any benchmarks, and all of the above needs careful<br />
investigation and research, but I hope that with this tree model inserting, deleting and<br />
retrieving nodes will be cheap operations, and only moving subtrees will be expensive &#8211; though<br />
not as expensive as in the other two models.</p>
<p>Here are the SQL queries for all 4 possible tree manipulations:</p>
<p>1. Adding a new node &#8211; this is done with an AFTER INSERT trigger</p>
<p>CREATE OR REPLACE FUNCTION "proba"."entity_ins" () RETURNS trigger AS<br />
$body$<br />
BEGIN<br />
	IF new.parent_id IS NOT NULL THEN<br />
  	INSERT INTO proba.helper(child_id,ancestor_id,depth)<br />
    	SELECT new.id,new.parent_id,1 UNION ALL<br />
      SELECT new.id,ancestor_id,depth+1 FROM proba.helper WHERE child_id=new.parent_id;<br />
  END IF;<br />
  RETURN NULL;<br />
END;<br />
$body$<br />
LANGUAGE 'plpgsql'<br />
VOLATILE<br />
CALLED ON NULL INPUT<br />
SECURITY INVOKER<br />
COST 100;</p>
<p>CREATE TRIGGER "entity_tr_ins" AFTER INSERT<br />
ON "proba"."entity" FOR EACH ROW<br />
EXECUTE PROCEDURE "proba"."entity_ins"();</p>
<p>2. Deleting a node and all of its descendants &#8211; this is handled by the cascading foreign keys.<br />
   But if you wish, it can be done like this:</p>
<p>   DELETE FROM helper WHERE child_id=:X OR ancestor_id=:X</p>
<p>3. Retrieving information:</p>
<p>   a) get all descendants of node X<br />
      SELECT * FROM entity WHERE id IN (<br />
        SELECT child_id FROM helper WHERE ancestor_id=:X<br />
      ) ORDER BY title;<br />
   b) get only immediate (direct) descendants of node X<br />
      SELECT * FROM entity WHERE id IN (<br />
        SELECT child_id FROM helper WHERE ancestor_id=:X AND depth=1<br />
      ) ORDER BY title;<br />
   c) get all ancestors of node X &#8211; from root downwards<br />
      SELECT entity.* FROM helper<br />
        JOIN entity ON entity.id=helper.ancestor_id<br />
        WHERE helper.child_id=:X<br />
        ORDER BY helper.depth DESC;<br />
   d) get path from node A to node B<br />
      SELECT title,depth FROM helper<br />
        LEFT JOIN entity ON ancestor_id=entity.id<br />
        WHERE child_id=:B AND depth</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Craig Buchek</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-15156</link>
		<dc:creator>Craig Buchek</dc:creator>
		<pubDate>Tue, 15 Jan 2008 18:11:38 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-15156</guid>
		<description>Thanks for sharing this. I read your article on Celko&#039;s method, and followed up by reading the original articles by Celko. Seems like a good method in some situations, but Celko&#039;s method would seem to have  poor insert/update performance as well, due to having to update, on average, half of the records in the table.

I&#039;d be interested in reading about more techniques like these.</description>
		<content:encoded><![CDATA[<p>Thanks for sharing this. I read your article on Celko&#8217;s method, and followed up by reading the original articles by Celko. Seems like a good method in some situations, but Celko&#8217;s method would seem to have  poor insert/update performance as well, due to having to update, on average, half of the records in the table.</p>
<p>I&#8217;d be interested in reading about more techniques like these.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Miller</title>
		<link>http://kylecordes.com/2008/transitive-closure/comment-page-1#comment-15095</link>
		<dc:creator>Alex Miller</dc:creator>
		<pubDate>Mon, 14 Jan 2008 04:35:40 +0000</pubDate>
		<guid isPermaLink="false">http://kylecordes.com/2008/01/13/transitive-closure/#comment-15095</guid>
		<description>Nice.  Yet another episode in the long history of trading space for time... :)</description>
		<content:encoded><![CDATA[<p>Nice.  Yet another episode in the long history of trading space for time&#8230; <img src='http://kylecordes.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
</channel>
</rss>

