|
Home > Archive > PostgreSQL Administration > October 2006 > Re: Recursive use
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
|
|
| Jay A. Kreibich 2006-10-25, 8:24 am |
| On Wed, Oct 04, 2006 at 10:08:10AM -0300, Alexander Burbello scratched on the wall:
> Hi people,
>
> I need to know if Postgres do recursive search and how can I do!
> I will explain my problem.
>
>
> table COOPERATIVE
> code_cooperative int
> code_coo_father int
>
> I can have 3 level by business rules
>
> 1 - Father
> ----- 2 - Children
> --------- 3 - Grandchildren
>
>
> I would like to have a query asking who is father and granfather
> select grandfather, father from COOPERATIVE where COD_COOPERATIVE = 3
>
> Do the Postgres can solve this problem?
> Could anybody help me?
These are generally referred to as "Hierarchical Queries" and center
around the idea of a self-referencing table (such as an employee
table with a "manager" field that is a FK to another row in the same
table). This essentially makes a tree-like structure.
Oracle supports these types of queries with their "START WITH ...
CONNECT BY" extensions to SELECT. In Oracle, hierarchical queries
also return a pseudo-column called "LEVEL" that is the depth of a node
in the tree. The syntax is fairly complete and allows all kinds of
queries up and down the tree. It is extremely useful for dealing with
self-referencing tables.
Alas, PostgreSQL does not support a similar set of extensions.
Although self-referencing tables are a bit of a design niche, they
show up all then time when translating traditional computer memory
structures and object trees into RDBMS storage systems. It would be
really cool if "START WITH ... CONNECT BY" or some similar set of
extensions was found in PostgreSQL.
As pointed out by others, the most general way to deal with this in
PostgreSQL is to write PL/PgSQL (or some other language) functions
that can generate the specific queries you need. It isn't always
pretty, but it can be made to work for a specific set of queries.
If you have a known structure (like the fact that your tree is
never any more than three levels deep) you can also join the table
to itself multiple times. This can get really confusing very quickly,
and is not an overly general solution, but it can be done in "pure" SQL
in a fairly straight forward (if not a bit complex) kind of way.
-j
--
Jay A. Kreibich | CommTech, Emrg Net Tech Svcs
jak@uiuc.edu | Campus IT & Edu Svcs
<http://www.uiuc.edu/~jak> | University of Illinois at U/C
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
| |
| Jim C. Nasby 2006-10-25, 8:24 am |
| On Fri, Oct 06, 2006 at 10:37:26AM -0500, Jay A. Kreibich wrote:
> These are generally referred to as "Hierarchical Queries" and center
> around the idea of a self-referencing table (such as an employee
> table with a "manager" field that is a FK to another row in the same
> table). This essentially makes a tree-like structure.
<snip>
> As pointed out by others, the most general way to deal with this in
> PostgreSQL is to write PL/PgSQL (or some other language) functions
> that can generate the specific queries you need. It isn't always
> pretty, but it can be made to work for a specific set of queries.
There are also other ways to represent this type of information without
using hierarchical queries. Joe Celko presents two methods in SQL For
Smarties.
There's also the ltree module in contrib that might be of some use.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
| |
| Jay A. Kreibich 2006-10-25, 8:24 am |
| On Tue, Oct 10, 2006 at 10:15:42AM -0500, Jim C. Nasby scratched on the wall:
> On Fri, Oct 06, 2006 at 10:37:26AM -0500, Jay A. Kreibich wrote:
> <snip>
>
> There are also other ways to represent this type of information without
> using hierarchical queries. Joe Celko presents two methods in SQL For
> Smarties.
If you're referring to Joe's March 1996 DBMS article,
(http://www.dbmsmag.com/9603d06.html) he does demonstrate two models,
but one of them is the self-referencing table model where one column
references another column in the same table. His only suggestion for
dealing with these kinds of tables is self-joins (which I also
mentioned) but points out the obvious limitation that-- unless you go
procedural-- you have to know how many levels you're going to process
before you setup the query.
The other model that is shown (which he calls "nested-set") is
interesting, but has a lot of properties that make me uncomfortable.
(He proposes each node/row have two sequence counters ("left" and "right")
represent pre- and post-visit order in a depth-first traversal; sets
can be calculated by differences or betweens of the two values).
For one, the table requires an extreme amount of maintenance-- something
as simple as inserting a single leaf node may require updating every
row in the whole table. On average, more than half the nodes/rows will
require updating for each record insertion and removal, but it isn't clear
how this update process would work (since the sequences require a
traversal to update, but a proper traversal requires the correct
sequences). There are tricks for the simple cases, but I'm not sure
you could do an update in-place in the general case.
The representation he's chosen also introduces an ordering among siblings--
while this is a required attribute of some tree structures, in most
cases (and in the spirit of general SQL sets) the ordering of peer
nodes/rows is undefined and unimportant. This isn't exactly a flaw,
so much as an unexpected side-effect.
In theory, I agree with his assertion that a conceptual "nested sets"
approach is more SQLish (since SQL likes to deal with sets), but I don't
think the implementation he presented actually has anything to do with
sets (in the traditional sense) that are nested. The whole thing depends
on understanding traversal orderings and some of the tricks you can play
with that to indirectly define sets. I guess it all depends on how you
look at it. I personally tend to think more in C++ than SQL anyways.
I also noticed that Joe has a book out titled "Joe Celko's Trees and
Hierarchies in SQL for Smarties". I have not yet had a chance to
review this book (other than the on-line table of contents) but it
looks interesting. While much of this is on graphs and more general
edge/node structures, a fair bit of the book appears to be about this
type of tree structure. He goes into more detail on some of these
issues, such as insertion and deletion times, and tricks to play for
inserting whole sub-trees, and that kind of thing. Maybe the book
would sell the so-called "nested-set" implementation a bit better,
but it still strikes me as a solution for warehouses, not OLTP style
stuff. I might have to find this book and have a closer read.
Thanks for the reference.
> There's also the ltree module in contrib that might be of some use.
Interesting.
-j
--
Jay A. Kreibich | CommTech, Emrg Net Tech Svcs
jak@uiuc.edu | Campus IT & Edu Svcs
<http://www.uiuc.edu/~jak> | University of Illinois at U/C
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
| |
| Jim C. Nasby 2006-10-25, 8:24 am |
| On Tue, Oct 10, 2006 at 04:33:46PM -0500, Jay A. Kreibich wrote:
> On Tue, Oct 10, 2006 at 10:15:42AM -0500, Jim C. Nasby scratched on the wall:
>
> If you're referring to Joe's March 1996 DBMS article,
> (http://www.dbmsmag.com/9603d06.html) he does demonstrate two models,
Nope... I'm talking about his book SQL For Smarties
> For one, the table requires an extreme amount of maintenance-- something
> as simple as inserting a single leaf node may require updating every
> row in the whole table. On average, more than half the nodes/rows will
> require updating for each record insertion and removal, but it isn't clear
> how this update process would work (since the sequences require a
> traversal to update, but a proper traversal requires the correct
> sequences). There are tricks for the simple cases, but I'm not sure
> you could do an update in-place in the general case.
The book goes into great detail about how to handle inserts and deletes.
Something he doesn't talk about that I think would be of use is putting
intentional gaps in the nested set. That would greatly reduce the
overhead of many operations (at the cost of added code complexity). For
example, if instead of incrementing each sequence by one, you
incremented by 10, you could insert 9 new values into one location in
the tree before you'd have to worry about moving other numbers around.
> I also noticed that Joe has a book out titled "Joe Celko's Trees and
> Hierarchies in SQL for Smarties". I have not yet had a chance to
> review this book (other than the on-line table of contents) but it
> looks interesting. While much of this is on graphs and more general
> edge/node structures, a fair bit of the book appears to be about this
> type of tree structure. He goes into more detail on some of these
> issues, such as insertion and deletion times, and tricks to play for
> inserting whole sub-trees, and that kind of thing. Maybe the book
> would sell the so-called "nested-set" implementation a bit better,
> but it still strikes me as a solution for warehouses, not OLTP style
> stuff. I might have to find this book and have a closer read.
I'd be curious to know if that book is just an out-take of SQL For
Smarties, or if he adds new information. I can tell you that Smarties is
a book every serious database architect should have.
--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
|
|
|
|
|