Flatten Out a Heirarchy using SYS_CONNECT_BY

Flatten Out a Heirarchy using SYS_CONNECT_BY_PATH

March 26, 2007

In a recent application, we needed to model the Organisational Hierarchy which at its most complex ran to 7 levels deep. This was achieved using a self referencing foreign key (Pigs Ear) similar to that of the EMP table in the Scott schema. In essence, it simply stores the parent / child relationship for each entry in the Hierarchy.

This approach serviced the application very well in that a simply tree walk (CONNECT BY PRIOR) allowed us to construct the Hierarchical tree and bounce our requests off that.

During the production of the Discoverer End User Layer, it became evident that this Hierarchy needed to be flattened out (un-normalized) in order for it to reported on. This is because Discoverer (or any other BI product) does not support the CONNECT BY and START WITH clause. Discoverer needs to know how many levels exist within a Hierarchy and that every thread in the Hierarchy has the same number of levels in order to build a folder structure that can be reported on.

After a bit of investigation, I happened upon the: SYS_CONNECT_BY_PATH new in 9i

This can be used to construct the "Directory Path" for all values in the EMP table i.e. we can construct the reporting path for all employees in the EMP table

For Example:

SELECT empno, ename, job, SYS_CONNECT_BY_PATH (ename, '/') || '/' chain
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;


EMPNO ENAME   JOB       CHAIN
----- ------  -----     ---------------------
 7839 KING    PRESIDENT /KING/
 7566 JONES   MANAGER   /KING/JONES/
 7788 SCOTT   ANALYST   /KING/JONES/SCOTT/
 7876 ADAMS   CLERK     /KING/JONES/SCOTT/ADAMS/
 7902 FORD    ANALYST   /KING/JONES/FORD/
 7369 SMITH   CLERK     /KING/JONES/FORD/SMITH/
 7698 BLAKE   MANAGER   /KING/BLAKE/
 7499 ALLEN   SALESMAN  /KING/BLAKE/ALLEN/
 7521 WARD    SALESMAN  /KING/BLAKE/WARD/
 7654 MARTIN  SALESMAN  /KING/BLAKE/MARTIN/
 7844 TURNER  SALESMAN  /KING/BLAKE/TURNER/
 7900 JAMES   CLERK     /KING/BLAKE/JAMES/
 7782 CLARK   MANAGER   /KING/CLARK/
 7934 MILLER  CLERK     /KING/CLARK/MILLER/

14 rows selected.

From here, we can simply parse the CHAIN string to work out what our Hierarchy would look like to any number of levels. The following function can be used to construct out flat Hierarchy

CREATE OR REPLACE FUNCTION parse_string (
pv_str_i IN VARCHAR2 DEFAULT NULL,
pv_first_occ IN NUMBER DEFAULT NULL,
pv_last_occ IN NUMBER DEFAULT NULL
)
RETURN VARCHAR2
IS
lv_retval VARCHAR2 (4000);
lv_str VARCHAR2 (4000);
lv_tester VARCHAR2 (4000);
lv_substr_pos1 NUMBER;
lv_substr_pos2 NUMBER;
BEGIN
SELECT INSTR (pv_str_i, '/', 1, pv_first_occ)
INTO lv_substr_pos1
FROM DUAL;

SELECT INSTR (pv_str_i, '/', 1, pv_last_occ)
INTO lv_substr_pos2
FROM DUAL;

lv_tester :=
SUBSTR (pv_str_i, lv_substr_pos1 + 1,
(lv_substr_pos2 - lv_substr_pos1));

lv_str := REPLACE (lv_tester, '/', NULL);

RETURN lv_str;
END parse_string;
/

If we wanted to then create a flat hierarchy to 3 levels deep, we can simply then write a query like follows:

SELECT chain,
NVL (parse_string (chain, 1, 2), '<- ' || ename) level1,
NVL (parse_string (chain, 2, 3), '<- ' || ename) level2,
NVL (parse_string (chain, 3, 4), '<- ' || ename) level3
FROM (SELECT empno, ename, job,
SYS_CONNECT_BY_PATH (ename, '/') || '/' chain
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr);


CHAIN                    LEV1  LEV2    LEV3
----------------------   ----  ----    ----
/KING/                   KING  <- KING <- KING 
/KING/JONES/             KING  JONES   <- JONES 
/KING/JONES/SCOTT/       KING  JONES   SCOTT 
/KING/JONES/SCOTT/ADAMS/ KING  JONES   SCOTT 
/KING/JONES/FORD/        KING  JONES   FORD 
/KING/JONES/FORD/SMITH/  KING  JONES   FORD 
/KING/BLAKE/             KING  BLAKE   <- BLAKE 
/KING/BLAKE/ALLEN/       KING  BLAKE   ALLEN 
/KING/BLAKE/WARD/        KING  BLAKE   WARD 
/KING/BLAKE/MARTIN/      KING  BLAKE   MARTIN 
/KING/BLAKE/TURNER/      KING  BLAKE   TURNER 
/KING/BLAKE/JAMES/       KING  BLAKE   JAMES 
/KING/CLARK/             KING  CLARK   <- CLARK 
/KING/CLARK/MILLER/      KING  CLARK   MILLER 

14 rows selected.

And now we have a table that can be used to report on with BI products such as Discoverer

Comments

Anonymous said…

Thanks for this, it was very helpful!!

2 August 2007 at 13:58

Anonymous said…

Just hunting for a similar solution myself and happened on the code. Gives me some good ideas so thanks..

That said I thought I'd leave you a performance related note about the two queries inside your function:

SELECT INSTR (pv_str_i, '/', 1, pv_first_occ)
INTO lv_substr_pos1
FROM DUAL;

SELECT INSTR (pv_str_i, '/', 1, pv_last_occ)
INTO lv_substr_pos2
FROM DUAL;

Using "select from dual" to reference INSTR is not really necessary as INSTR is a native PL/SQL function and can be called directly within the block.

This will save two parses for each call to the function. Not much if you're running it on EMP but quite a bit if you're referencing large tables and calling it multiple times per record.. :-)

(Plus it's a good habit not to select from dual unless you have to :-> )

Cheers,
Steve.

9 January 2008 at 02:03

Duncan said…

Thanks for the comments Steve. I see what you are saying and your are correct. This example was knocked together fairly quickly so I don't doubt that there is room for improvement. I will get it changed once I get a couple of mins.

Regards

Duncan

9 January 2008 at 08:21

Anonymous said…

Fantastic solution to a problem that I have been using PhP to resolve for ages.

25 July 2008 at 16:00

Search This Blog

Duncan Mein's Blog

Flatten Out a Heirarchy using SYS_CONNECT_BY_PATH

Comments

Popular posts from this blog

Custom Authentication / Authorisation Schemes (Part 1)

Mac OSX, Bootcamp and a Missing Hash Key

Add / Delete a row from a SQL based Tabular Form (Static ID)