Monday, 26 March 2007

Flatten Out a Heirarchy using SYS_CONNECT_BY_PATH

In a recent application, we needed to model the Organisational Hierarchy which at its most complex ran to 7 levels deep. This was achieved using a self referencing foreign key (Pigs Ear) similar to that of the EMP table in the Scott schema. In essence, it simply stores the parent / child relationship for each entry in the Hierarchy.

This approach serviced the application very well in that a simply tree walk (CONNECT BY PRIOR) allowed us to construct the Hierarchical tree and bounce our requests off that.

During the production of the Discoverer End User Layer, it became evident that this Hierarchy needed to be flattened out (un-normalized) in order for it to reported on. This is because Discoverer (or any other BI product) does not support the CONNECT BY and START WITH clause. Discoverer needs to know how many levels exist within a Hierarchy and that every thread in the Hierarchy has the same number of levels in order to build a folder structure that can be reported on.

After a bit of investigation, I happened upon the: SYS_CONNECT_BY_PATH new in 9i

This can be used to construct the "Directory Path" for all values in the EMP table i.e. we can construct the reporting path for all employees in the EMP table

For Example:

SELECT empno, ename, job, SYS_CONNECT_BY_PATH (ename, '/') || '/' chain
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr;


EMPNO ENAME JOB CHAIN
----- ------ ----- ---------------------
7839 KING PRESIDENT /KING/
7566 JONES MANAGER /KING/JONES/
7788 SCOTT ANALYST /KING/JONES/SCOTT/
7876 ADAMS CLERK /KING/JONES/SCOTT/ADAMS/
7902 FORD ANALYST /KING/JONES/FORD/
7369 SMITH CLERK /KING/JONES/FORD/SMITH/
7698 BLAKE MANAGER /KING/BLAKE/
7499 ALLEN SALESMAN /KING/BLAKE/ALLEN/
7521 WARD SALESMAN /KING/BLAKE/WARD/
7654 MARTIN SALESMAN /KING/BLAKE/MARTIN/
7844 TURNER SALESMAN /KING/BLAKE/TURNER/
7900 JAMES CLERK /KING/BLAKE/JAMES/
7782 CLARK MANAGER /KING/CLARK/
7934 MILLER CLERK /KING/CLARK/MILLER/

14 rows selected.



From here, we can simply parse the CHAIN string to work out what our Hierarchy would look like to any number of levels. The following function can be used to construct out flat Hierarchy

CREATE OR REPLACE FUNCTION parse_string (
pv_str_i IN VARCHAR2 DEFAULT NULL,
pv_first_occ IN NUMBER DEFAULT NULL,
pv_last_occ IN NUMBER DEFAULT NULL
)
RETURN VARCHAR2
IS
lv_retval VARCHAR2 (4000);
lv_str VARCHAR2 (4000);
lv_tester VARCHAR2 (4000);
lv_substr_pos1 NUMBER;
lv_substr_pos2 NUMBER;
BEGIN
SELECT INSTR (pv_str_i, '/', 1, pv_first_occ)
INTO lv_substr_pos1
FROM DUAL;

SELECT INSTR (pv_str_i, '/', 1, pv_last_occ)
INTO lv_substr_pos2
FROM DUAL;

lv_tester :=
SUBSTR (pv_str_i, lv_substr_pos1 + 1,
(lv_substr_pos2 - lv_substr_pos1));

lv_str := REPLACE (lv_tester, '/', NULL);

RETURN lv_str;
END parse_string;
/

If we wanted to then create a flat hierarchy to 3 levels deep, we can simply then write a query like follows:

SELECT chain,
NVL (parse_string (chain, 1, 2), '<- ' || ename) level1,
NVL (parse_string (chain, 2, 3), '<- ' || ename) level2,
NVL (parse_string (chain, 3, 4), '<- ' || ename) level3
FROM (SELECT empno, ename, job,
SYS_CONNECT_BY_PATH (ename, '/') || '/' chain
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr);

CHAIN LEV1 LEV2 LEV3
---------------------- ---- ---- ----
/KING/ KING <- KING <- KING
/KING/JONES/ KING JONES <- JONES
/KING/JONES/SCOTT/ KING JONES SCOTT
/KING/JONES/SCOTT/ADAMS/ KING JONES SCOTT
/KING/JONES/FORD/ KING JONES FORD
/KING/JONES/FORD/SMITH/ KING JONES FORD
/KING/BLAKE/ KING BLAKE <- BLAKE
/KING/BLAKE/ALLEN/ KING BLAKE ALLEN
/KING/BLAKE/WARD/ KING BLAKE WARD
/KING/BLAKE/MARTIN/ KING BLAKE MARTIN
/KING/BLAKE/TURNER/ KING BLAKE TURNER
/KING/BLAKE/JAMES/ KING BLAKE JAMES
/KING/CLARK/ KING CLARK <- CLARK
/KING/CLARK/MILLER/ KING CLARK MILLER

14 rows selected.


And now we have a table that can be used to report on with BI products such as Discoverer

Tuesday, 20 March 2007

Translate Columns into Rows (Subquery Factoring)

It has always been a fairly rudimentary task pivoting a result set so that rows are displayed as columns. More recently I had the requirement to translate a result set the other way so that the columns would be displayed as rows.

For Example

Your result set starts out like this:

SITE COST1 COST2 COST3 COST4
------------------------------------------------------------------
SITE_ONE 2000 255
SITE_TWO 100
SITE_THREE 145 5000

The desired output should look like this:

SITE VALUE
------------------------------
SITE_ONE 2000
SITE_ONE 255
SITE_TWO 100
SITE_THREE 145
SITE_THREE 5000

Thanks to the help from Mr. Tom Kyte at http://asktom.oracle.com I was able to produce the desired results using the following query:

WITH data AS
(SELECT LEVEL l
FROM dual CONNECT BY LEVEL <= 4) SELECT site, decode(l, 1, cost1, 2, cost2, 3, cost3, 4, cost4) cost FROM data, costs WHERE decode(l, 1, cost1, 2, cost2, 3, cost3, 4, cost4) IS NOT NULL ORDER BY 1

It uses something called Subquery Factoring which allows you to create a pseudo table with N rows (CONNECT BY LEVEL <= N) in your query. You then cartisian join to your other tables(s) and use a simple DECODE to extract the column you are interested in.

Note that this technique will only work in 9i and above but I have had great success in the past with it The thread on asktom can be found here