Search code examples
sqlhierarchical-datagreenplum

SQL query: Finding connected contacts until third degree of depth


I am using PostgreSQL 8.2.15 (Greenplum Database 4.3.3.1 build 1), which means that usage of WITH RECURSIVE is not supported.

There is a sample table:

select * from reports_Table

reporter spammer
 AAA    BBB
 AAA    CCC
 DDD    CCC
 DDD    BBB
 DDD    EEE
 DDD    FFF
 EEE    DDD
 CCC    AAA
 FFF    DDD
 BBB    AAA
 BBB    CCC
 BBB    DDD

Through sql, I am trying to get the list of all reporters as well as spammers connected to AAA, until third degree of depth. In the example above, the result from the query would be:

AAA  
BBB  
CCC  
DDD  
FFF  
EEE  

BBB and CCC are connected directly with AAA therefore 1st degree connection of AAA,
DDD is 2nd degree connection of AAA since it is connected through CCC,
FFF and EEE are 3rd degree connection of AAA since both of them are connected through DDD.

I have managed to get at this point in the query which I think logically it works, but can't go further since can’t seem to understand the occurring syntax error:

ERROR: syntax error at or near "WHILE" 

Most probably the version I am using needs a different syntax for WHILE LOOP, but I can't seem to fix it.

/*supported tables*/
CREATE TEMP TABLE variables as
select 1 as first_column, 'AAA'::text as specific_reporter, 3 as degreeNumber

CREATE TEMP TABLE CollectedReporters(
specific_reporter text
);
GO
INSERT INTO CollectedReporters 
select specific_reporter from variables;

/*main query*/
BEGIN
    WHILE (select degreeNumber from variables) >= 1 LOOP
        INSERT INTO CollectedReporters
            SELECT ct.spammer::text as specific_reporter 
                FROM reports_Table ct
                    INNER JOIN CollectedReporters cc ON ct.reporter = cc.specific_reporter::text
                    LEFT JOIN CollectedReporters cc2 ON ct.spammer = cc2.specific_reporter::text
                WHERE cc2.specific_reporter IS NULL;

        UPDATE variables
            SET degreeNumber = degreeNumber - 1;
    END WHILE;
END;

SELECT * FROM  CollectedReporters

Any help is greatly appreciated!


Solution

  • The issue here is that you don't understand the database syntax, I'd recommend you to cover the guide on PG, it would make the things clearer for you

    • First, you can use BEGIN-END construct only inside of the PL/pgSQL functions as it is part of PL/pgSQL language syntax. When used outside of the function, "BEGIN" means beginning of transaction and should be used appropriately
    • When you are running some query through your GUI tool, each query should be the correct ANSI SQL query. "while" is not part of ANSI SQL. However, it is part of T-SQL in MSSQL, PL/SQL in Oracle and PL/pgSQL in Postgres
    • To use PL/pgSQL in Greenplum version 4.3 and earlier, you have only one option - create a function. In Greenplum version 5.0+ (dev builds from github), you can also use anonymous code blocks in PL/pgSQL

    Here is an example of how this might look like:

    CREATE OR REPLACE FUNCTION my_function (maxlevel int) returns void as $BODY$ 
    DECLARE
        level int = 1;
    BEGIN
    
        TRUNCATE CollectedReporters;
    
        WHILE (level <= maxlevel) LOOP
            RAISE NOTICE 'Processing level %', level;
    
            INSERT INTO CollectedReporters
                SELECT ct.spammer::text as specific_reporter 
                    FROM reports_Table ct
                        INNER JOIN CollectedReporters cc ON ct.reporter = cc.specific_reporter::text
                        LEFT JOIN CollectedReporters cc2 ON ct.spammer = cc2.specific_reporter::text
                    WHERE cc2.specific_reporter IS NULL;
    
            level = level + 1;
        END LOOP;
    END;
    $BODY$ LANGUAGE PLPGSQL VOLATILE;