How to read large Prolog files in ECLiPSe?


I generated a large file of paths with ECLiPSe. Each line contains one clause holding a list of 27 points.

$ wc -l snake_points.pl
240917 snake_points.pl

$ ls -lh snake_points.pl
-rw-rw-r-- 1 carl carl 72M Sep  6 02:39 snake_points.pl

$ head -n 1 snake_points.pl 
snake_points([(2, 0, 0), (2, 0, 1), (2, 0, 2), (2, 1, 2), (2, 1, 1), (2, 1, 0), (2, 2, 0), (2, 2, 1), (2, 2, 2), (1, 2, 2), (0, 2, 2), (0, 1, 2), (0, 0, 2), (0, 0, 1), (0, 1, 1), (0, 1, 0), (0, 2, 0), (0, 2, 1), (1, 2, 1), (1, 2, 0), (1, 1, 0), (1, 1, 1), (1, 1, 2), (1, 0, 2), (1, 0, 1), (1, 0, 0), (0, 0, 0)]).

However, I am unable to load the file into memory, even with 8 GB available to the global stack:

$ time eclipse -f snake_points.ecl -e 'halt.'
*** Overflow of the global/trail stack in spite of garbage collection!
You can use the "-g kBytes" (GLOBALSIZE) option to have a larger stack.
Peak sizes were: global stack 8388576 kbytes, trail stack 59904 kbytes

________________________________________________________
Executed in  128.05 secs   fish           external 
   usr time  122.92 secs  297.00 micros  122.92 secs 
   sys time    5.01 secs   37.00 micros    5.01 secs 

Compare this to swipl:

$ time swipl -f snake_points.pl -g 'halt.'

________________________________________________________
Executed in   53.56 secs   fish           external 
   usr time   53.27 secs  272.00 micros   53.27 secs 
   sys time    0.28 secs   41.00 micros    0.28 secs 

Neither is impressive, but I'd expect ECLiPSe to complete with a reasonable amount of memory.

Is this expected behavior? What can be done?

I understand the solution may be "use a database" or EXDR, but shouldn't it be possible to do this efficiently?


Solution

  • The problem is that you are not just reading the data: you are compiling it as a single predicate with 240917 clauses, and the compiler is not built for this kind of usage.

    Instead, you can read the clauses from the data file and assert them one by one, like this:

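    assert_from_file(File) :-
            open(File, read, S),
            repeat,                     % failure-driven loop: re-entered after each clause
                read(S, Term),
                ( Term == end_of_file ->
                    true                % end of input: leave the loop
                ;
                    assert(Term),       % add the clause to the database
                    fail                % backtrack to repeat and read the next term
                ),
            !, close(S).                % drop the repeat choice point, then close the stream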
    

    This loads your data in finite time:

    ?- assert_from_file("snake_points.pl").
    Yes (19.38s cpu)
    

    and you can then call the resulting predicate as expected:

    ?- snake_points(X).
    X = [(2, 0, 0), (2, 0, 1), (2, 0, 2), (2, 1, 2), (2, 1, 1), (2, 1, 0), (2, 2, 0), (2, 2, 1), (2, 2, 2), (1, 2, 2), (0, 2, 2), (0, 1, 2), (0, 0, 2), (0, 0, 1), (0, 1, 1), (0, 1, 0), (0, 2, 0), (0, ..., ...), (..., ...), ...]
    Yes (0.04s cpu, solution 1, maybe more) ? ;
    

    But whatever problem you are trying to solve, this doesn't look like the most promising approach...
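
If each path only needs to be visited once, one possible alternative is to stream the file term by term and never store the clauses at all. The following is a minimal sketch and not part of the original answer: the predicate names scan_file/2 and scan_terms/3 are hypothetical, and counting the terms merely stands in for whatever per-path processing is actually needed. It uses only open/3, read/2 and close/1, so it should run in both ECLiPSe and SWI-Prolog.

```prolog
% scan_file(+File, -Count)
% Read File term by term and count the terms, without asserting
% or compiling anything. (Hypothetical helper for illustration.)
scan_file(File, Count) :-
    open(File, read, S),
    scan_terms(S, 0, Count),
    close(S).

% scan_terms(+Stream, +Count0, -Count)
% Tail-recursive loop: each term becomes garbage as soon as it
% has been handled, so memory use should stay roughly constant.
scan_terms(S, Count0, Count) :-
    read(S, Term),
    ( Term == end_of_file ->
        Count = Count0
    ;
        % Placeholder step: replace with the real per-path processing.
        Count1 is Count0 + 1,
        scan_terms(S, Count1, Count)
    ).
```

Calling `scan_file("snake_points.pl", N)` would then bind N to the number of clauses in the file. Note that this sketch does not close the stream if read/2 raises an error, for example on a syntax error in the data.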