Prolog - Do plain rules have better performance than lists?

I have a set of DCG rules (in this case german personal pronouns):

% personal pronoun (person, case, number, genus)
ppers(1,0,sg,_) --> [ich].
ppers(1,1,sg,_) --> [meiner].
ppers(1,2,sg,_) --> [mir].
ppers(1,3,sg,_) --> [mich].
ppers(2,0,sg,_) --> [du].
ppers(2,1,sg,_) --> [deiner].
ppers(2,2,sg,_) --> [dir].
ppers(2,3,sg,_) --> [dich].
...

Because they are semantically connected, it would make sense to me to keep this information by moving them into a list (grouped by person for example) instead of unrelated rules. This also makes things a bit neater:

ppers(1,sg,_,[ich, meiner, mir, mich]).
ppers(2,sg,_,[du,deiner,dir,dich]).
...

I would then select the item I want with nth0() where the case I need is the index within the list.

However, I noticed when tracing through the program, that when checking a german sentence for correct grammar and trying to find if a part is a personal pronoun, Prolog will not step through every instanc when I use the upper version (plain rules), but will crawl through every list when I use the list version below.

Does this mean that performance will be worse if I use lists and nth0 versus plain rules? Or does the Prolog tracer just not show the crawling for plain rules as it does for lists?

(I hope I could make my question obvious enough, if not I will expand.)

Solution

Most probably the speed and tracing difference is not caused by indexing (*), but by the speed and tracing difference between clause head unification and body call nth. If you really want to take advantage of indexing and want to be portable (**) across most Prolog systems, you would need to reformulate your problem for first argument indexing.

One way to do this, is via an additional predicate. Supposed you have originally these DCG rules:

cat(Attr1, .., Attrn) --> [Terminal1, .., Terminaln].
..

Transform this into:

cat(X1, .., Xn) --> [Y], cat2(Y, X1, .., Xn).

cat2(Terminal1, Attr1, .., Attrn) --> [Terminal2, .., Terminaln].
..

When we apply this to your example we would get:

% personal pronoun (person, case, number, genus)
ppers(X1,X2,X3,X4) --> [Y], ppers2(Y,X1,X2,X3,X4).

% personal pronoun 2 (first word, person, case, number, genus)
ppers2(ich,1,0,sg,_) --> [].
ppers2(meiner,1,1,sg,_) --> [].
ppers2(mir,1,2,sg,_) --> [].
ppers2(mich,1,3,sg,_) --> [].
ppers2(du,2,0,sg,_) --> [].
ppers2(deiner,2,1,sg,_) --> [].
ppers2(dir,2,2,sg,_) --> [].
ppers2(dich,2,3,sg,_) --> [].

You can do this for each category you have in your code and that is kind of a lexicon table. The above works independent on how DCGs are translated and if first argument indexing is present, it will be lightning fast.

Bye

(*) Even if your Prolog system can do multi argument indexing, it might still not do complex term indexing. To index a [ich|X] the Prolog system would need to decend into the list, but most probably it does not decend and does only index (.)/2, so that all clauses look the same and indexing has no positive effect.

(**) I guess the only common denominator among Prolog systems what concerns indexing is first argument indexing. Besides that not all Prolog systems may put a terminal into the head. Some might use =/2 in the body and some might use 'C'/3 in the body. DCGs are currently not standardized what concerns the modelling of terminals.