How to duplicate the behavior of predefined length/2 in SWI-Prolog?
In SWI-Prolog, the nondeterminism issue can be solved with CLP(FD)'s zcompare/3
, which reifies the inequality to a term that can be used for indexing:
:- use_module(library(clpfd)).
my_length(Ls, L) :-
zcompare(C, 0, L),
my_length(Ls, C, 0, L).
my_length([], =, L, L).
my_length([_|Ls], <, L0, L) :-
L1 #= L0 + 1,
zcompare(C, L1, L),
my_length(Ls, C, L1, L).
Your example is now deterministic (since recent versions of SWI-Prolog perform just-in-time indexing):
?- my_length(Ls, 3).
Ls = [_G356, _G420, _G484].
All serious Prolog implementations ship with CLP(FD), and it makes perfect sense to use it here. Ask your vendor to also implement zcompare/3
or a better alternative if it is not already available.
For a set of test cases, please refer to this table and to the current definition in the prologue. There are many more odd cases to consider.
Defining length/2
with var/nonvar
, is/2
and the like is not entirely trivial, because (is)/2
and arithmetic comparison is so limited. That is, they produce very frequently instantiation_error
s instead of succeeding accordingly. Just to illustrate that point: It is trivial to define length_sx/2
using successor-arithmetics.
length_sx([], 0).
length_sx([_E|Es], s(X)) :-
length_sx(Es, X).
This definition is pretty perfect. It even fails for length_sx(L, L)
. Alas, successor arithmetics is not supported efficiently. That is, an integer i requires O(i) space and not O(log i) as one would expect.
The definition I would have preferred is:
length_fd([],0).
length_fd([_E|Es], L0) :-
L0 #> 0,
L1 #= L0-1,
length_fd(Es, L1).
Which is the most direct translation. It is quite efficient with a known length, but otherwise the overhead of constraints behind shows. Also, there is this asymmetry:
?- length_fd(L,0+0).
false.
?- length_fd(L,0+1).
L = [_A]
; false.
However, your definition using library(clpfd)
is particularly elegant and efficient even for more elaborate cases.. It isn't as fast as the built-in length...
?- time(( length_fd(L,N),N=1000 )).
% 29,171,112 inferences, 4.110 CPU in 4.118 seconds (100% CPU, 7097691 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 1000
; ... .
?- time(( my_len_clp(L,N),N=10000 )).
% 1,289,977 inferences, 0.288 CPU in 0.288 seconds (100% CPU, 4484310 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 10000
; ... .
?- time(( length(L,N),N=10000 )).
% 30,003 inferences, 0.006 CPU in 0.006 seconds (100% CPU, 4685643 Lips)
L = [_A,_B,_C,_D,_E,_F,_G,_H,_I|...], N = 10000
; ... .
... but then it is able to handle constraints correctly:
?- N in 1..2, my_len_clp(L,N).
N = 1, L = [_A]
; N = 2, L = [_A, _B]
; false.
?- N in 1..2, length(L,N).
N = 1, L = [_A]
; N = 2, L = [_A, _B]
; loops.