Prolog programming guidelines
This is a set of reasonable guidelines for formatting Prolog programs, freely adapted from Caml guidelines and M. Covington’s Prolog coding guidelines. They aim at reflecting the consensus among the programmers of the Lifeware team.
Nevertheless, all detailed notifications of possible errors or omissions will be noted with pleasure. To send your comments: Sylvain.Soliman@inria.fr.
Thanks to all those who have already participated in the critique of this page: Thierry Martinez and Julien Martin.
Program formatting guidelines
Lexical conventions
Spaces should surround all operator symbols, with the notable exception of the comma. It has been a great step forward in typography to separate words by spaces to make written texts easier to read. Do the same in your programs if you want them to be readable.
How to write terms and pairs
A tuple is parenthesized, and each comma in it (a delimiter) is followed by a space. The same holds for the arguments of a compound term.
(1, 2)
append(List1, List2, List12)
How to write lists
Write [H | T] with spaces around the | (since | is considered an infix operator, hence surrounded by spaces) and [1, 2, 3] (since , is a delimiter, hence followed by a space). The empty list is written without spaces: [].
How to write operator symbols
Be careful to keep operator symbols well separated by spaces: not only will your formulas be more readable, but you will avoid confusion with multi-character operators.
Prefix operators (+, - or \) are not separated from their arguments.
I is X + 1
J is +2 ** -1
Criticism
The absence of spaces around an operator improves the readability of formulas when you use it to reflect the relative precedences of operators. For example X*Y + 2*Z makes it very obvious that multiplication takes precedence over addition.
Response
This is a bad idea, a chimera, because nothing in the language ensures that the spaces properly reflect the meaning of the formula. For example X * Z-1 means (X * Z) - 1, and not X * (Z - 1) as the proposed interpretation of spaces would seem to suggest. Besides, the problem of multi-character symbols would keep you from using this convention in a uniform way: you could not leave out the spaces around the multiplication to write 3*-2. Finally, this playing with the spaces is a subtle and flimsy convention, a subliminal message which is difficult to grasp on reading. If you want to make the precedences obvious, use the expressive means brought to you by the language: write parentheses.
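One way to check how a given spacing is actually read is to compare the resulting terms directly. The following is a minimal sketch; the predicate name spacing_is_ignored is purely illustrative and atoms are used in place of variables to keep the comparison simple.

```prolog
% spacing_is_ignored/0 is a hypothetical name used only for this illustration.
% It succeeds because x * z-1 is read as (x * z) - 1, whatever the spacing.
spacing_is_ignored :-
   Term = (x * z-1),
   Term == (x * z) - 1,
   Term \== x * (z - 1).
```

Calling spacing_is_ignored succeeds, which shows that the reader pays no attention to the grouping suggested by the spaces.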
How to write long character strings and atoms
Long character strings and atoms can be split by putting the string continuation character \ at the end of the line. However, since whitespace at the beginning of the next line is included in the string/atom, the next line cannot be properly indented. Strings/atoms that cannot be built from smaller parts should thus be kept on a single line. This is the only exception to the 80 columns limit (see below).
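When a long atom can be built from smaller parts, a sketch like the following keeps every line within the limit and properly indented. The predicate name usage_message and the text of the message are assumptions made for the example; atom_concat/3 is the standard built-in.

```prolog
% usage_message(-Message): hypothetical example predicate.
% The long atom is assembled from two short ones.
usage_message(Message) :-
   atom_concat(
      'usage: solve <input file> ',
      '[<output file>]',
      Message
   ).
```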
When to use parentheses within an expression
Parentheses are meaningful: they indicate the necessity of using an unusual precedence. So they should be used wisely and not sprinkled randomly throughout programs. However Prolog precedences can be tricky, so prefer over-parenthesizing to ambiguity.
Arithmetic operators: the same rules as in mathematics
For example 1 + 2 * X means 1 + (2 * X).
Boolean operators: be careful
/\ and \/ have the same precedence! For example true \/ false /\ X means (true \/ false) /\ X.
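The effect can be observed on integers, where these operators denote bitwise conjunction and disjunction. In this small sketch (the predicate name is illustrative only), the unparenthesized expression evaluates differently from the reading one might expect.

```prolog
% same_precedence_demo(-Unparenthesized, -Explicit): hypothetical name.
same_precedence_demo(Unparenthesized, Explicit) :-
   % Read as (1 \/ 0) /\ 0, which evaluates to 0.
   Unparenthesized is 1 \/ 0 /\ 0,
   % Explicit parentheses give 1 \/ (0 /\ 0), which evaluates to 1.
   Explicit is 1 \/ (0 /\ 0).
```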
And with FD operators things change again: #/\ is left associative and has precedence 720, whereas #\/ is right associative and has precedence 730.
How to write sequences of program clauses
Skip one line between clauses of a single predicate, and skip two lines between clauses of different predicates.
Indentation of programs
Width of the page
The page is 80 columns wide.
Justification
This width makes it possible to read the code on all displays and to print it in a legible font on a standard sheet.
Height of the page
A clause should always fit within one screen (of about 30 lines), or in exceptional cases two.
Justification
When a clause goes beyond one screen, it’s time to divide it into subproblems and handle them independently. Beyond a screen, one gets lost in the code. The indentation is not readable and is difficult to keep correct. Use auxiliary predicates if necessary.
What to indent
All lines except the first one of a clause should be indented. The first line should only contain the head followed by :- or . (the final period). There should be only one subgoal on each line (even for ! and nl).
When it is necessary to delimit syntactic blocks in programs, use isolated parentheses as delimiters and indent the block content: the parenthesis opening a block stands alone on its line, the block is indented, and the closing parenthesis is aligned with the opening one.
head :-
   goal1,
   goal2,
   (
      goal3
   ;
      goal4
   ;
      goal5
   ).
DCGs are indented in the same way, with --> behaving as :- and {} as a block delimiter.
p(X, Y) -->
   q(X),
   r(X, Y),
   {
      s(Y, Z)
   },
   t(Z).
Justification
The first line of the definition is set off nicely, so it’s easier to pass from definition to definition. The blocks are also clearly marked and thus stand out.
How much to indent
The change in indentation between successive blocks of the program is 3 spaces. Both spaces and tab stops (ASCII character 9) have advantages; however, hard spaces have been chosen (for instance to ease inclusion of source files into LaTeX or HTML documents). If necessary, use your editor to transform tabs into spaces.
How to indent conditionals
The conditional construct (g1 -> g2 ; g3) should be used sparingly, with parentheses for sub-blocks if necessary. Apart from that, all the symbols above (parentheses, -> and ;) should be aligned, and the goals indented in blocks.
p :-
   (
      g1
   ->
      g2
   ;
      g3
   ),
   ...
Or for a more complex case:
p :-
   (
      (
         cond1
      ;
         cond2
      ;
         cond3
      )
   ->
      then1,
      then2
   ;
      else1,
      else2
   ),
   ...
Note that a ; construct without -> (or the opposite) is just a special case of the conditional.
It is not necessary to add extraneous parentheses for successive conditionals: the aligned -> symbols already recall the usual if and elsif constructs.
Comparison
The O’Keefe convention used in “The Craft of Prolog” would lead to this:
head :-
    goal1,
    goal2,
    (   goal31,
        goal32
    ;   goal33,
        goal34
    ),
    (   goal41 ->
        goal42,
        goal43
    ;   goal44 ->
        goal45
    ;   goal46
    ).
And the Peter Ludeman version, as presented in the ALP newsletter in 1995, to that:
head :-
    goal1,
    goal2,
    (  goal31
    -> goal32
    ;  goal33,
       goal34
    ),
    (  goal41
    -> goal42,
       goal43
    ;  goal44
    -> goal45
    ;  goal46
    ).
Whereas we choose:
head :-
   goal1,
   goal2,
   (
      goal31
   ->
      goal32
   ;
      goal33,
      goal34
   ),
   (
      goal41
   ->
      goal42,
      goal43
   ;
      goal44
   ->
      goal45
   ;
      goal46
   ).
We prefer to lose some space but have clearly identified blocks and block separators.
How to indent repeat
The repeat construct is not syntactically separated from the rest of the code. To make it stand out, always create a block (with parentheses), even if it is not strictly necessary. The simplest way to enforce this, and to make the scope of the cut ending such a construct clearer, is to always finish it with a ->.
Do not write:
process_queries :-
   repeat,
   read_query(Q),
   handle_query(Q),
   Q = 'quit',
   !,
   write('Ciao...\n').
But instead write:
process_queries :-
   (
      repeat,
      read_query(Q),
      handle_query(Q),
      Q = 'quit'
   ->
      write('Ciao...\n')
   ).
How to indent for
The same reasoning as above applies, leading to:
p :-
   write('begin\n'),
   (
      for(I, 1, 100),
      J is I + 1,
      write(J),
      fail
   ;
      write('\nend\n')
   ).
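for/3 is not available in every Prolog system. Assuming a system that provides between/3 instead (SWI-Prolog does, for example), the same layout carries over unchanged. The predicate name print_successors and its two bounds are hypothetical, introduced only for this sketch.

```prolog
% print_successors(+Low, +High): hypothetical example predicate.
% Writes I + 1 for every I between Low and High, using a failure-driven loop.
print_successors(Low, High) :-
   write('begin\n'),
   (
      between(Low, High, I),
      J is I + 1,
      write(J),
      fail
   ;
      write('\nend\n')
   ).
```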
How to indent \+
Since the negation \+ imposes one level of parentheses for multiple predicates, we obtain:
p :-
   write('begin\n'),
   \+ singlepredicate(X, Y),
   \+ (
      p(X),
      q(Y)
   ),
   write('end\n').
How to indent long terms
Even with only one subgoal per line, and trying to avoid predicates with too many arguments, some terms might need to be cut in order to respect the 80 columns rule. In that case they should be cut at a parenthesis, which opens a block, with the follow-up indented and at most one argument per line, or, if that is impossible, after an operator.
p :-
   catch(
      (
         goal1(
            argument1,
            argument2,
            argument3
         ),
         goal2,
         goal3
      ),
      catcher,
      (
         recovery1,
         recovery2
      )
   ),
   ...


q(Result) :-
   ...
   Result is A + B + C + D + E + F + G +
      H + I + J + K.
In the last example, introducing intermediary variables would be better.
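As a sketch of what intermediary variables could look like here, the sum can be computed in two named steps. The predicate name sum_of_terms, the choice of passing the summands as a list, and the variable names FirstPart and SecondPart are all assumptions made for the illustration.

```prolog
% sum_of_terms(+Terms, -Result): hypothetical example predicate taking
% the eleven summands as a list.
sum_of_terms([A, B, C, D, E, F, G, H, I, J, K], Result) :-
   FirstPart is A + B + C + D + E + F + G,
   SecondPart is H + I + J + K,
   Result is FirstPart + SecondPart.
```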
How to comment programs
Do not hesitate to comment when there’s a difficulty. Conversely, if there’s no difficulty, there’s no point in commenting.
Avoid comments in the bodies of clauses
When possible, prefer one comment at the beginning of the predicate which explains how the algorithm that is used works. If a comment is inserted in the body, put it just before the concerned code (line or block).
Avoid vacuous comments
A vacuous comment is a comment that does not add any value, i.e. no non-trivial information. The vacuous comment is evidently of no interest; it is a nuisance since it uselessly distracts the reader. It is often used to fulfill some strange criteria related to the so-called software metrology, for instance the ratio number of comments / number of lines of code that measures something of no theoretical or practical interpretation. Absolutely avoid vacuous comments.
Thus avoid:
factorial(0, 1).                     % Factorial of 0 is 1.

factorial(N, FactN) :-
   N > 0,                            % N is positive
   Nminus1 is N - 1,                 % Calculate N minus 1
   factorial(Nminus1, FactNminus1),  % recursion
   FactN is N * FactNminus1.         % N! = N * (N - 1)!
and prefer:
% factorial(N, FactN) computes FactN as factorial of N.
% 0! = 1
factorial(0, 1).

factorial(N, FactN) :-
   N > 0,
   Nminus1 is N - 1,
   factorial(Nminus1, FactNminus1),
   % N! = N * (N - 1)!
   FactN is N * FactNminus1.
Use standardized comments
We use the same type of comments as the pldoc package of SWI-Prolog.
File/module comments
These comments should indicate the contents (free text, a few lines) and author of the file (@author tag). If necessary, the license and copyright can also be added with the @license and @copyright tags.
% Hello World
%
% This file prints out the "Hello world!" string on the standard output
%
% @author Myself
% @license GPL
% @copyright 2008, INRIA, Projet Lifeware
Predicate preambles
The predicate preamble should first indicate the type and mode description(s) on a line (or lines) starting with a double comment sign. Then comes a text description of what the predicate does, and finally optional explanatory tags like @param, @throws, etc.
%% factorial(+N:int, ?FactN:int)
%
% computes FactN as factorial of N.
% N! = N * (N - 1)!
% 0! = 1
%
% @param N input integer
% @param FactN result of the computation
factorial(0, 1).

factorial(N, FactN) :-
   N > 0,
   Nminus1 is N - 1,
   factorial(Nminus1, FactNminus1),
   FactN is N * FactNminus1.
Use assertions
Use assertions as much as possible: they let you avoid verbose comments, while allowing a useful verification upon execution.
For example, the conditions for the arguments of a predicate to be valid are usefully verified by assertions.
assertion(Condition, Error) :-
   (
      Condition,
      !
   ;
      throw(error(Error))
   ).


p(X) :-
   ...
   assertion(X > 0, 'non positive argument in p/1'),
   ...
Note as well that an assertion is often preferable to a comment because it’s more trustworthy: an assertion is forced to be pertinent because it is verified upon each execution, while a comment can quickly become obsolete and then becomes actually detrimental to the comprehension of the program.
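The following self-contained sketch shows such a check in use. It is not the team's helper: ensure/2 and double/2 are hypothetical names, and ensure/2 is written with the conditional layout recommended above instead of a cut.

```prolog
% ensure(+Condition, +Error): hypothetical helper with the same role as
% assertion/2 above, written with the conditional layout.
ensure(Condition, Error) :-
   (
      call(Condition)
   ->
      true
   ;
      throw(error(Error))
   ).


% double(+X, -Y): Y is twice X; X must be positive.
double(X, Y) :-
   ensure(X > 0, 'non positive argument in double/2'),
   Y is 2 * X.
```

A caller that wants to recover can wrap the call, for instance catch(double(-1, Y), error(Message), true), which binds Message to the atom given in the check.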
Comments line by line in imperative code
When writing difficult code, and particularly highly imperative code with many modifications of global variables or asserts and retracts, it is sometimes necessary to comment inside the body of clauses to explain the implementation of the algorithm, or to follow successive modifications of the invariants that the predicate must maintain. Once more, if there is some difficulty, commenting is mandatory, for each program line if necessary.
How to choose identifiers
It is hard to choose identifiers whose name evokes the meaning of the corresponding portion of the program. This is why you must devote particular care to this, emphasizing clarity and regularity of nomenclature.
Try to keep names pronounceable and avoid two names with the same pronunciation. Prefer writing words that represent numbers as numbers rather than as words (pred2 rather than pred_two).
Predicates should usually have as name a noun (or noun phrase), adjective, prepositional phrase or verb phrase:
- sorted_list, well_formed_tree, parent (nouns or noun phrases);
- well_formed, ascending (adjectives);
- in_tree, between_limits (prepositional phrases);
- contains_duplicates, has_sublists (indicative verb phrases).
If a predicate is understood procedurally, that is, its job is to do something rather than to verify a property, its name should be an imperative verb phrase (e.g. remove_duplicates and not removes_duplicates).
Separate words in predicate names (and all term functors) by underscores
Write int_of_string, not intOfString. This keeps capital letters exclusively as markers of variables.
Identify auxiliary predicates with _aux, _rec, _x, etc.
_aux is the standard suffix for auxiliary predicates; however, if two predicates are related and neither is really an auxiliary of the other, _rec might be useful. If a hierarchy of auxiliary predicates is used, then _x, _xx and so on allow for an easy identification of such a sequence.
Do not use abbreviations for global names
Global identifiers (including predicate names and global variables) can be long, because it’s important to understand what purpose they serve far from their definition.
Order the arguments
When meaningful, place arguments in the following order: inputs, intermediate results (used especially in auxiliary predicates), and final results.
Also consider how arguments map to English: mother_of(A, B) is ambiguous since it can be read as “A is the mother of B” or “the mother of A is B”. Naming the predicate mother_child, in the same order as its arguments, would eliminate the ambiguity.
Always give the same descriptive name to arguments which have the same meaning
If necessary, make this nomenclature explicit in a comment at the top of the file; if there are several arguments with the same meaning, attach numeral suffixes to them.
Construct variable names with mixed-case letters, using capitalization to set off words
E.g. ResultsSoFar instead of Results_so_far.
Local identifiers can be brief, and should be reused from one predicate to another
This augments regularity of style. Specifically, use:
- I, J, K, L, M, N for integers;
- L, L1, L2, L3 for lists;
- C, C1, C2, C3 for single characters or ASCII codes;
- A, B, C, ... , X, Y, Z for arbitrary terms;
- H and T for head and tail of a list (when better names are not conveniently available, like [T | Trees]).
Compiler warnings
Compiler warnings are meant to prevent potential errors; this is why you absolutely must heed them and correct your programs if compiling them produces such warnings. Besides, programs whose compilation produces warnings have an odor of amateurism which certainly does not suit your own work!
Notably, use anonymous variables to avoid the Singleton variable warning. You can still give them a useful name by adding it (starting with a capital letter) after the underscore.
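For instance, in the following one-clause sketch (first_element is a hypothetical predicate name), the tail of the list is never used. Naming it _Rest keeps the clause readable while avoiding the warning.

```prolog
% first_element(+List, -First): hypothetical example predicate.
% _Rest documents the ignored tail without triggering the warning.
first_element([First | _Rest], First).
```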
Programming guidelines
How to program
Always put your handiwork back on the bench, and then polish it and re-polish it.
Write simple and clear programs
When this is done, reread, simplify and clarify. At every stage of creation, use your head!
Subdivide your programs into short predicates
Easier to maintain.
Factor out snippets of repeated code by defining them in separate predicates.
The sharing of code obtained in this way facilitates maintenance since every correction or improvement automatically spreads throughout the program. Besides, the simple act of isolating and naming a snippet of code sometimes lets you identify an unsuspected feature.
Never copy-paste code when programming
Pasting code almost surely indicates a lack of code sharing and a failure to identify and write a useful auxiliary predicate; hence, it means that some code sharing is lost in the program. Losing code sharing implies that you will have more problems afterwards for maintenance: a bug in the pasted code has to be corrected in each copy of the code!
Moreover, it is difficult to identify that the same set of 10 lines of code is repeated 20 times throughout the program. By contrast, if an auxiliary predicate defines those 10 lines, it is fairly easy to see and find where those lines are used: that’s simply where the predicate is called. If code is copy-pasted all over the place then the program is more difficult to understand.
In conclusion, copy-pasting code leads to programs that are more difficult to read and more difficult to maintain: it has to be banished.
Exceptions
Don’t be afraid to define your own exceptions in your programs, but on the other hand use as much as possible the exceptions predefined by the system. When raising such exceptions include all the information that will be necessary for debugging (for instance what data was of a type different than expected).
Do not forget to handle the exceptions which may be raised by a call with the help of a catch. Avoid catching exceptions which were not meant for this level, and if necessary re-throw them after having reset a local invariant.
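A minimal sketch of catching only what belongs to the current level: the catcher below matches the standard division-by-zero error term and nothing else, so any other exception keeps propagating to the caller. The predicate name safe_quotient and the atom undefined are assumptions made for the example.

```prolog
% safe_quotient(+X, +Y, -Q): hypothetical example predicate.
% Q is the integer quotient of X by Y, or the atom undefined if Y is 0.
% Only the division-by-zero error is caught; any other error propagates.
safe_quotient(X, Y, Q) :-
   catch(
      Q is X // Y,
      error(evaluation_error(zero_divisor), _),
      Q = undefined
   ).
```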
When to use “mutables”
Mutable values (global variables or assert/retracts) are useful and sometimes indispensable to simple and clear programming. Nevertheless, you must use them with discernment. Use them only when necessary.
How to optimize programs
Pseudo law of optimization
No optimization a priori. No optimization a posteriori either.
Above all program simply and clearly. Don’t start optimizing until the program bottleneck has been identified (in general a few routines). Then optimization consists above all of changing the complexity of the algorithm used. This often happens through redefining the data structures being manipulated and completely rewriting the part of the program which poses a problem.
Justification
Clarity and correctness of programs take precedence. Besides, in a substantial program, it is practically impossible to identify a priori the parts of the program whose efficiency is of prime importance.
When necessary, do not forget about tail recursion, difference lists and garbage collection through backtracking.
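As an illustration of the first technique, a factorial predicate can be made tail-recursive with an accumulator. This is only a sketch: the auxiliary predicate factorial_aux/3 and its argument names are introduced here for the example, following the _aux suffix and the inputs, intermediate results, final results argument order.

```prolog
% factorial(+N, -FactN): FactN is the factorial of the non-negative integer N.
factorial(N, FactN) :-
   N >= 0,
   factorial_aux(N, 1, FactN).


% factorial_aux(+N, +Acc, -FactN): Acc holds the product computed so far.
factorial_aux(0, Acc, Acc).

factorial_aux(N, Acc, FactN) :-
   N > 0,
   Acc1 is N * Acc,
   Nminus1 is N - 1,
   factorial_aux(Nminus1, Acc1, FactN).
```

The recursive call is the last goal of the clause, so a system that performs last-call optimization does not need to grow the stack with N.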
Managing program development
We give here tips from veteran programmers, which have served in developing the compilers which are good examples of large complex programs developed by small teams.
How to edit programs
Use an editor providing syntax coloring and indentation.
The following two commands are considered indispensable:
- Emacs: CTRL-C-CTRL-C or Meta-X compile; Vi: :make. Launches re-compilation from within the editor (using the make command).
- Emacs: CTRL-X-`; Vi: :cn. Puts the cursor in the file and at the exact place where the compiler has signaled an error (or the next error).
The ESC-/ command (dynamic-abbrev-expand) for Emacs and CTRL-N (complete) for Vi automatically completes the word in front of the cursor with one of the words present in one of the files being edited. Thus this lets you always choose meaningful identifiers without the tedium of having to type extended names in your programs: it easily completes the identifier after typing the first letters and can either propose possible completions one after the other or present a menu of all available ones.
Using tags (etags or ctags) lets you easily navigate to the file where some predicate is defined and back.
How to develop as a team: version control
Users of the CVS and SVN version control systems never run out of good things to say about the productivity gains they bring. These systems support managing development by a team of programmers while imposing consistency among them, and also maintain a log of changes made to the software. Documentation about them is available in the Lifeware doc repository.