Prolog programming guidelines

This is a set of reasonable guidelines for formatting Prolog programs, freely adapted from Caml guidelines and M. Covington’s Prolog coding guidelines. They aim at reflecting the consensus among the programmers of the Lifeware team.

Nevertheless, detailed reports of possible errors or omissions are welcome. Send your comments to Sylvain.Soliman@inria.fr.

Thanks to all those who have already participated in the critique of this page: Thierry Martinez and Julien Martin.

Program formatting guidelines

Lexical conventions

Spaces should surround all operator symbols, with the notable exception of the comma. It has been a great step forward in typography to separate words by spaces to make written texts easier to read. Do the same in your programs if you want them to be readable.

How to write terms and pairs

A tuple is parenthesized, and each comma in it (a delimiter) is followed by a space; the same holds for the arguments of a term.

(1, 2)
append(List1, List2, List12)

How to write lists

Write [H | T] with spaces around the | (since | is considered as an infix operator, hence surrounded by spaces) and [1, 2, 3] (since , is a delimiter, hence followed by a space). The empty list is written without spaces [].
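
As a small illustration of these conventions, here is a sketch of a predicate computing the length of a list; the name list_length is chosen for this example only.

```prolog
% list_length(L, N): N is the number of elements of the list L.
list_length([], 0).

list_length([_ | T], N) :-
   list_length(T, M),
   N is M + 1.
```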

How to write operator symbols

Be careful to keep operator symbols well separated by spaces: not only will your formulas be more readable, but you will avoid confusion with multi-character operators.

Prefix operators (+, - or \) are not separated from their arguments.

I is X + 1
J is +2 ** -1

Criticism

The absence of spaces around an operator improves the readability of formulas when you use it to reflect the relative precedences of operators. For example X*Y + 2*Z makes it very obvious that multiplication takes precedence over addition.

Response

This is a bad idea, a chimera, because nothing in the language ensures that the spaces properly reflect the meaning of the formula. For example X * Z-1 means (X * Z) - 1, and not X * (Z - 1) as the proposed interpretation of spaces would seem to suggest. Besides, the problem of multi-character symbols would keep you from using this convention in a uniform way: you could not leave out the spaces around the multiplication to write 3*-2. Finally, this playing with the spaces is a subtle and flimsy convention, a subliminal message which is difficult to grasp on reading. If you want to make the precedences obvious, use the expressive means brought to you by the language: write parentheses.
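
The parse described above can be checked directly. The following sketch assumes only the standard operator table and plain unification; the predicate name spacing_ignored is invented for the example. It decomposes the term X * Z-1 and confirms that its principal functor is -, not *.

```prolog
% spacing_ignored succeeds because the reader parses X * Z-1 as
% (X * Z) - 1, whatever the layout suggests.
spacing_ignored :-
   T = (X * Z-1),
   T = (Left - Right),
   Left == X * Z,
   Right == 1.
```

Calling spacing_ignored at the top level succeeds, which would not be the case if the spaces influenced the grouping.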

How to write long character strings and atoms

Long character strings and atoms can be split by putting the string continuation character \ at the end of the line. However, since whitespace at the beginning of the next line will be included in the string or atom, the next line cannot be properly indented. Strings and atoms that cannot be built from smaller parts should thus be kept on a single line. This is the only exception to the 80-column limit (see below).
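
When a long atom can be built from smaller parts, the standard atom_concat/3 keeps every source line short and properly indented. A minimal sketch, in which the predicate name usage_message and the message text are invented for the example:

```prolog
% usage_message(Message): Message is a long atom assembled from two
% shorter ones, so that no source line exceeds 80 columns.
usage_message(Message) :-
   atom_concat(
      'usage: solve [options] <file>; ',
      'run solve --help for the list of options',
      Message
   ).
```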

When to use parentheses within an expression

Parentheses are meaningful: they indicate the necessity of using an unusual precedence. So they should be used wisely and not sprinkled randomly throughout programs. However Prolog precedences can be tricky, so prefer over-parenthesizing to ambiguity.

Arithmetic operators: the same rules as in mathematics

For example 1 + 2 * X means 1 + (2 * X).

Boolean operators: be careful

/\ and \/ have the same precedence! For example true \/ false /\ X means (true \/ false) /\ X.

And with FD operators things change again: #/\ is left associative and has precedence 720, whereas #\/ is right associative and has precedence 730.
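
For /\ and \/ used on integers, the effect of the shared precedence can be observed with is/2. This is only a sketch: the two predicate names are hypothetical, and it relies on the standard operator table, in which both operators are left associative with priority 500.

```prolog
% bitwise_default(X): without parentheses the chain groups to the left,
% so X is (1 \/ 0) /\ 0, that is 0.
bitwise_default(X) :-
   X is 1 \/ 0 /\ 0.


% bitwise_explicit(X): X is 1 \/ (0 /\ 0), that is 1.
bitwise_explicit(X) :-
   X is 1 \/ (0 /\ 0).
```

The two results differ, which is a good reason to write the parentheses explicitly.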

How to write sequences of program clauses

Skip one line between clauses of a single predicate, and skip two lines between clauses of different predicates.

Indentation of programs

Width of the page

The page is 80 columns wide.

Justification

This width makes it possible to read the code on all displays and to print it in a legible font on a standard sheet.

Height of the page

A clause should always fit within one screen (of about 30 lines), or in exceptional cases two.

Justification

When a clause goes beyond one screen, it’s time to divide it into subproblems and handle them independently. Beyond a screen, one gets lost in the code. The indentation is not readable and is difficult to keep correct. Use auxiliary predicates if necessary.

What to indent

All lines of a clause except the first should be indented. The first line should contain only the head, followed by :- (or by the final period for a fact). There should be only one subgoal on each line (even for ! and nl).

When it is necessary to delimit syntactic blocks in programs, use isolated parentheses as delimiters: the opening parenthesis stands alone on its line, the block content is indented, and the closing parenthesis is aligned with the opening one.

head :-
   goal1,
   goal2,
   (
      goal3
   ;
      goal4
   ;
      goal5
   ).

DCGs are indented in the same way with --> behaving as :-, and {} as a block delimiter.

p(X, Y) -->
   q(X),
   r(X, Y),
   {
      s(Y, Z)
   },
   t(Z).

Justification

The first line of the definition is set off nicely, so it’s easier to pass from definition to definition. The blocks are also clearly marked and thus stand out.

How much to indent

The change in indentation between successive blocks of the program is 3 spaces. Both spaces and tab stops (ASCII character 9) have advantages; however, hard spaces have been chosen (for instance to ease inclusion of source files into LaTeX or HTML documents). If necessary, use your editor to transform tabs into spaces.

How to indent conditionals

The conditional construct (g1 -> g2 ; g3) should be used sparingly, with parentheses around sub-blocks where necessary. Apart from that, all the symbols above (parentheses, -> and ;) should be aligned, and the goals indented inside the blocks.

p :-
   (
      g1
   ->
      g2
   ;
      g3
   ),
   ...

Or for a more complex case:

p :-
   (
      (
         cond1
      ;
         cond2
      ;
         cond3
      )
   ->
      then1,
      then2
   ;
      else1,
      else2
   ),
   ...

Note that a ; construct without -> (or the opposite) is just a special case of the conditional.

It is not necessary to add extra parentheses for successive conditionals: the aligned -> symbols are already reminiscent of the usual if and elsif constructs.
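
As a concrete sketch of such a chain of successive conditionals, the following predicate classifies a number by its sign. The predicate name number_sign and the three result atoms are invented for the example.

```prolog
% number_sign(X, S): S is the atom negative, zero or positive according
% to the sign of the number X.
number_sign(X, S) :-
   (
      X < 0
   ->
      S = negative
   ;
      X =:= 0
   ->
      S = zero
   ;
      S = positive
   ).
```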

Comparison

The O’Keefe convention used in “The Craft of Prolog” would lead to this:

head :-
   goal1,
   goal2,
   (  goal31,
      goal32
   ;  goal33,
      goal34
   ),
   (  goal41 ->
      goal42,
      goal43
   ;  goal44 ->
      goal45
   ;  goal46
   ).

And the Peter Ludemann version, as presented in the ALP newsletter in 1995, to that:

head :-
   goal1,
   goal2,
   (  goal31
   -> goal32
   ;  goal33,
      goal34
   ),
   (  goal41
   -> goal42,
      goal43
   ;  goal44
   -> goal45
   ;  goal46
   ).

Whereas we choose:

head :-
   goal1,
   goal2,
   (
      goal31
   ->
      goal32
   ;
      goal33,
      goal34
   ),
   (
      goal41
   ->
      goal42,
      goal43
   ;
      goal44
   ->
      goal45
   ;
      goal46
   ).

We prefer to lose some space but have clearly identified blocks and block separators.

How to indent repeat

The repeat construct is not visually separated from the rest of the code. To make it stand out, always create a block (with parentheses), even if it is not strictly necessary. The simplest way to enforce this, and to make the scope of the cut ending such a construct clearer, is to always finish it with a ->.

Do not write:

process_queries :-
   repeat,
   read_query(Q),
   handle_query(Q),
   Q = 'quit',
   !,
   write('Ciao...\n').

But instead write:

process_queries :-
   (
      repeat,
      read_query(Q),
      handle_query(Q),
      Q = 'quit'
   ->
      write('Ciao...\n')
   ).

How to indent for

The same reasoning as above applies, leading to:

p :-
   write('begin\n'),
   (
      for(I, 1, 100),
      J is I + 1,
      write(J),
      fail
   ;
      write('\nend\n')
   ).
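
Not every Prolog system provides for/3. As a sketch, assuming a system that offers between/3 instead (SWI-Prolog does, for example), the same failure-driven loop keeps exactly the same layout; the name p_between is hypothetical.

```prolog
% p_between: prints the integers from 2 to 101 between two markers,
% using a failure-driven loop over between/3.
p_between :-
   write('begin\n'),
   (
      between(1, 100, I),
      J is I + 1,
      write(J),
      fail
   ;
      write('\nend\n')
   ).
```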

How to indent \+

Since the negation \+ imposes one level of parentheses for multiple predicates we obtain:

p :-
   write('begin\n'),
   \+ singlepredicate(X, Y),
   \+ (
      p(X),
      q(Y)
   ),
   write('end\n').

How to indent long terms

Even with only one subgoal per line, and while trying to avoid predicates with too many arguments, some terms might need to be cut in order to respect the 80-column rule. In that case, cut them at a parenthesis, which opens a block: the follow-up is indented, with at most one argument per line. If that is impossible, cut after an operator.

p :-
   catch(
      (
         goal1(
            argument1,
            argument2,
            argument3
         ),
         goal2,
         goal3
      ),
      catcher,
      (
         recovery1,
         recovery2
      )
   ),
   ...


q(Result) :-
   ...
   Result is A + B + C + D + E + F + G +
      H + I + J + K.

In the last example, introducing intermediary variables would be better.
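
A minimal sketch of what intermediate variables could look like, on a smaller sum than the one above. The predicate name sum_of_six and the variable names FirstHalf and SecondHalf are invented for the example.

```prolog
% sum_of_six(Values, Result): Result is the sum of the six numbers in the
% list Values, computed through two named partial sums.
sum_of_six([A, B, C, D, E, F], Result) :-
   FirstHalf is A + B + C,
   SecondHalf is D + E + F,
   Result is FirstHalf + SecondHalf.
```

Each line stays short, and the names of the partial results document what is being added.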

How to comment programs

Do not hesitate to comment when there’s a difficulty. Conversely, if there’s no difficulty, there’s no point in commenting.

Avoid comments in the bodies of clauses

When possible, prefer one comment at the beginning of the predicate which explains how the algorithm that is used works. If a comment is inserted in the body, put it just before the concerned code (line or block).

Avoid vacuous comments

A vacuous comment is a comment that does not add any value, i.e. no non-trivial information. The vacuous comment is evidently of no interest; it is a nuisance since it uselessly distracts the reader. It is often used to fulfill some strange criteria related to the so-called software metrology, for instance the ratio number of comments / number of lines of code that measures something of no theoretical or practical interpretation. Absolutely avoid vacuous comments.

Thus avoid:

factorial(0, 1).                    % Factorial of 0 is 1.
factorial(N, FactN) :-
   N > 0,                           % N is positive
   Nminus1 is N - 1,                % Calculate N minus 1
   factorial(Nminus1, FactNminus1), % recursion
   FactN is N * FactNminus1.        % N! = N * (N - 1)!

and prefer:

% factorial(N, FactN) computes FactN as the factorial of N.

% 0! = 1
factorial(0, 1).

factorial(N, FactN) :-
   N > 0,
   Nminus1 is N - 1,
   factorial(Nminus1, FactNminus1),
   % N! = N * (N - 1)!
   FactN is N * FactNminus1.

Use standardized comments

We use the same type of comments as the pldoc package of SWI-Prolog.

File/module comments

These comments should indicate the contents (free text, a few lines) and author of the file (@author tag). If necessary, the license and copyright can also be added with the @license and @copyright tags.

% Hello World
%
% This file prints out the "Hello world!" string on the standard output
%
% @author Myself
% @license GPL
% @copyright 2008, INRIA, Projet Lifeware

Predicate preambles

The predicate preamble should indicate first the type and mode description(s) on a line (lines) starting with a double comment sign. Then comes a text description of what the predicate does, and finally optional explanatory tags like @param, @throws, etc.

%% factorial(+N:int, ?FactN:int)
%
% computes FactN as factorial of N.
% N! = N * (N - 1)!
% 0! = 1
%
% @param N input integer
% @param FactN result of the computation

factorial(0, 1).

factorial(N, FactN) :-
   N > 0,
   Nminus1 is N - 1,
   factorial(Nminus1, FactNminus1),
   FactN is N * FactNminus1.

Use assertions

Use assertions as much as possible: they let you avoid verbose comments, while allowing a useful verification upon execution.

For example, the conditions for the arguments of a predicate to be valid are usefully verified by assertions.

assertion(Condition, Error) :-
   (
      Condition,
      !
   ;
      throw(error(Error))
   ).


p(X) :-
   ...
   assertion(X > 0, 'non positive argument in p/1'),
   ...

Note as well that an assertion is often preferable to a comment because it’s more trustworthy: an assertion is forced to be pertinent because it is verified upon each execution, while a comment can quickly become obsolete and then becomes actually detrimental to the comprehension of the program.
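
To show an assertion at work, here is a self-contained sketch. It restates an assertion/2 helper so that the block loads on its own, written with -> rather than a cut, which is a choice made for this example. The predicate checked_sqrt and its error message are hypothetical.

```prolog
% assertion(Condition, Error): succeeds if Condition holds, otherwise
% throws error(Error).
assertion(Condition, Error) :-
   (
      call(Condition)
   ->
      true
   ;
      throw(error(Error))
   ).


% checked_sqrt(X, Y): Y is the square root of X; X must not be negative.
checked_sqrt(X, Y) :-
   assertion(X >= 0, 'negative argument in checked_sqrt/2'),
   Y is sqrt(X).
```

A caller can observe the failed assertion with catch(checked_sqrt(-1, _), error(E), true), which binds E to the message.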

Comments line by line in imperative code

When writing difficult code, and particularly highly imperative code with many modifications of global variables or many asserts and retracts, it is sometimes mandatory to comment inside the body of clauses to explain the implementation of the algorithm encoded there, or to follow successive modifications of the invariants that the predicate must maintain. Once more, if there is some difficulty, commenting is mandatory, for each program line if necessary.
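
As an illustration, here is a sketch of a small imperative counter kept in the database, where comments inside the body track an invariant. The names counter/1 and increment_counter are invented for the example.

```prolog
:- dynamic(counter/1).

counter(0).


% increment_counter: adds 1 to the value stored in the counter/1 fact.
% Invariant: exactly one counter/1 fact exists before and after the call.
increment_counter :-
   % remove the current value; from here on the invariant is broken
   retract(counter(N)),
   N1 is N + 1,
   % store the new value; the invariant holds again
   assertz(counter(N1)).
```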

How to choose identifiers

It is hard to choose identifiers whose name evokes the meaning of the corresponding portion of the program. This is why you must devote particular care to this, emphasizing clarity and regularity of nomenclature.

Try to keep names pronounceable and avoid two names with the same pronunciation. Prefer writing words that represent numbers as numbers rather than as words (pred2 rather than pred_two).

Predicates should usually have as name a noun (or noun phrase), adjective, prepositional phrase or verb phrase:

  • sorted_list, well_formed_tree, parent (nouns or noun phrases);
  • well_formed, ascending (adjectives);
  • in_tree, between_limits (prepositional phrases);
  • contains_duplicates, has_sublists (indicative verb phrases).

If a predicate is understood procedurally, that is, its job is to do something, rather than to verify a property, its name should be an imperative verb phrase (e.g. remove_duplicates and not removes_duplicates).

Separate words in predicate names (and all term functors) by underscores

int_of_string, not intOfString: this keeps capital letters as markers of variables only.

Identify auxiliary predicates with _aux, _rec, _x, etc.

_aux is the standard suffix for auxiliary predicates, however if two predicates are related and none is really an auxiliary of the other _rec might be useful. If a hierarchy of auxiliary predicates is used, then _x, _xx and so on allows for an easy identification of such a sequence.
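
A sketch of the _aux convention, with hypothetical predicate names: list_sum/2 is the public predicate and list_sum_aux/3 carries an accumulator. The arguments of the auxiliary predicate are in the order input, intermediate result, final result.

```prolog
% list_sum(L, Sum): Sum is the sum of the numbers in the list L.
list_sum(L, Sum) :-
   list_sum_aux(L, 0, Sum).


% list_sum_aux(L, Acc, Sum): Sum is Acc plus the sum of the numbers in L.
list_sum_aux([], Sum, Sum).

list_sum_aux([H | T], Acc, Sum) :-
   NewAcc is Acc + H,
   list_sum_aux(T, NewAcc, Sum).
```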

Do not use abbreviations for global names

Global identifiers (including predicate names and global variables) can be long, because it’s important to understand what purpose they serve far from their definition.

Order the arguments

When meaningful, place arguments in the following order: inputs, intermediate results (used especially in auxiliary predicates), and final results.

Also consider how arguments map to English: mother_of(A, B) is ambiguous since it can be read as “A is the mother of B” or “the mother of A is B”. Naming the predicate mother_child correspondingly with its arguments would eliminate the ambiguity.

Always give the same descriptive name to arguments which have the same meaning

If necessary, make this nomenclature explicit in a comment at the top of the file; if there are several arguments with the same meaning, attach numeral suffixes to them.

Construct variable names with mixed-case letters, using capitalization to set off words

E.g. ResultsSoFar instead of Results_so_far.

Local identifiers can be brief, and should be reused from one predicate to another

This augments regularity of style. Specifically, use:

  • I, J, K, L, M, N for integers;
  • L, L1, L2, L3 for lists;
  • C, C1, C2, C3 for single characters or ASCII codes;
  • A, B, C, ... , X, Y, Z for arbitrary terms;
  • H and T for head and tail of a list (when better names are not conveniently available, like [T | Trees]).

Compiler warnings

Compiler warnings are meant to prevent potential errors; this is why you absolutely must heed them and correct your programs if compiling them produces such warnings. Besides, programs whose compilation produces warnings have an odor of amateurism which certainly does not suit your own work!

Notably, use anonymous variables to avoid the Singleton variable warning. You can still give them a useful name by adding it (starting with a capital letter) after the underscore.
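
For instance, in the following sketch (the predicate name first_element is invented for the example), the tail of the list is never used. Writing it as _Tail keeps the name informative while avoiding the warning.

```prolog
% first_element(L, X): X is the first element of the non-empty list L.
% The tail is not used, so it is named _Tail rather than Tail, which
% would trigger a singleton variable warning.
first_element([X | _Tail], X).
```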

Programming guidelines

How to program

Always put your handiwork back on the bench, and then polish it and re-polish it.

Write simple and clear programs

When this is done, reread, simplify and clarify. At every stage of creation, use your head!

Subdivide your programs into short predicates

Short predicates are easier to maintain.

Factor out snippets of repeated code by defining them in separate predicates.

The sharing of code obtained in this way facilitates maintenance since every correction or improvement automatically spreads throughout the program. Besides, the simple act of isolating and naming a snippet of code sometimes lets you identify an unsuspected feature.

Never copy-paste code when programming

Pasting code almost surely means neglecting to identify and write a useful auxiliary predicate; hence, some code sharing is lost in the program. Losing code sharing implies that you will have more problems afterwards for maintenance: a bug in the pasted code has to be corrected at each occurrence of the bug in each copy of the code!

Moreover, it is difficult to identify that the same set of 10 lines of code is repeated 20 times throughout the program. By contrast, if an auxiliary predicate defines those 10 lines, it is fairly easy to see and find where those lines are used: that’s simply where the predicate is called. If code is copy-pasted all over the place then the program is more difficult to understand.

In conclusion, copy-pasting code leads to programs that are more difficult to read and more difficult to maintain: it has to be banished.

Exceptions

Don’t be afraid to define your own exceptions in your programs, but on the other hand use as much as possible the exceptions predefined by the system. When raising such exceptions include all the information that will be necessary for debugging (for instance what data was of a type different than expected).

Do not forget to handle the exceptions which may be raised by a call with the help of a catch. Avoid catching exceptions which were not for this level, and if necessary re-throw them after having reset a local invariant.
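
The following sketch illustrates both points with the standard type_error term. The predicates double/2 and double_or_zero/2 are hypothetical, and the second argument of error/2 is used here simply to record the predicate indicator. The handler only catches the type error it knows how to deal with, so any other exception propagates to the caller.

```prolog
% double(X, Y): Y is twice the integer X.
% Throws a type_error carrying the offending value if X is not an integer.
double(X, Y) :-
   (
      integer(X)
   ->
      Y is 2 * X
   ;
      throw(error(type_error(integer, X), double/2))
   ).


% double_or_zero(X, Y): like double/2, but Y is 0 when X is not an integer.
double_or_zero(X, Y) :-
   catch(
      double(X, Y),
      error(type_error(integer, _), _),
      Y = 0
   ).
```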

When to use “mutables”

Mutable values (global variables or assert/retracts) are useful and sometimes indispensable to simple and clear programming. Nevertheless, you must use them with discernment. Use them only when necessary.

How to optimize programs

Pseudo law of optimization

No optimization a priori. No optimization a posteriori either.

Above all program simply and clearly. Don’t start optimizing until the program bottleneck has been identified (in general a few routines). Then optimization consists above all of changing the complexity of the algorithm used. This often happens through redefining the data structures being manipulated and completely rewriting the part of the program which poses a problem.

Justification

Clarity and correctness of programs take precedence. Besides, in a substantial program, it is practically impossible to identify a priori the parts of the program whose efficiency is of prime importance.

When necessary, do not forget about tail recursion, difference lists and garbage collection through backtracking.
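
As a reminder of what a difference list buys, here is a sketch that collects the elements of a binary tree in infix order without any call to append/3. The tree representation (leaf and node(Left, Element, Right)) and the predicate names are assumptions made for the example.

```prolog
% tree_elements(Tree, L): L is the list of the elements of Tree in infix
% order. Trees are written leaf or node(Left, Element, Right).
tree_elements(Tree, L) :-
   tree_elements_aux(Tree, L, []).


% tree_elements_aux(Tree, L, Rest): L is the list of the elements of Tree
% followed by Rest, so that L and Rest together form a difference list.
tree_elements_aux(leaf, L, L).

tree_elements_aux(node(Left, Element, Right), L, Rest) :-
   tree_elements_aux(Left, L, [Element | Middle]),
   tree_elements_aux(Right, Middle, Rest).
```

Each element is consed onto the result exactly once, so the traversal is linear in the size of the tree.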

Managing program development

We give here tips from veteran programmers, which have served in developing compilers, themselves good examples of large complex programs developed by small teams.

How to edit programs

Use an editor providing syntax coloring and indentation.

The following two commands are considered indispensable:

  • Emacs: CTRL-C-CTRL-C or Meta-X compile; Vi: :make. This launches recompilation from within the editor (using the make command).
  • Emacs: CTRL-X-`; Vi: :cn. This puts the cursor in the file, at the exact place where the compiler has signaled an error (or the next error).

The ESC-/ command (dynamic-abbrev-expand) for Emacs and CTRL-N (complete) for Vi automatically complete the word in front of the cursor with one of the words present in one of the files being edited. This lets you always choose meaningful identifiers without the tedium of having to type extended names in your programs: the command completes the identifier after you type its first letters, and can either propose possible completions one after the other or present a menu of all available ones.

Using tags (etags or ctags [1]) lets you navigate easily to the file where some predicate is defined and back.

How to develop as a team: version control

Users of the CVS and SVN version control systems never run out of good things to say about the productivity gains they bring. These systems support managing development by a team of programmers while imposing consistency among them, and also maintain a log of changes made to the software. Documentation about both is available in the Lifeware doc repository.


  1. See here for a Prolog ctags definition.