# Definite Clause Grammars

## Not Just for Parsing Anymore

### Abstract

Definite Clause Grammars (DCG) have proved to be a

system to build parsing systems simply and

effectively. Unfortunately, DCG are so effective as a parser

building tool, that their other uses are too often ignored -- it

seems that DCG have been pigeon-holed into a parser building

niche. This paper demonstrates the common usage of DCG and then

expands these common usage patterns into areas other than

parsing, showing that DCG are effective means of increasing

programming efficiency while simultameously maintaining (or even

clarifying) declarative semantics.

### Introduction to DCG

Definite Clause Grammars (DCG) are syntactic extentions to

Prolog that allow a subset of *definite programs*

^{1} to

be written as Prolog programs. DCGs, and the definite programs

they represent, facilitate, among other things, an almost

unchanged translation of grammars defined in Bacchus-Naur Form

(BNF) into (parts of) Prolog programs -- DCG is a useful tool

for generating parsers and scanners.

Take, for example, a simple command-control language for,

e.g. a robot-arm,

^{2} issuing

commands as per the following:

`up down up up down`

An example representation of the above grammar in BNF for the

above language is as follows ...

::=

< move >< step > < move >::=

< move >< step >::=

< step >up::=

< step >down

... and the equivalent Prolog program using DCG is the

following:

move --> step, move.

move --> step.

step --> [up].

step --> [down].

This nearly direct translation is simply amazing, especially as

compared to other programming language families. It is

*inconceivable* in other programming languages to

construct a scanner for the above language so concisely in the

native representation. The other amazing facet, not discussed

here, is that the scanners and parsers scale with the grammar's

complexity *linearly* ... writing scanners and parsers in

other programming languages becomes much more difficult with

increasing complexity, because, usually, such scanners and

parsers increase in complexity *polymonially* or even

*exponentially* to the complexity of the grammar.

Unfortunately, perhaps, because the above scanner and

associated parsers are so facilitated by DCGs, the other uses

of DCGs are too often overlooked. Let's step back from the

application of DCGs and examine the essentials -- DCGs provide a

system: accepting some atoms and updating the state of the world

or rejecting others and backtracking to find a match, so DCGs

provide (in other words, guarantee) a limited subset symbols and

associated actions on those symbols.

Type systems can be viewed in a similar fasion: they accept

some inputs (compiling and executing on accepted instantiated

values) and reject others (causing a compilation failure). The

advantages of typeful systems has been covered exhaustively in

the literature, and programming systems developed recently

accept programming with types as a matter of course. Viewing

DCGs in a similar light of types can convey the advantages of

types in a programming language without types.

One of the oft-trumpeted advantages of programming with

(static) types is that programs with type declarations execute

faster than equivalent programs without types (or with dynamic

typing). Why? Types form a duality: those things acceptable

to perform computations, and those other things that are not to

be used in the computation. Given that duality, the system can

perform the computation in the safety that all the values are

valid (in that they conform to the formal type). Under a

dynamic typing system, the system must first check the type of

each value, ensuring that the value's type is acceptable, before

it can perform any computation. In a statically typed system,

every check is compiled away (verified) before execution

occurs.

Definite Clause Grammars can provide the same service that

static typing does. In the usual case, a scanning/parsing

system receives a language (set of data) that (we hope) conforms

to the grammar. However, we can build a system where we control

the grammar definition AND the input data set. A good example

to examine where this approach yields effective results is in

the domain of generate-and-test problem-solvers.

### SEND + MORE = MONEY

#### (Cryptarithmic problem solvers)

Cryptarithmic problem solvers take a trivially encrypted

mathematical problem and determine what that problem

represents. The most general case is that a symbol stands for

any digit at each position, e.g.:

6 . .

x # . .

-------------

# . .

# . . .

+ # 5 . 5

-------------

# . 5 . 4 .

where # stands for any digit other than 0, and . stands for any digit

The above problem solved in the usual manner would take the

generate-and-test approach:

digit(0). digit(1). digit(2). digit(3). digit(4).

digit(5). digit(6). digit(7). digit(8). digit(9).

first_digit(X) :- digit(X), X > 0.

And would provide ways to construct representations of

(composite) numbers and translations to and from these

representations and the "actual" numbers they represent (so, for

example, the system can use the standard operators to perform

arithmetic):

as_number(Digits, Number) :- as_number_aux(Digits, 0, Number).

% an auxilary pred to make the above interface pred tail recursive^{3}

as_number_aux([], Num, Num).

as_number_aux([Digit|Digits], Num, Result) :-

NewSum is Digit * 10 * Num,

as_number_aux(Digits, NewSum, Result).

as_list(0, []).

as_list(N, [Digit|Rest]) :-

N > 0,

Digit is N mod 10,

Remainder is N // 10,

as_list(Remainder, Rest).

Then, using the above, one converts the problem as written into

the equivalent representations and asks the system to find the

solution (note that the above list representation puts numbers

in reverse order: units are the first element of the list):

dots([SixNum, Multiplier, Ones, Tens, Hundreds, Solution]) :-

digit(A), digit(B),

as_number([B, A, 6], SixNum),

digit(C), digit(D), first_digit(E),

as_number([C, D, E], Multiplier),

Ones is SixNum * C, Ones < 1000,

Tens is 10 * D * SixNum, Tens < 100000,

Fivers is SixNum * E, as_list(Fivers, [5, _, 5, _]),

Hundreds is Fivers * 100,

Solution is Ones + Tens + Hundreds,

as_list(Solution, [_, 4, _, 5, _, _]).

A Pentium Windows system using SWI Prolog

solves this problem

in 0.79 seconds, or 476,919 inferences in 0.79 seconds (602827

Lips). Pretty impressive, and, for this

general type of problem there's no need to improve on this

generic generate-and-test methodology.

What happens with a cryptarithmic problem with more stringent

constraints, such as uniqueness? To solve the equation ...

SEND

+ MORE

------

MONEY

... one could use the generate-and-test approach, and enforce

that each letter represents an unique digit:

send_more_money([Send, More, Money]) :-

first_digit(S), first_digit(M),

digit(E), digit(N), digit(D), digit(O), digit(R), digit(Y),

all_different([S, E, N, D, M, O, R, Y], []),

Send is S * 1000 + E * 100 + N * 10 + D,

More is M * 1000 + O * 100 + R * 10 + E,

Money is M * 10000 + O * 1000 + N * 100 + E * 10 + Y,

Money is Send + More.

all_different([], _).

all_different([Num|Rest], Diffs) :-

not(member(Num, Diffs)),

all_different(Rest, [Num|Diffs]).

And such an approach, as above, is simple to implement, and

easy to understand, but the runtime cost is enormous -- it took

1,571,755,688 inferences in 1817.34 seconds (864865 Lips) to

find the solution. Instead, let us

reexamine the problem constraint (uniqueness) and use it to

provide information to the problem solver so as to facilitate

its effort, and at the same time retain the clarity of the

declarative syntax.

The uniqueness constraint tells us that each letter must

represent an unique digit. This can be described inductively:

- S can be any first_digit, or [1..9];

- E can be any digit except the digit captured by S,

or [0..9] - [S];

- N can be any digit except the digits captured by S and E,

or [0..9] - [S, E];

- ... etc.

Looking at the above inductive description of the unique

constraints shows a consumer pattern very much like how a DCG

system consumes symbols from a list. So, we'll implement just

such a system; in this particular case the symbols of the list

are digits. As we rewrite the system to use DCG to replace the

generate-and-test predicates, the `digit/1`

predicate

becomes a definite predicate that consumes a digit from the

(input) digit list and the `first_digit/1`

predicate

becomes definite, as well:

takeout(X, [X|Y], Y).

takeout(X, [H|T], [H|R]) :- takeout(X, T, R).

digit(X) --> takeout(X).^{4}

first_digit(X) --> digit(X), { X > 0 }.

Then the solution predicate changes in that it now operates

under the structure of DCG with the input list being the

enumerated ten digits (and we also include an interface

predicate so that the user is not burdened with providing the

digits):

send_money_quickly([Send, More, Money]) :-

do_send_moola([Send, More, Money], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], _).

do_send_moola([Send, More, Money]) -->

first_digit(S), first_digit(M),

digit(E), digit(N), digit(D), digit(O), digit(R), digit(Y),

{ Send is (S * 1000) + (E * 100) + (N * 10) + D,

More is (M * 1000) + (O * 100) + (R * 10) + E,

Money is (M * 10000) + (O * 1000) + (N * 100) + (E * 10) + Y,

Money is Send + More }.

Did you notice that the above code now has *no uniqueness check*? By translating the code into a definite

predicate, we

*guarantee*that each logical variable

holds an unique digit at unification! What is occuring here, by

using DCG to assign unique digits is equivalent to creating a

type for each logical variable, and each successive type is more

restrictive than its predecessor. The code (algorithm) changed

very little structurally, but what does this DCG-as-types

modification buy us?

Quite a bit. The reduction in cost at runtime is astounding!

This new system solves the equation using only 8,215,467

inferences in 36.91 seconds (222562 Lips). This is nearly

*fifty times* faster than the fully nondeterministic

version presented first. By iteratively restricting the search

space with DCG we have simulated a type system that allows the

system to find a solution much more efficiently *without sacrificing the clarity of the declarative semantics of the problem description*.

### Summary for DCGs as Types

The above example demonstrates when one needs to search a

restricted solution space, DCG can provide runtime benefits over

pure nondeterminism without sacrificing clarity of code (as so

many other "optimization" techniques do). The above example

also demonstrates that if the search space becomes more

restrictive in a predictable way over the duration of the

search, DCG can provide enormous runtime benefits.

### DCG Implementation of Dynamic Programming

DCGs are effective parser generators and are commonly used as

such, we've also seen that DCGs can simulate types and reap the

benefits of static type systems in a dynamically typed

environment. DCGs have other uses as well. One such use is to

simulate *dynamic programming*,

^{5} which

computes solutions from an iterative bottom-up approach instead

of (usual for Prolog and other languages that rely on recursion)

the recursive top-down approach.

Dynamic programming shines where the solution must be

approached incrementally from a known-good set of states,

usually these are situations were a declarative top-down

description of the solution leads to a combinatorial explosion

of attempted solutions that can overwhelm any computational

resource. An excellent example of such a problem is computing

the nth Fibonacci number.

### Only some Fibonacci numbers declaratively

Below is the formula to find the nth Fibonacci number:

fib n | n == 0 = 1

| n == 1 = 1

| otherwise = fib (n-1) + fib (n-2)

Or, as translated (naively) into Prolog:

naive_fib(0, 1).

naive_fib(1, 1).

naive_fib(X, Y) :-

X > 1,

Idx1 is X - 1,

Idx2 is X - 2,

naive_fib(Idx1, A),

naive_fib(Idx2, B),

Y is A + B.

This is all very well and good, but, unfortunately, the above

specification is exponentially recursive ... not only does it

take a long time to find a solution, but it can only find the

first few decades of solutions before running out of available

computational resources (memory). A query of

`naive_fib(20, X)`

takes 65,671 inferences in 4.54

seconds (14476 Lips) to find the

solution, but a call for the 30th

Fibonacci number causes an "out of local stack" error.

The problem with the top-down approach is that it attempts

to find the solution by finding the two previous (unknown)

solutions, until it locates a known answer. Unfortunately, the

bottom is too far down to be reached with the resources

available, and the problem does not make conversion into a

tail-recursive solution practicable.

### Any Fibonacci, quickly, with DCGs

DCGs solve this problem by simulating dynamic programming

... approach the solution from a set of known states, and keep

building from that known state until the system finds the

solution. This approach has the additional advantage that since

it is iterative, it eliminates the exponential recursive

explosion that the previous declarative approach suffered.

The idea is use a list^{6 } to *memoize*

the result set as the solution space grows, until the system

reaches the desired solution;

^{7} that is

where DCGs assist us in maintaining a declarative description

while controlling resources. Declaratively, then, if the

requested Fibonacci number is at the head of the solution list,

then return it:

fibonacci(X, [X|_]) --> [].

Otherwise, if we're at then end of the list and still have not

found the solution, push the next two Fibonacci numbers onto the

result set and keep searching:

fibonacci(X, [A, B]) -->

next_fib(X, A, B, Fib1),

next_fib(X, B, Fib1, Fib2),

fibonacci(X, [Fib1, Fib2]).

Finally, if we're in the middle of the result set and have not

found the solution, keep searching:

fibonacci(X, [A, B|Rest]) -->

succ_push(X, A),

fibonacci(X, [B|Rest]).^{8}

The DCG housekeeping methods are what one would expect:

% Ensures the current fib is not the one we need, generates a new

% one, and pushes the current one onto the result set.

next_fib(X, fib(Idx1, Num1), fib(Idx2, Num2), fib(Idx3, Num3)) -->

succ_push(X, fib(Idx1, Num1)),

{ Idx3 is Idx2 + 1,

Num3 is Num1 + Num2 }.

% add a definite branch to gt that pushes the compared-to element

% onto the end of the list (transformed into a difference list)

succ_push(A, B, Old, New) :-

gt(A, B),

concat([Old|T1] - T1, [B|T2] - T2, New).^{9}

As before, we retain the declarative nature using DCGs to solve

the problem, but we also obtain the benefit of much more

powerful computational ability with much less runtime cost,

^{10} to wit:

quick_fib(1450, X) finds the

solution using 12,327 inferences in

0.02 seconds (615464 Lips) -- out-of-reach and impossibly fast

for the top-down standard predicate.^{12 }

### Conclusion

Definite Clause Grammars (DCGs) have long been known to be an

effective tool to generate scanners and parsers. In this

article, we explore alternative uses for DCGs, such as emulating

static type system and implementing dynamic programming system.

We have seen that, unlike other "optimization" techniques, DCGs

maintain or even improve the clarity of declarative

specifications while at the same time providing the benefits

of these other systems. These benefits are manifested as

improvements by orders of magnitude in both computational power

and the speed at which the runtime achieves desired

solutions.

## Endnotes

^{ 1} | Definite programs and Prolog programs using DCGto model definite programs is discussed in detail in [DM93]. |

^{ 2} | This example is developed in [Bratko01], chapter 21, with two alternative BNF representations offered. |

^{ 3} | A naive version of as_number([], 0). ... but this is not tail recursive, an so may not benefit from |

^{ 4} | We use `takeout/3` instead of`delete/3` because `delete/3` is notreversible, eliminating the desired backtracking to explore alternate solutions. The definition, use, and a wonderful explanation of `takeout/3` come from[Fisher99], § 2.7. |

^{ 5} | Dynamic programming is discussed in detail in[RL99], which provides an implementation of a generic dynamic programming algorithm in the Haskell programming language. |

^{ 6} | Actually, [RL99] uses a dictionary, not a list, as the data structure to memoize the known state, but the difference list approach was so effective that it was unnecessary to use a different data structure. |

^{ 7} | As each Fibonacci number is computed from the previoustwo Fibonacci numbers, I found it most convenientto use (specifically) a difference as the data structure to memoize thelist results. |

^{ 8} | We have two iteration branches for this predicate. The first adds two more Fibonacci numbers when we've come to the final two Fibonacci numbers in the result set; the second continues iteration because there are more than two remaining Fibonacci numbers. As we need two Fibonacci numbers to compute the next Fibonacci number, these two branches guarantee that we perform that computation when we reach the last two numbers,and not before. This approach forms the basis of this dynamic programming system. |

^{ 9} | The implementation of `concat/3` is straightfrom [Bratko01], § 8.5.3. ( `concat(A1-Z1, Z1-Z2, A1-Z2)` |

^{ 10} | Not only that, but this solution also provides the history of all the Fibonacci numbers prior to the requested one in a difference list by calling `fibonacci/4` directly; ^{11}the traditional top-down solution provides no such history. |

^{ 11} | The resulting difference list can be converted into simple_list([]) --> []. |

^{ 12} | An implementation using monads in Haskell for the fibonacci computer is available at logicaltypes.blogspot.com |

## Works Consulted

[Bratko01] | Prolog Programming for Artificial Intelligence, 3rd Ivan Bratko, Pearson Education Limited, Essex,ed. England, 2001. |

[DM93] | A Grammatical View of Logic Programming, PierreDeransart and Jan Mal/uszyn'ski, MIT Press, Cambridge, Massachusetts, 1993. |

[Fisher99] | `prolog :- tutorial` , J. R. Fisher,http://www.csupomona.edu/~jrfisher/www/prolog_tutorial/contents.html, 1999. |

[RL99] | Algorithms: A Functional Programming Approach,Fethi Rabhi and Guy Lapalme, Addison-Wesley, Essex, England, 1999 |

## Definitions

backtrack(ing): | exploring alternate solutions by attempting new values before the point of failure. |

BNF: | acchusBaurNorm is a system to describeFgrammars of (programming) languages. |

DCG: | efiniteDlauseCrammar(s)G |

difference list: | Difference lists are list that allow access to elements within the list directly; they avoid the need to "cdr down" (a Lisp term mean to walk the list from its first element to the desired element) the list to reach the desired elements. |

Lips: | ogicalLnferencesIerPecondS |

memoize: | To memoize a result is to keep that result locally active so that it is not necessary to recompute it. |

## Solutions to Problems

The dots problem has the following

(only) solution:

645

x 721

-----

645

12900

+ 451500

--------

465045

The solution to `SEND + MORE =`

is 9567 + 1085 = 10652.

MONEY

The 20th Fibonacci number is

10946 (computed from `fibonacci(fib(20, X), [fib(1,1), fib(0,1)], A, B)`

).

The 1450th Fibonacci number is

7.79078e+302; the 1500th Fibonacci number causes a

floating-point overflow error.

Last modified: Mon Aug 09 16:15:25 Eastern Daylight Time 2004

Copyright © 2003, 2004, Cotillion Group, Inc. All rights reserved.