[an error occurred while processing this directive]
This document contains extracts from the third year project "3D
Physical Modelling with Definitive Notations" by Andy MacDonald (1997,
chapter 2 and appendix A). It provides useful background information
about definitive notations and an introduction to the EDEN, DoNaLD and
SCOUT notations implemented in the tkeden
interpreter and so is republished here.
This chapter describes a relatively new approach to computer based modelling, which uses a different type of software tool: the definitive notation. Firstly, I shall discuss the general concepts of definitive systems and notation, followed by a description of the particular definitive notation that I have investigated for this project. The chapter concludes with a section on how these notations can be used for modelling purposes. Most of the material that follows has been gained from personal experience during the course of the project, since very little reference material is available.
Real world phenomena can be modelled on a computer by describing them as problems that can be solved using computer programs. These problems and methods of solving them need to be somehow represented in the programming language being used. Conventional languages (i.e. procedural languages) use variables and procedures to model the problem and have a set flow of control for solving it. The variables refer to storage locations that hold data about the objects being modelled and their values collectively represent the current state of the problem. Procedures describe what to do with the data stored in the variables and consist of sequences of instructions whose side effects change the state of the problem. The computer has to be told explicitly when to change certain variables and how to compute them. This puts added burden onto the programmer since he/she has to decide when it is appropriate to update certain variables and has to include instructions to do so in their program.
Imagine a model that contains a table and a lamp, which can be moved around by specifying their positions. To represent this you would need a program that contained, among others, variables holding the positions of the table and the lamp. To make the lamp sit at the centre of the table, then, the position of the lamp needs to be the current table position, plus an offset to the centre of the table. If you wanted to move the table to a new position then the position of the lamp also needs to be changed requiring a procedure to change the table and lamp positions. Now whenever the table is moved, the lamp is moved to the new table's centre. If you also want to move the lamp to any position on the table, the procedure needs to be changed to take into account the position of the lamp on the table. In addition, to get an up to date picture of this model, a redraw procedure needs to be called every time something is moved. All these dependencies need to be programmed in explicitly, which can become a complex task with many dependant objects and dependencies that are more complicated.
The above example shows that while procedural languages can be used for modelling (and are) they provide extra effort that could be better spent on the actual modelling itself. It would be more convenient to be able to provide abstract definitions of objects in term of other objects being modelled. The computer can then handle these definitions implicitly, making sure that they are always true. This is especially important during development of the model, when objects and the dependencies between them need to be frequently changed. Only the actual definitions need to be changed whereas, with procedural languages, entire sections of code may need to be altered, which takes time and is prone to errors. A system that can manage definitions between objects is called a definition-based, or definitive system.
Definitive systems consist of variables, which hold data representing different objects, and definitions, which describe the dependencies between objects. A definition usually defines a single target variable by an expression containing one or more source variables. The target is then said to depend on the source objects and the expression governs how the target should be computed from them. These definitions are handled by the system, which ensures that whenever a variable is changed, the variables that depend on it are re-evaluated to reflect this change.
The following illustrates a typical definition:
t = f( s1, s2, ..., sn )
where t is the target variable, s1 ...
sn are the source variables and f is an
expression involving the source variables. This definition states that
t depends on s1 ... sn and is
computed from these source variables using the expression
f. Should any of the source variables change then the
expression is recomputed automatically so that this definition will
always be true. Two examples of definitive principles in use are
spreadsheet software and the UNIX make command. A
spreadsheet consists of a grid of cells (objects) which can hold
values or formulae (definitions). The formulae state how the value of
a particular cell can be computed from other cells
automatically. Whenever any cells are changed, formulae referring to
that cell are recomputed and a display action is invoked by the
system. Using the same principles, the make command uses
a script of dependencies to maintain files. An example dependency is
as follows:
program.o: program.c
cc -c -o program.o program.c
This states that the file program.o depends on the
file program.c and provides a UNIX command (cc -c
...<) for updating program.o. This command is the
updating action of the definition and in this case, since it has been
given by the user, it is an explicit action. In the case of the
spreadsheet, the display depends on all the cells and the display
action is an implicit action since it is pre-defined by the system.
In order to use definitions in modelling something, a definitive system needs to be implemented. The implementation of a definitive system involves developing a dependency maintainer, which is not a trivial task. Therefore, it is much more sensible to use a general- purpose definitive system with a language to specify the definitions. A language, which interfaces with a definitive system, is called a definitive notation. In the next section, I will describe the definitive system that I have used in this project, and its definitive notations.
The advantages of using definitive notations are as follows:
The main disadvantages are:
tkeden - A Definitive SystemThe definitive system I have examined and used for my project is
called tkeden. It runs in the X Windows environment
(and now on Windows as well - ed) and provides a graphical
user interface through the Tcl/Tk interface scripting language. The
system consists of a dependency maintainer along with pre defined
dependencies and support routines, which provide the interface between
the definitive system and the X Windows environment. Definitions can
be supplied in three different definitive notations, EDEN, DoNaLD and
SCOUT, which are described in more detail below. tkeden
provides an interpreter for the EDEN notation, and definitions written
in the other two notations are converted to EDEN definitions before
being interpreted. This means that definitions in DoNaLD and SCOUT can
be manipulated in both their own notations and in EDEN itself.
Upon running, tkeden provides two windows, one for
input and the other containing the current display, along with a set
of menus. Definitions are given to the system either through initially
loaded scripts (given on the command line) or by directly entering
them into the input window. This means that existing definitive
programs can be run by tkeden and while running, the
definitions can be changed by typing new ones into the input
window. As well as giving a form of interaction with the program, this
also eases development of models through experimentation. Input is
given as blocks of definitions each preceeded by the keyword
%eden, %donald or %scout, which
indicates the notation that the following definitions are written in.
(More notations are present in later versions of
tkeden - ed). List of definitions and other
information useful for debugging purposes are available through menu
commands that allow you to examine the objects being modelled and the
dependencies between them. The display window shows the current state
of the model as graphical objects defined with DoNaLD and
SCOUT. Interaction is possible with the display window, providing
definitive programs with simple graphical user interfaces.
Since tkeden is an experimental language and not
commercially available, there is very little reference material for
it. EDEN has a reference manual, The EDEN
Handbook. Not much information is available for the other two
notations. This means that the best way of understanding these
notations is to examine other peoples' programs and experiment. I have
provided, in appendix A, a description of the syntax of these
notations and a short reference to tkeden in order to
help the unfamiliar understand any definitions I may give.
EDEN is a general-purpose language that supports the concept of definitions. Its name comes from the name of its original interpreter, EDEN - an Evaluator of DEfinitive Notations. The language is not a purely definitive one but a hybrid language combining definitive and procedural programming. Each statement in a typical program is either a procedural style statement or a definition. When a procedural statement is encountered by the interpreter it is executed and will have the effect of evaluating an expression, assigning a value to a variable or calling a procedure. On the other hand, when a definition is encountered, an equivalent definition is set up internally, which will then be evaluated by the interpreter whenever this is necessary. Since some of the statements are procedural, the order of execution depends on the statement order. This is not the case between definitions where the interpreter handles the order of execution.
The syntax of EDEN is similar to that of C, in fact it can be considered as a subset of C with some additions. Like procedural languages, it has variables that represent storage locations, and can be assigned values from expressions that can contain other variables, operators and functions. These variables are known as read/write variables (RWVs). Unlike C, RWVs do not need to be declared before being used and are allocated storage when they first appear in the program. The type of a variable does not need to be given and depends on the type of the value assigned to it. If a variable of a given type is assigned a value of a different type then the variable's type is changed to that of the new value. The following are types that are essentially the same as those in C:
123 (decimal), 0456
(octal), 0x1f (hexadecimal)
'a'
1.23, .23,
1.23e-15
&int_var (address of
int_var's storage location)
There are also some additional types as given below:
@. This represents undefined and is actually a
constant, which says that the variable does not have a type. A
variable will have this value if it is used before having a value
assigned to it.
s is a variable holding a string and
i is an integer, then the expression s[i] is
the i-th character of s. Also,
s# is the length of string s.
[ 100, 'a', "string",
[1,2,3] ]. If l is a variable holding a list,
then l[i] is the ith element of the list and
is of the same type as that element. The length of the list is given
by l# which is the number of elements in the list. In
addition, two lists can be concatenated e.g. [1, 2,3] //
[3,4,5] gives [1,2,3,4,5].
$. The
elements of this list can be given aliases by including a
para keyword at the start of the function body, followed
by a corresponding list of names. All local variables used in the
function also need to be declared at the beginning of the function
body. This is done by providing a list of them after an
auto keyword, although their types do not need to be
given. The other main difference is that if no return value is given,
@ will be returned. Function definitions are indicated by
the keyword func or proc (there is currently
no difference between the two - the alternatives are provided to
enable the modeller to show their intention), for example:
func max {
para m; /* m is alias of first argument */
auto i; /* local variable */
for (i = 2; i <= $#; i++) /* for each argument */
if ($[i] > m) m = $[i]; /* keep max */
return m; /* return max of all arguments */
};
This function can be called by, for example,
max(l,3,5,2) which will return 5.
EDEN has the usual arithmetic, relational and logical operators as
C for handling the numerical types (e.g. +,
-, <, >, =,
&&, ||). The main operators for handling the
other types were given in the type descriptions above. Values can be
assigned to expressions through assignment statements which are of the
form var = expression, var += expression
etc. Other statements in the language, used for control flow, are the
usual C if-else, while, for and
switch.
The language also has two additional statements for giving
definitions, formula definitions and action specifications. These
statements are not executed directly, but when encountered, set up the
definitions internally. These statements specify that a variable, or
action, depends on a set of source variables. Whenever a variable has
a value assigned to it, any definition that includes it as a source
variable is re-evaluated. If the target of the definition is a
variable then it is called a formula variable (FV). These hold data in
a different way to the normal RWVs. An FV consists of two parts; the
data register (DR) and the formula expression (FE). The data register
holds the actual value of the FV and can be used in expressions, but
not written to (its value is set by the interpreter when
appropriate). In addition, the expression used in the definition is
held by the formula expression and it indicates how to calculate the
DR from the source variables. The FE can be written to and hence the
formula can be redefined. A formula definition is similar in format
to a variable assignment statement except it provides an abstract
definition of the variable. The syntax is the same as that of an
assignment with the = symbol replaced by the keyword
is. A typical formula definition is given as follows:
t is f( s1, s2, ..., sn )
where t is the target formula variable,
s1 ... sn are the source variables and
f is a formula which may be an expression or a
function. This statement forms a definition, which says that the FV
t depends on the source variables sl
... sn. The source variables can be either RWVs or FVs,
and this definition is re-evaluated if one of the RWVs has a value
assigned to it or if one of the FVs is recalculated. While it was
stated that a formula definition is similar to an assignment, they are
not the same. Formula definitions can be considered to always be
true (target variable is equal to the expression) while, on the
other hand, an assignment statement can only be assumed to be true at
the point directly after its execution. As an example, if you
wanted to define the following: "C lies at the mid point of line
AB, where the end points A and B are defined independently",
then this would be given as the following formula definition: C
is (A + B)/2.
For more complex behaviour than is possible with formula definitions, EDEN provides action specifications. These provide a way of specifying explicit actions, which are called when certain objects change. Action specifications are procedures, which depend on a set of source variables. Their syntax is similar to that of a function but with a list of variables, called the dependency list, which the function depends on. This is a comma-separated list, preceded by a colon, between the function name and its body. An example of an action specification follows:
proc print : a, b, c {
writeln(a, " ", b, " ", c);
};
In this specification, the procedure, print, depends
on the variables a, b and c. If
any one of these variables changes then the procedure is called, and
the values of the three variables are printed out. These
specifications are useful when some procedural side effect is required
for the updating action of a definition. They are usually used when
various objects need to be changed by a definition, or for display
actions that need to be called when variables are updated. Action
specifications are usually called by the interpreter when all formula
definitions have been re-evaluated so that all variables have up to
date values.
The name DoNaLD stands for Definitive Notation for Line Drawing and this notation allows the definition of two-dimensional shapes. It is a definitive notation for line drawing because once a shape is defined it is displayed in the form of a line drawing (i.e. outlines for shapes like rectangles and circles). This notation allows the specification of graphical diagrams with objects being modelled as graphical objects on the screen. With this graphical output, the results of modelling a problem can be seen immediately and in a convenient form. The notation itself consists only of definitions of shapes, and it is therefore a pure definitive notation. This means that there are no directly executable or procedure like statements, and every statement is a definition. A script in this notation is therefore like a specification of the dependencies between different shapes and their elements.
A specification in the DoNaLD notation consists of variables and
definitions between variables. Each variable in this notation
represents a shape and holds data that describes that shape. The type
of the variable indicates the shape that it represents and, unlike
EDEN, each variables type needs to be declared before use. There are,
essentially, three categories of types in DoNaLD: numerical scalars,
primitive shapes and structured shapes. The numerical scalar types are
simply types that hold single values such as int (integer
values), real (floating point values), char
(representative values of characters) and bool (binary
values). These are not drawable objects, and so are not displayed, but
can be used to define the elements of other objects. The primitive
shapes are the actual drawable shapes of the notation and the type
indicates what sort of shape to draw. A variable of one of these types
holds values which indicate how the shape is to be drawn (the
attributes of the shape e.g. width, height etc.).
Below are three types in this category (there are more but these three give the idea):
{ and }. These two
values are either the two Cartesian co-ordinates separated by a comma
or, in polar form, the modulus (distance along vector to point)
followed by an @ followed by the angle (that the vector
makes with the x-axis). Two example points are {30, 50}
and {10@0, 3}.
[{10, 20}, {30, 40}] represents a line from
(10,20) to (30,40).
circle function that
takes as arguments a point and a scalar and returns a circle. For
example, circle({10, 20}, 5) gives a circle at point
(10,20) with radius 5.
Complex shape types can be built which are essentially data
structures holding many other shape types. They are constructed by
opening a new shape type with the openshape keyword and
then giving a block of definitions within that shape. Each of these
definitions creates a variable of a given type, which then becomes
part of the shape. An example of a shape is given as follows:
openshape cross # Opens new shape 'cross'
within cross { # Start of block defining cross
line l1, l2
l1 = [ {10, -10}, {-10, 10} ] # Defines two lines
l2 = [ {10, 10}, {-10,-10} ]
} # End of block
These shape types group together their element shapes, which are
drawn when displaying a shape variable. The individual elements can be
accessed through path names similar to directory paths on file
systems. For example, to reference the first of the line
variables in the above you would use cross/l1.
Definitions in DoNaLD are syntactically the same as assignments in
EDEN (i.e. using the = symbol). Every definition in a
DoNaLD script sets up a dependency between the variable being defined
and any other variable used to define it. Therefore, if the position
of one object is dependent on another, it is useful to define the
position of that object in term of the position of the other. This can
be shown by defining a rectangle as follows:
openshape rectangle
within rectangle {
int width, height
point pos
line top, bot, left, right
width = 10
height = 20
pos = {20,30}
top = [pos + {0, height}, pos + {width, height}]
right = [pos + {width, height}, pos + {width, 0}]
bot = [pos + {width, 0}, pos}]
left = [pos, pos + {0, height}]
}
This example shows a rectangle being defined in terms of its width, height and position. The four lines, which represent the rectangle, are in turn defined in terms of these values. If the width, height or position of the rectangle is changed then the lines also change and the rectangle is redrawn in its new position. As can be seen, DoNaLD provides a set of arithmetic operators for use in definitions but it also has other geometric operators and functions. For more information on the available operators, see the appendix.
DoNaLD can handle several viewports at once, where a viewport is
like a canvas upon which objects can be drawn. Variables are assigned
to a viewport depending on which viewport is selected at the time the
variable was defined. You can change to a different viewport by using
the viewport keyword followed by the name of the
viewport. The image in the viewport is dependent on those variables
assigned to it. If any of those variables change then the image in the
viewport is redrawn. This means that there are no explicit drawing
commands since the displays are handled implicitly. Also since several
viewports can be handled, only one script is needed to specify many
displays.
This is a definitive notation for SCreen layOUT, hence the name. It
is used for describing the geometry and layout of windows on a display
screen using definitions. Since it defines the layout of the screen,
it provides an interface between the output of other definitive
notations and the actual display. For instance, a window can display
the image from one of DoNaLD's viewports. In tkeden, this
notation indicates how windows containing images are placed in the
display window of the system. It is useful to split the display into
windows so that different types of images can be produced by different
notations and each image may have different attributes (such as
colours and co-ordinate systems). In addition, since different images
are displayed in different windows, it provides a means of detecting
interaction events (e.g. pressing a mouse or keyboard button) and
which window they occurred in. This allows the development of simple
graphical interfaces with each window handling its own interaction.
In a similar way to DoNaLD, this notation consists of variables, which are defined by expressions possibly containing other variables. All such definitions set up dependencies between the target and source variables, which makes this a pure definitive notation. SCOUT variables represent displays, windows and components of windows, and their values describe these abstract objects. The type of these variables indicate what kind of object they represent and although the variables do not need to be declared, they have to remain of the same type.
Of the types available, there are numerical types such as
integer and string, types for describing
window components such as point, box and
frame, and the actual window and
display types. The numerical types are similar to the
equivalent ones in EDEN and hold the same type of values. Points are
equivalent to the point type in DoNaLD and represent a
position on a display. They are defined in terms of screen
co-ordinates and can be constructed by giving the x and y co-ordinates
in between braces. A box represents a rectangular region
of the display and holds the position of the top left and bottom right
corners of the region. Variables of this type are constructed as a
list containing two points (which represent the two corners). A
frame represents multiple regions and is just a list of
boxes. Some examples of these types are given below:
{100, 200} and {x, y}
(x and y are integer variables)
[{10, 20}, {30, 40}] and [p, q]
(p and q are points)
([{1, 2}, {20, 30}], [box1, box2])
A window specifies a region in which data can be displayed in some manner. This type can be described by the four following concepts:
The following gives an example of text and donald window variables, and shows how they are specified.
TYPE: Text
REGION: Given by a frame variable
CONTENT: Given as a string variable, which is rendered into the image
ATTRIBUTES: Values that indicate the text and background colour; the
window's border size and type, and the alignment of text in the window
EXAMPLE:
window button = {
type: TEXT
frame: ([button_pos, 1, 8])
string: "STOP"
bgcolour: "red"
fgcolour: "black"
border: 2
relief: "raised"
alignment: CENTRE
sensitive: ON
}
This definition creates a text window, called button,
which is located at button_pos (a point
value), is one character high and 8 characters wide. The image
displayed in the window is the centred text, STOP,
displayed in black on a red background. The window also has a raised
border of width 2.
TYPE: Donald
REGION: Given by a box variable
CONTENT: Given by the name of a DoNaLD viewport. The objects in the
viewport are drawn in this window.
ATTRIBUTES: Values that give the co-ordinate system of the viewport;
the foreground and background colours, and the border size and type.
EXAMPLE:
window display = {
type: DONALD
box: [topleft, botright]
pict: "VIEW"
xmin: 0
xmax: 100
ymin: 0
ymax: 100
}
This definition creates a donald window as a rectangle whose top
left corner depends on the point topleft and bottom right
corner depends on botright. It displays a picture, which
consists of objects in the viewport called VIEW. In
addition, the co-ordinates are set up so that the x and y ranges of
the viewport, from 0 to 100, are mapped into the window.
A display variable holds an ordered list of windows and indicates
which windows are mapped to that display. If any two windows overlap
in a display then the window earlier on in the display list is drawn
over the other. Display variables are constructed from a list of
variables as <windowl / window2 / ... /
windown>. In tkeden there is a pre-declared
display variable called screen that represents the actual
display window. Any SCOUT windows, which are in this display list,
are drawn onto the screen.
Although this notation is primarily for providing a specification
of the display layout, it also provides arithmetic, vector and list
operators. In addition, it provides a means of dependency control
through the if condition then expression else expression
endif statement. This is not used as a flow of control
expression; instead, it allows a variable to be defined by different
expressions depending on a condition. An example of its use is:
string s_text = if s=1 then "s = 1" else "s <> 1" endif;
If the definition of a window includes the sensitive:
ON statement (note more sensitive: types are
now available - ed) then the window is sensitive to mouse and
keyboard actions. When a mouse button is pressed and its pointer is in
this window, an EDEN variable is created. This variable is given the
name of the window followed by _mouse or, if it is a text
window, _mouse_box-number. The variable is of type
list and holds five values, which are the button number,
type of action, state of keyboard and pointer x and y position in the
window co-ordinates. If a key was pressed instead of a mouse button
then the mouse part of the variable name is replaced by key and the
first value of the list represents the key that was pressed. Using
these automatic definitions, any interaction within a window can be
detected and so an interface can be designed using SCOUT and
DoNaLD.
Models of real world objects are best described by their state and
the relationships between individual objects. Therefore, when
describing these objects in a computer-based model, the emphasis is
upon defining each object's state in terms of other objects. These
definitive notations lend themselves well to this state based
modelling, since the programmer is free to give abstract definitions
of objects without worrying about the updating of each object's state.
In essence, the programmer only has to state what is being modelled
and not how to model it. In addition, because objects are
automatically updated in term of their definitions, a definitive model
can cope with an evolving environment. This allows a different
approach to development, which encompasses observation, interaction
and experiment. Modelling of this nature is generally known as
Empirical Modelling and is based on modelling state as directly
experienced rather than behaviour as circumscribed. The first stage
in any sort of modelling is to take the system being modelled and
break it up into individual objects. These objects then need to be
represented somehow by the data structures of the language being
used. It is best to use structures that are closest in concept to the
objects themselves since they lead to a simpler representation and
more natural interface between objects. The next step is to identify
relationships and interactions between objects. With definitive
notations, these can be represented as dependencies between the
variables that represent the objects. When using Empirical Modelling
techniques these steps form an iterative task where, at first, only a
small number of objects and dependencies are defined. Through
observation and experiment, new objects and dependencies can be added
which brings the model closer to the actual system being modelled.
When modelling with tkeden, objects can be specified in
any of the three notations. Since each notation has its distinct
flavour, it is best to represent the object in the notation most
suited for it. This means that graphical objects are best defined in
DoNaLD; conceptual objects in EDEN, and SCOUT is best used for
describing the presentation of objects and how the user can interact
with them. Since the built-in types of these notations are often too
primitive for the representation of objects, it is often necessary to
represent them as lists or other structured types. It is important
that the elements of these structured types are used to specify the
attributes of the object and are the most appropriate for doing so. In
addition, you should only represent relevant attributes and any
dependencies need to be exploited. This is so that there is no
repetitive or redundant information and all data is correct and up to
date. It is also a good idea in EDEN, to provide functions for
manipulating these types in order to make definitions as simple as
possible.
In these notations, definitions are used for specifying a relationship between objects. They provide a way of telling the model that this relationship should always hold. Dependencies are fundamental in this kind of model, but it is also useful to have conceptual objects that interact with or act on various objects. These are called actors (the term 'agent' seems to be in use more recently - ed) and can be created using EDEN action specifications. An actor then depends on the state of some object or objects and should the state change, the actor then performs its function. Since an actor is procedural in nature, it can perform any task that a user interacting with the system could. In particular they can change the value of one or more variables and even redefine dependencies between variables (this includes creating new dependencies). This gives the model the ability to redefine itself and makes this kind of modelling very powerful.
An important element of any real-time model is a clock. Without any form of clocking, all the definitions in a system would be evaluated and the system would wait until the state of any variable changes. To keep the state of the variables changing requires an actor to perform modifications to certain variables. This actor needs to be triggered and so requires a periodic change of variable on which it depends. An example of an EDEN action, which provides a periodically changing variable, is as follows:
proc clocking : clock {
todo("clock++;");
};
This action updates a variable called clock by
incrementing it. Since the action also depends on clock
then it will be called again and to prevent it being called instantly,
the increment statement is placed on the todo list. The
todo list is a list of statements that are to be executed
after all definitions and actions have been dealt with. Now any actor
that depends on clock will be executed periodically and the state of
the model will continuously evolve.
The text was originally written by Andy MacDonald in 1997, was provided as a handout for the EM MSc module, and was edited and made into this HTML form by Ashley Ward with help from Meurig Beynon in June 2001.
[an error occurred while processing this directive]