WARNING: This text has been OCRd from the original paper and so will contain many typographical errors. It should be useful for searching, however. [Ash, August 2002]. Handbook iii~ ii~i~!~i~'iiiiii~!¸¸ ~!i~ilil Table of Content 1. Introduction 1 2. Fundamental Philosophy 4 2.1 Concept of EDEN Programming 4 2.1.1 Formula Definition 4 2.1.2 Action Specification 5 2.2 The EDEN System 5 2.2.1 Read/Write Variables 5 2.2.2 Formula Variables 6 2.2.3 Action Specification 7 2.2.4 Execution Strategy 8 3. EDEN Language Guide 10 3.1 Introduction 10 3.2 Lexical Conventions 10 3.3 Syntax notation 10 3.4 Comments 11 3.5 Identifiers (Names) 11 3.6 Keywords 11 3.7 Objects and Lvalues 11 3.7.1 Read/Write Variables 11 3.7.2 Function Definition 12 3.7.3 Formula Variable 12 3.7.4 Action Specification 12 3.8 Data Structures 12 3.8.1 @ (undefined) 13 3.8.2 Integer 13 3.8.3 Character 13 3.8.4 Floating point 14 3.8.5 String 14 3.8.6 Pointer 15 3.8.7 List 15 3.8.8 Function 16 3.9 Expressions 17 3.9.1 Lvalue 18 3.9.2 Primary expression 18 3.10 Operators 19 3.10.1 Arithmetic operators 19 ~. 1100.~ Sp~intge aod/irSattOpe rat°rs 119 , . • p ,,,o, .*.o** ,,,, ** ......................... 3.10.4 Relational operators 20 3.10.5 Equality operators 20 3.10.6 Logical operators 20 3.10.7 Conditional operator 21 3.10.8 Assignment operators 21 3.10.9 Precedence and order of evaluation 21 3.11 Procedural Statements 22 3.11.1 Expression Statements 22 3.11.2 3.11.3 3.11.4 3.11.5 3.11.6 3.1 3.1 3.1 3.1 3.1 3.1 3.1 3.1 Insert Statement 22 Append Statement 22 Delete Statement 22 Shift Statement 23 Compound Statement 23 1.7 Condition Statement 23 1.8 While Statement 23 1.9 Do Statement 23 1.10 For Statement 24 1.11 Switch Statement , 24 1.12 Break Statement 24 1.13 Continue Statement 25 1.14 Return Statement 25 3.11.15 Null Statement 25 3.12 User-defined Functions 25 3.13 Definitive Statements 26 3.13.1 Formula Definition 26 3.13.2 Action Specification 26 3.14 Miscellaneous Commands 27 3.14.1 Dependency Link 27 3.14.2 Query 27 3.15 Syntax Summary 28 3.15.1 Expressions 28 3.15.2 Statements 29 3.15.3 Function Definition 30 3.15.4 Action Specification 30 3.15.5 Dependency Link 30 3.15.6 Query Command 31 3.16 Pre-defined Variables 31 3.17 Pre-defined Functions 31 3.17.1 I/O Functions 32 3.17.2 Type Conversion Functions 32 3.17.3 String Functions 33 3.17.4 List Functions 33 3.17.5 Time Functions 34 3.17.6 Script Functions 34 3.17.7 Unix Functions 36 3.17.8 Predefined C Functions 37 3.17.9 Functions Defined in "math.h". 38 3.17.10 Miscellaneous Functions 38 4. Advanced Topics 39 4.1 Adding C library functions to EDEN interpreter 39 4.2 Programming Notes 42 4.2.1 Memory Allocation 42 4.2.2 Stack, Heap and Frame Stack 43 4.2.3 Autocalc 45 1. Introduction The purpose of a computer program is to describe methods of solving certain problems. It is necessary to represent a problem somehow in the program. Many (if not most) programming languages have variables and procedures. The variables refer to some storage spaces to hold data representing objects (e.g. the coordinates of an object). A procedure is a sequence of instructions telling the computer how to calculate the data stored in the variables. Due to the side effects of some instructions, the data can be converted into human readable forms, e.g. the write statement in Fortran can display data in a specified format. An interior designer might like to draw a picture of a room with some furniture inside it on the display screen. Suppose there is a program which allows a user to enter the coordinates of these objects through the keyboard. This program can redraw the objects if the user issues a "refresh" instruction, or more conveniently, redraw the objects automatically after new coordinates have been entered or existing coordinates changed. It is not difficult to write this kind of program in conventional procedural languages, such as Pascal or C. Some existing software, like MaeDraw or some CAD packages, let the user manipulate graphical objects. These packages do the job quite well since they have highly interactive user-interfaces. However the interior designer may require more than just refreshing the display of the objects on the screen; he may wish to specify some of their relative positions. For example, he may also like to put a lamp at the centre of the table no matter where it will be. That means the coordinates of the lamp are determined by the coordinates of the table. In this case, he has to calculate the new position of the lamp and redraw it after it has been moved. Since he may move the table several times to see which position is better, he will be required to repeat the same process -- calculate new coordinates and redraw the objects. Obviously it is a tiresome and error-prone job especially when the number of objects and definitions becomes large. Some software packages, such as MacDraw, allow the user to group several objects into a single large object. The user can then manipulate the object using the operations provided by the packages. The grouping of objects preserves the relative coordinates and attributes of the objects among the same group during the manipulations. Hence, if a table and a lamp are grouped together, translation of the table will also translate the lamp by the same amount. However grouping objects also puts extra restrictions on the objects. For instance, the enlargement of the table will also resize the lamp in the same ratio, and this may not be the intention of the user. The method of grouping objects provides very restricted transformations on objects. Complicated definitions of objects, such as "C lies at the mid-point of line AB, where endpoints A and B are defined independently "' -- (1 - 1 ) cannot be specified by this method. Thus the method of grouping objects can only be considered as a macro manipulation on a number of objects. Therefore it is much more convenient to allow the user to specify the abstract def'mitions of the objects in terms of other objects. For example, the point C described in (1-I) can be defined in terms of points A & B mathematically as below. C=-A+B 2 A system that can manage definitions is called a definitive system or detrmition-based system. The use of a defiuitive system to describe geometric relationships also has another advantage. Many CAD systems allow the user to specify parametric objects, i.e. generic objects whose size and position can be determined by giving parameters. In conventional systems, such objects are represented by small procedures comprising appropriate sequences of drawing instructions, but a method of specifying the relationships end between parts of the object explicitly is preferable. For example, contrast the two specifications of a square with given comer and size below. .... , ~'~ 7".r ~ ~ (2) nefi~tive x,y parametric shape square (x,y, size) I Isize size (i) Procedural moveto x, y lineto x+size, y lineto x+size, y-size lineto x, y-size ;lineto x, y point NE, SE, SW, NW line N, S, E, W N ~ [NW, NE] E ~ [NE, SE] S E [SE, SW] W m [SW, NW] NW m {x, y} NE m {x+size, y} SE m {x+size, y-size} SW m {x, y-size} In the first example, the size and position of the comer are fixed when the symbol is drawn. In the second example, the object specified by the definition: S -- square(X, Y, SIZE) would be re-defined if the value of the variable SIZE were to change. Unfortunately, conventional languages do not provide any built-in facility to help writing a definitive system. Thus the programmer must write a detailed definition manager to handle the definitions. A well-known example of a system based on definitive principles is the spreadsheet software which has recently become so popular in business and personal accounting applications. A spreadsheet is a program that provides us with a large grid of cells into which we can put in numbers and formulae. A cell that contains a formula causes the system to re-calculate the formula whenever any cell on which it depends is changed, and display the new value on the screen automatically. The display action is implicitly invoked by the system after the re-calculation. These actions can be called implicit actions because they are pre-defmed in the system. The make command in UNIX1 is a good example of file maintenance software. Unlike the spreadsheet software, the user has to specify the updating action in a file (the makefile) explicitly. These actions can be called explicit actions. For example, to express the fact that "object file.o is compiled from source file.e', the appropriate makefile is: file.o • file.c cc -c file.c The first line indicates the dependence and the second line contains a UNIX command showing how the object file can be compiled from the source file. The explicit actions allow less restricted updating actions to be defined. For example, the user can give different options to the compiler or use different compilers to compile the target object. However the user has to make sure that there are no destructive side-effects. In a definitive system, each definition has one and only one target object (such as a cell or an object file). Usually there is at least one source objects. The target object is said to "depend on" the source objects because it is computed from them only and whenever the values of the source objects change the target object must be re-computed. No cyclic definitions are allowed since recursively defined definitions lead to meaningless definitions, such as "n - I1 + 1". A spreadsheet program and the make command are driven by definitions. Although make is more like a dependency maintainer (it keeps track of objects by referencing the dependence, but does not refresh the target object), it combines representation of data dependencies with explicit actions resembling those needed to maintain definitions. Actually a definitive system can be implemented using a dependency maintainer with some built-in computing utilities. Internally, a system builds a dependency graph according to the definitions given and uses this information to determine which data has to be updated; from the formula of the definitions, it knows how to compute the targets. A definitive system saves us from remembering what has to be updated when certain data have been changed. It is especially useful in a continuously changing environment; for instance, a programmer would use make to compile those modified modules of a large project under development. The success of spreadsheet programs and make has illustrated the advantages of a definitive paradigm. Another advantage of a definitive system is regarding human-computer interaction. Some definitive systems can be programmed to echo the up-to-date data on the display as soon as certain data has been changed by the user. The user can know the result of the change immediately and then continue to modify the definitions. This interactive behaviour of the definitive system may contribute to applications, such as computer aided design (CAD) systems, that require intensive human interaction. Definitions are also good for specifying relationships between objects. The side-effects of updating actions are able to simulate the actions of objects responding to a change in the environment. An implementation of a definitive system is not a trivial task. As mentioned earlier, it may involve developing a complicated dependency maintainer. It is more sensible to make a general purpose definitive system and build a special purpose system front-end on top of it. Unfortunately no such systems are publicly available. An experimental language EDEN- an evaluator of definitive notations2 - was designed to try to fulfil this purpose. It was designed to be a general purpose language supporting definitions. Users can define their own procedures and functions using some C-like statements. An interpreter can be customized for different applications. The first prototype of such an interpreter was implemented in 1987. Since that time, another prototype of the same language has been implemented. The difference between these two versions is mainly concerned with the internal evaluation strategy, and should not concern the naive user. EDEN is a hybrid language that combines elements of traditional procedural and definitive languages3. It has both advantages and disadvantages of procedural and definitive languages. This user handbook is a guide to the EDEN programming language. This handbook consists of 4 chapters. Chapter 1 introduces the fundamental philosophy of the definitive paradigm and gives an overview of the EDEN language. Chapter 2 describes all features of the EDEN language. A programming tutorial is given in chapter 3; where there are examples illustrating the chief characteristics of EDEN programming. Chapter 4 deals with some advanced topics including the internal evaluation strategy of the interpreter. It is also concerned with customizing the interpreter. 2 Y W Yung, EDEN- an evaluator of definitive notations, MSc Thesis, Department of Computer Science, University of Warwick, 1989 3 Some people use the term "definition-based" languages --(2-1) 4 2. Fundamental Philosophy 2.1 Concept of EDEN Programming The important ingredients of the EDEN language are the formula definitions and action specifications. In this section, we explain these concepts. 2.1.1 Formula Definition A definition of an object is expressed as a mathematical expression. A variable is called a formula variable if its value is defined in terms of other variables, e.g: V --- ~(Vl, V2 Vn) where v is the name of the formula variable, and d) represent an expression with the variables Vl, v2 ..... Vn involved. The variables Vl, v2 Vn are called the source variables of this definition since the value of v is computed from their values. EDEN supports the concept of definition. A formula definition has the form: v Is (I)(Vl, V2 Vn) The keyword "is" defines a formula variable v whose value is computed from the values of source variables Vl, v2 Vn and the expression on the right hand side, denoted by ~, is the formula of the variable v. In other words, a formula describes how the value of a variable is computed from other data. These formulae are permanently valid (unless they are re-defined). That is, no matter what the values of source variables are, the value of variable v is always equal to tl)(v1, v2, ... , Vn). Thus a formula gives an abstract definition of a variable rather than the explicit value of it. This is the major difference between formula definitions and assignment statements. For example, after executing the assignment v = (I)(v1, V2 Vn) v is equal to ~(Vl, v2 Vn) only after the expression • is evaluated and before any of the values of the source variables is altered. Unlike the assignment statement, the formula definition, itself, is not an executable statement. When and how the formula expression defined in the definition is managed by the system. However, the values of the formula variables are guaranteed to be up-to-date when the variables are examined. Notice that a definition does not imply the inverse definitions (where these exist); for example, the definition (2-1) does not imply Vi IS ~-I(v, V1, V2 vi-1, Vi +1 , Vn) where 1 ~ i ~ n. A definition can be a representation of a real relationship among objects; for instance, "a lamp is placed (somewhere) on a table" can be formulated as: lamp-position is table-position + something A=I B=A 5 If the table is re-positioned the lamp will also be moved because the def'mition has specified the position of the lamp in terms of the position of the table, the system can automatically keep track of these definitions and recalculate lamp-position whenever table-position has been changed. But if the lamp-position is then redefined to lamp-position is desk-position + something we mean to put the lamp on the desk instead of the table. The table remains at the same position because it is not defined in terms of the lamp. Note that the new definition of lamp--position overwrites the previous definition. Hence the lamp can be moved independently by redefining its definition. 2.1.2 Action Specification So far, we have talked about formula definitions without mentioning the updating actions (e.g. renew the display of the content of a variable when it is changed) since there is no default implicit action associated with each definition. However, EDEN provides a way of defining explicit actions. Thus the user can specify special updating actions for different devices. In the rest of this handbook, the term "action" refers to "explicit action" because it is the only kind of action available in EDEN. The following statement illustrates a sample action definition. pro˘ display-v : v { display(v); } The keyword "proc" defines an action, named as display-v, which invoked by the system when the value of variable v, (specified after the colon) is changed. The curly brackets {} enclose a list of statements to be executed sequentially. In this case, there is only one procedure call. The procedure display is defined separately. By calling different functions (with appropriate side-effects, e.g. writeln(...) prints the values of its arguments on the standard output) in the function libraries, display can do different updating actions. This makes EDEN more extensible. Note that the value of v is not passed as a parameter to display--v when the action is invoked, but since it is a global variable, it can be accessed by the action directly. 2.2 The EDEN System The EDEN language has def'mitions, actions and some conventional features. The conventional features are, for example, iterative loops, flow control statements, user-defined functions and read/write variables. The variables can be classified into 4 types: read/write variables, functions, formula variables, and action specifications. EDEN has different statements to assign values to these variables. 2.2.1 Read/Write Variables Read/write variables (RWV) are the same as the variables of conventional languages. As the name implied, the value of a RWV can be assigned and examined by the user. For instance, will assign integer 1 to the RWV A. The value of a variable is referenced if the variable appears in an expression- e.g. the fight-hand side of the assignment statement. For example, puts the value of A into RWV B. Note that, in EDEN, a function is considered as a RWV (although there is a different instruction to define a function, and some restrictions on their definition). Functions will be discussed in the later chapters. fisa+b User l 2.2.2 Formula Variables A formula variable (FV) consists of a data register (DR) and a formula expression (FE) (or simply formula). The formula describes how the value of the data register (DR) can be computed from other variables (RWV or FV). The system will always update the DR using the formula whenever it is possible. This part of the system is called formula data maintainer (FDM). Figure 2-1 gives a block diagram of the formula maintaining subsystem. The value of a formula is equivalent to the value of the DR of that FV. The user can (re-)define the formula of a definition but not the value of the DR. This means that the DR is read-only by the user. In EDEN, the formula of a formula variable is defined using the "is" operator, for example: defines a FV f whose value is always equal to the sum of the values of variables a and b. The expression "a + b" is the FE of f. Note that all variables mentioned in the FE, in this case, a and b, can be either FV's or RWV's. The value ofa FV is referenced if the name of the FV appears in an expression, e.g. the right-hand side of an assignment statement. The syntax of a formula definition is given in the next chapter. Figure 2-1: The block diagram of a formula maintaining sub-system. The arrows show how data might flow. E.g. the user can define a FV by giving the formula to it. As shown in the diagram, the user can examine the value of a DR but not write to it. When the user (re)defines a Fk" or writes to a RWE, the FDM will re-calculate those DR's using their associated FE's. The FDM will read the values of some RWF's and~or DR's if it is necessary. The results will be assigned to the DR's by the FDM. Formula Variables (FV) Data Formula Registers Expression (DR) (FE) Formula Data Maintainer (FDM) Read/Write Variables (RWV) Usel 7 An action specification (AS) is a named sequence of instructions. This sequence of instructions will be invoked by the system whenever the values of any source variables, specified explicitly in a list, are changed. The term "changed" is causally defined. It may mean the value of a variable is different from the previous one, or the value of a variable is overwritten (by the user or by the system) though the value may be the same as the previous value. EDEN takes the latter definition. This definition of "changed" is used throughout this handbook unless otherwise specified. An example of an action definition is: pro˘ print-sum : a, b, c { writeln("a+b+c=", a+b+c); } This action, named as print-sum, will invoked by the system whenever the values of the source variables a, b or c (listed after the colon) are changed. The action will print a string and the sum of the values of these variables on the screen due to the side-effect of the pre-defined function writeln. The action execution manager (AEM) is responsible for invoking the action. This sub-system is called the action execution management sub-system. Figure 2-2 gives a block diagram of it. Figure 2-2: The block diagram of the action execution management sub-system. The arrows indicate the possible data flow in the system. Shaded arrows indicate the data flow due to the side-effects of action execution. From this diagram, we see that the actions are invoked by the AEM. AEM loads the procedure of the appropriate action from the AS and then executes it. The execution may cause a write to a (or some) RW~, or (re)define FV's or AS's (using some tricks). Of course, the action may also access I/0 devices (not shown in the diagram). I Formula Variables (FV) Data Formula I Registers Expression (DR) (FE) J (side-effects) Manager (AEM) Read/Write Variables (RWV) Action Specification (AS) 10 rather than 2.2.4 Execution Strategy EDEN is neither purely declarative (in which case the execution strategy does not affect the result) nor purely procedural (some side-effect are triggered rather than called explicitly). Therefore, the user needs to aware of the execution strategy in order to make the most out of EDEN. EDEN takes precedence of formula evaluation over actions. This means that EDEN will update all the value changes to the variables first. The triggered actions are queued in an action queue. After all the variables are updated, those actions queued will be performed. This will ensure that any reference to a variable will get the most up-to-date value. EDEN also uses a breath-in'st scheme of evaluation. Under such scheme, the interpreter spends more time on scheduling the order of evaluation than a depth-first scheme but it has an advantage of doing the evaluation of variable or the triggered actions only once in each phase. For example: proc dummy : A { /* dummy() will be executed when A changes */ auto i; /* local variable */ for (i = I; i <= i0; i++) { V = i; } } pro~ Print-V : V { write(V, ' '); } A = i; /* change A to trigger dummy() */ the interpreter will produces the output: 12345678910 While the depth-first scheme is good at monitoring the change of values, the breath-first scheme the current implementation using is more efficient in producing the final result. Should the later output is preferred, the user has to force the interpreter to execute the actions in the action queue by a eagerO statement. For the above example, the required result can be generated if dummy is alternatively defined as: p=OC dummy : A { /* dummy() will be executed when A changes */ auto i; /* local variable */ Ło= (i = i; i <= i0; i++) { V = i; eager(); /* force execution of action queue */ } } One should not presume the order of action execution. A general principle is that if the order of execution will significantly affect the result, it is likely that it is not a good program. However, some sort of ordering is necessary. A typical example is the simulation of a clocked system. In a clocked system, there should be actions such as: body */ clock++; 9 pz~˘ clocking4 clock++; } : clock { /* a clocking process */ /* advance the clock */ pr~ devicel: clock { /* action ... /* action body */ } synchronised by clock */ pzoc device2: clock { ... /* action } In this cases, the clocking action should be executed last so that it would not block out other actions to be performed in this clock cycle. To solve this scheduling problem, EDEN has a to-do list (in fact two to-do lists, one currently executing and the other one is really to be executed). The clocking action above should be rewritten as: proc clocking : clock { /* a clocking process */ todo("clock++;"); /* advance the clock in future */ } This will cause the statement: to be saved in the (future) to-do list. After all the triggerred actions and things-to-do in the current to-do list have been executed, the system will perform those statements in the future (now becomes the current) to-do list. In this case, when clock++ is executed, all other actions will be triggered and executed in this 'clock cycle' and another clock++ statement is scheduled to do after all the necessary actions are performed in this clock cycle. 4 The clocking mechanism cannot be replaced by a for/while loop because within a loop the EDEN interpreter will not accept any user input. So the user cannot interact with the simulation. 3. EDEN Language Guide 3.1 Introduction The EDEN language has a number of C-like statements and operators. C programmers may find it familiar but they are still advised to go through this chapter since there are new data types and operators. Not all C operators have been implemented. An EDEN program is a list of statements. Each statement can be one of the followings: formula definitions, action specifications, function (procedure) definitions and C-like procedural statements. program: statement program When a procedural statement is encountered, this statement will be executed. The effect of execution is to evaluate an expression, assign values to variables or call other functions/procedures (these must be defined earlier). The order of statements reflects the order of execution, and thus affects the results of computation. The user is responsible for arranging the statements in proper order to get the correct results. In addition, the EDEN interpreter will do the (re-)calculation of the formula definitions and/or invoke the procedures defined by the action specifications automatically. The order of calculation of formula definitions and execution of actions should not be of concern to the user, and is fully controlled by the interpreter. The dependency-link command is an alternative way of specifying actions. The query command inspects the current definitions of the objects. 3.2 Lexical Conventions There are five classes of tokens: identifiers, keywords, constants, operators, and other separators. Blanks, tabs, newlines, and comments (collectively, "white space") as described below are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent identifiers, keywords, and constants. If the input stream has been parsed into tokens up to a given character, the next token is taken to include the longest string of characters which could possibly constitute a token. 3.3 Syntax notation In the syntax notation used in this chapter, syntactic categories are indicate by italic times roman type, literal words and characters in courier type, and keywords in lmld couxier type. Alternative categories are listed on separate lines. An optional terminal or non-terminal symbo. 1 is indicated by the subscript "opt", so that [ expression-listopt ] indicates an optional expression list enclosed in square brackets. 11 3.4 Comments Comments are arbitrary strings of symbols placed between the delimiters/* and */. The whole sequence is equivalent to a white space. Note that/* * / can be nested. It is useful to comment a block of program with comments in it. Lines beginning with % (no even spaces or tabs before it) are also comment lines (See 3.18). /* This is a comment */ /* --- Begin of comment /* This is a nested comment */ --- End of comment */ % This is a single-line comment Comments are not parts of the executable program, but are used by the programmer as a documentation aid. 3.5 Identifiers (Names) An identifier (name) consists of a sequence of letters and digits. The first character must be a letter. The underscore character - is considered a letter. EDEN imposes no limit on the number of characters in a name, but the implementation of the EDEN interpreter does (about 255 characters). An EDEN keyword cannot be used as an identifier. Examples of identifiers: hello IF varl this-is-a-most-unusually-long-name foO bAr HorseSense var2 -auto- - Upper- and lowercase letters are distinct, so Count and count are different identifiers. An identifier is a symbolic name of a variable, function or other objects. 3.6 Keywords The following identifiers are reserved for use as keywords, and may not be used otherwise: append default func proc auto delete if return break do insert shift came else is switch continue for para while 3.7 Objectss and Lvalues An object is a manipulatable region of storage; an lvalue is an expression referring to an object. An obvious example of an lvalue expression is an identifier. There are operators which yield lvalues: for examples, if E is an expression of pointer type, then *E is an lvalue expression referring to the object to which E points. The name "lvalue" comes from the assignment expression E1 = E2 in which the left operand E1 must be an lvalue expression. Objects can be a read/write variable, a function (procedure), a formula variable or an action specification. 3.7.1 Read/Write Variables A read/write variable (RWV) is a named storage that holds a data. The data can be any one of the data structures described in §3.8. 5 The vocabulary is based upon the C reference manual. The term "object" is NOT used in the OOP sense. There is no need to declare a variable before it is used. Memory storage is allocated when the variable name first appears in a program. Data can be assigned to RWV's using assignment statements (see §3.10.11). For example, V= i; The identifier (name) v denotes a RWV whose value is assigned to integer 1 by the operator =. The semi-colon terminates the statement. The value of a RWV is referenced if its name appears in an expression. If the RWV has not been assigned a value before its value is referenced, the value @ (means "under'reed"; see §3.8.1) is assigned to it automatically. Different typed data can be assigned to the same variable at different time. The EDEN interpreter checks for data type clash at run-time. The interpreter assumes all objects are RWV's until they are defined to be functions, formula definitions or action specifications. 3.7.2 Function Definition A function definition is an object that stores an entry point (address) of a sequence of instructions (the function) which computes a result from the parameters. There are some functions pre-defined by the intepreter. The user can defined his/her own functions using some procedural statements. The value of a function definition is equivalent to the function (address of the instruction sequence). (See §3.8.8) Note that the interpreter allows an identifier designating a RWV to be re-used as a function name but not vice versa, because the intepreter assumes all objects are RWV's initially. 3.7.3 Formula Variable A formula variable is an object that stores a formula expression (unlike a RWV which stores a value). See §3.13.1 for the syntax of a formula def'mition. The value of a formula variable is equivalent to the current value of the formula expression that the variable stores. To optimize the computation speed, the formula variable also stores the up-to-date value of the formula expression. So, whenever, the value of the formula variable is referenced, this value is used. The interpreter is responsible for updating this value. An identifier designating a RWV can be re-used to designate a formula variable and vice versa. 3.7.4 Action Specification An action specification is an object that stores aa entry point (address) of an instruction sequence m the action procedure. See §3.13.2 for the syntax of an action specification. It also stores a list of objects on which the action depends. Whenever the values of these objects (usually are RWV's or formula variables) are changed the action procedure will be invoked by the system. Note that the value of an action specification is equivalent to the address of the action procedure. An identifier designating a RWV can be re-used as an action name but not vice versa. 3.8 Data Structures There are 8 different data types: @, integer, character, floating point, string, pointer, list, and function. Integer and character types are sometimes collectively called integral type. 'A' 'AB' 3.8.1 @ (undefined) The constant e (undefined)6 is a special value. It has a unique (un-narned) data type. Some operators, such as arithmetic operators, accept @ as their operands, and usually @ will be returned. Other operators which do not take @ operands will generate an error. See §3.10 for the discussion on operators. 3.8.2 Integer Decimal integer constants A decimal integer constant is a stream of digits with non-leading zero. E.g. 12 3 valid A12 3 invalid (begins with non-digit) 0 22 3 invalid (begins with zero but is a valid octal) Octal integer constants An octal integer constant is a stream of digits with a leading zero. The digits 8 and 9 have octal values 10 and 11 respectively. E.g. 0456 018 456 A456 valid (= 302 decimal) valid (= 020 octal = 16 decimal) invalid (but is a valid decimal) invalid (begins with non-digit) Hexadecimal integer constants A hexadecimal integer constant is a stream of hexadecimal digits starting with 0x (zero followed by the character x). Letters A to F have the hexadecimal values 10 to 15 respectively. Lowercase letters a to f are equivalent to their corresponding uppercase letters. E.g. OxAB Oxlf AB Olf valid (= 171 decimal) valid (= 31 decimal) invalid (begins with an alphabet letter) invalid (missing x) 3.8.3 Character A character constant is a character enclosed in single quotes, as in ' x'. The value of a character constant is the numerical value of the character in the machine's character set. E.g. valid (value is 65for the ASCII character set) invalid (more than one character) Certain special characters, such as the single quote ' and the backslash \, may be represented according to the following table of escape sequences: " " - the -L (bottom) of some algebras. Th,, bcha-,-io-,a- of g i~ o but not newline NL(LF) horizontal tab HT backspace . BS -. carriage return CR form feed FF backslash \ single quote double quote " null NUL bit pattern ddd \n' \t' \b' \r' kf, \\' \,, 11 , kO' ' \ddd' The escape \ddd consists of the backslash followed by a stream of octal digits which are taken to specify the value of the desired character. If the character following a backslash is not one of those specified, the backslash is ignored. 3.8.4 Floating point7 A floating point constant consists of an integer part, a decimal point, a fraction part, and an exponent part; where an exponent part is an e or E, and an optionally signed integer exponent. The integer and fraction parts both consist of a sequence of digits. Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e (E) and the exponent (not both) may be missingNote that a floating point constant may not contain any embedded blanks or special characters. Some floating point constants are: 1.23 .23 0.23 1. 1.0 1.2elO 1.23e-15 1,000.00 1 000.00 i000 .eO -3.14159 comma not allowed embedded space not allowed decimal point or exponential part needed integer part or fractional part needed this is a floating point expression, not a constant 3.8.5 String A string constant is a character sequence enclosed in double quotes: "this is a string" The backslash convention for representing special characters can also be used within a string. This makes it possible to represent the double quote and the escape character backslash \ within a string. It is not possible for a string to contain the \ 0 (NUL) character since it is used to represent the end of string internally. Therefore, "this is the end\Ohere we passed the end of string" isequivalentto "this is the end" If s denotes a string expression, and i an index (an integer expression) then s [ i] denotes the ith character in the string. (see also §3.9.1) 7 Floating point type was not implemented in the earliest version of the EDEN interpreter. then An individual character of a string can be accessed randomly by giving the index -- an integer expression enclosed in [ ] -- after the string expression. The first character is numbered 1 (not zero). For example, if S s[l] s[i] "abcdef" ; is the first character of s, i.e. ' a' is the ith character of s. It is illegal to index beyond the current number of characters in a string. So s [ 7 ] is an error since s has only six characters. The suffix operator # returns the current number of characters in the string. In the last example, s# is 6. For the empty string, " "# is 0. 3.8.6 Pointer Pointers are the addresses of objects in the memory space. The prefix operator & gets the address of an object. For example, iptr = & int-variable; cptr = & s [ 5 ] ; /* if s is a string, cptr points to the 5th char of s * / The prefix operator * refers to the object which the pointer is pointing to. For example, i = * iptr ; /* i = int-variable; */ c = * cptr ; /* c = s[5] ; */ There are no pointer constants, and no pointer operations except the pointer equality checks. 3.8.7 List The list is the only structured data type in EDEN. Data items of different types can be grouped to form a list using the list formation operator [ ] (see §3.10.3). Commas are used to separate the data. The whole list is considered as a single data item. For example, the list [ i00, 'a', "string", [1,2,3] ] holds four items: an integer (100), a character (' a' ), a string (" string") and a list ( [ 1,2,3 ] ). There are no list constants in the language. The characteristics of the list data type are: [3 Different typed data can be stored in the same list. For instance, L = [ i00, 'a', "string", [1,2,3] ]; L holds four items: an integer (100), a character ('a'), a string ("string") and a list ([i,2,3]). The individual items of a list can be accessed randomly by giving the index (an integer expression quoted by [ ] ) after the list. The first item is numbered 1 (not zero). (See also §3.9.1 .) For example, L [ 1 ] is the first item of L, i.e. 100 (in the previous example). L[4] [2] is2. [3 It is illegal to index beyond the current number of items of a list. So L [ 5 ] is an error since L has only four items. 3.8.8 produces List-related statements are: append, insert, delete and shift statements. The suffix operator # returns the current number of items in the list. In the last example, L# is 4, and L [ 4 ] # is 3. For the empty list, [ ] # is 0. Q The infix operator / / returns the concatenated list of two lists; for example, [1,2,3] // [5,6,7] [1,2,3,5,6,7] O 0 There are no list constants. Function EDEN allows the user to define functions (or procedures). The syntax of deflning a function is: (see §3.12 for the formal description) func identifier { para-aliasopt local-var-declarationopt statement-listopt } Notice that there is no parameter list after the identifier. The parameters form a list named $ (dollar sign). Hence $ [ 1 ] represents the first argument $ [2 ] represents the second argument etc. For convenience, $ In] can be replaced by Sn, where n is a decimal integer constant. The optionalpara-alias gives nicknames to the arguments. The syntax is: pax-, identifier-list ; 021 $ # is the current number of arguments. List operations are applicable to $. All parameters are passed by value. Local variables must be declared at the begi,,ing of the function body and are preceded by the keyword auto (means dynamically allocated). All local variables are RWV's. There is no need to declare the type of local variables because EDEN does the run-time type checking. The syntax is: auto identifier-list ; where identifier-list is a list of identifiers separated by commas. Q The return statement returns the value of the expression. The syntax is: ret~z~ expressionopt ; If the expression is omitted, 8 will be returned. Flowing off the end of a function is equivalent to return 8. Q The value returned can be ignored by the caller. In this case, the function acts as a procedure. The keyword func can be replaced by pro˘. There are no differences between these two keywords. Thus they can be interchanged. The purpose of having another keyword is to self- */ comment the program. It is intended (not restricted) that Łunc should be used for defining side-effect-t~ee operators. For example: lUnC max /* returns the max. value of its arguments */ { para m; /* m is the first argument $i auto i; /* declare local variables */ Łoz (i = 2; i <= $#; if ($[i] > m) m = $[i]~ } retttz~ m; i=i+l) { /* for the other arguments */ /* the ith argument */ /* returns max to the caller */ defines a function named max which returns the maximum value of its arguments. [~ To call a function, the arguments must be put in parentheses preceded by the function name. For example, MaxNumber = max(0,i,j,k) ; evaluates the maximum value of O, i, j and k, and stores the results in the RWV MaxNumber. The parentheses cannot be omitted even if no arguments are passed to the function. Q The function name, itself, denotes an entry point (a pointer) of the function code. It is a valid value (a pointer to a function) that can be used in an expression. Thus, if F = max; then the statement MaxNumber = F(I,3,2); assigns 3 to MaxNumber. Also a function can be put into a list. For example, if G = [ min, max ]; where min is supposed to be a function that returns the minimum value of its arguments, then Num = G[n] (i,3,2); Num will have the value 1 ifn is 1 (G [ 1 ] =min), 3 ifn is 2 (G [ 2 ] =max). Once an identifier is used as a name of a function/procedure/action, it cannot be re-used to designate a RWV/formula variable. This restriction is posed by the interpreter (not by the language) to prevent destroying a function definition accidently. There are some pre-defined functions, such as write, writeln, and exit. (see §3.17) 3.9 Expressions The precedence of expression operators is the same as the order of the major sub-sections of §3.10, highest precedence first. All integer overflows are ignored in the current implementation. Division by 0 causes a run-time error. lvalue: identifier $ $number lvalue [ expression ] * expression expression " ( lvalue ) Identifier were discussed in §3.5. The dollar sign $ is the actual argument list of a function; Snumber is the number-th argument where number is a decimal integer constant. A lvalue followed by an expression in square brackets is a lvalue. The intuitive meaning is that of a subscript. The lvalue must have type string or list. The expression must be of integer type. The unary * operator means indirection: the expression must be a pointer, the result is an lvalue. An expression enclosed in a pair " (open quotes) is a lvalue where the expression must be of string type. The object, referred to by it, is the object having the name identical to the string. For instance, " "k"" is equivalent to the object A. A parenthesized lvalue is a lvalue which is identical to the unadorned lvalue. 3.9.2 Primary expression primary-expression: lvalue ( expression ) primary-expression [ expression ] primary-expression ( expression-listopt ) A lvalue is an simple expression. The value of the object is returned. A parenthesized expression is a primary expression whose type and value are identical to those of the unadorned expression. A primary expression followed by an expression in square brackets is a primary expression. The intuitive meaning is that of subscript. The primary expression must has type string or list. See 3.8.1. A function call is a primary expression followed by parentheses containing a poss~ly empty, comma-separated list of expressions which constitute the actual arguments to the function. The primary expression must be of type function. See 3.7.8. In preparing for the call to a function, a copy is made of each actual parameter (even it is a string or a list); thus all argument-passing in EDEN is strictly by value. A function may change the values of its formal parameters, but these changes cannot affect the values of the actual parameters. On the other hand, it is possible to pass a pointer on the understanding that the function may change the value of the object to which the pointer points. The order of evaluation of arguments is undefined by the language. (Although the current implementation evaluates parameters from left to right, the user should not assume this fact.) 3.10.1 Arithmetic operators Expressions with binary operators group !eft-to-fight. The usual arithmetic conversions are performed. arithmetic-expression: expression + expression expression - expression expression * expression expression / expression expression % expression - expression The result of the + operator is the sum of the operands. The result of the - operator is the difference of the operands. The binary * operator indicates multiplication. The binary / operator indicates division. The binary % operator yields the remainder from the division of the first expression by the second. The two operands must be of integral type (i.e. integers or characters). The unary - operator is the negative of its operand. All arithmetic operators are strict, i.e. they return O if either operand is 0. If both operands of an arithmetic expression are of integral type, the result is always an integer. If any operand is a floating point, the result is a floating point (except for the % operator which takes integral operands only). 3.10.2 String and list operators string-expression: expression //expression list-formation-expression: expression / / expression [ expression-listopt ] expressmn-list: expressmn express~n,express~n-list mteger-expressmn: expressmn# If the operands of / / are of string or character type, the result will be a string which is the concatenation of the two operands. If the operaands are of list type, the result will be a list which is the concatenation of the two operands. If either or both of the operands is @, the result will be @. (see pre-defmed function substr and sublist) The [ ] operator groups the values of the expressions into a list. The expression list is a list of expressions separated by commas. If the expression list is omitted the list returns a null list. 3.10.3 Pointer operator pointer-expression: & lvalue The result of the unary & operator is a pointer to the object referred to by the lvalue. The relational operators group left-to-right, but this fact is not very useful; a ~pr~smn expr~smn <=expression expr~smn >=expression The operators < (less than), > (greater than), <= (less than or equal to) and >= (greater than or equal to) all yield 0 if the specified relation is false and 1 if it is true. The type of the result is integer. The usual arithmetic conversions are performed. Two strings may be compared. Single characters are compared from the left to the right according to their codes in the machine's character set. Note that a string is always terminated by \0, so the shortest string is considered the smaller. Slrings are equal only if their lengths as well as their contents are identical. It is an error if the two operands are of different types, but if either argument in an relational expression is 0 then the value is O. Lists and pointers cannot be compared using the relational operators. 3.10.5 Equality operators equa~-expr~s~n: ~pr~smn==expr~smn expr~smn!= expr~s~n The == (equal to) and != (not equal to) operators are exactly analogous to the relational operators except for their lower precedence. (Thus a [ identifier-list ] ; where each identifier in the identifier list refers to an action (procedure) that depends on the value of the object referred to by identifier. So the action: prOC p : a, b, c { /* action body */ } is equivalent to a~> [ p ] ; b~> [ p ] ; c~> [p] ; Note that the dependency link command adds dependency linkage to actions. If the user wants to remove a particular linkage, the whole action must be re-specified. For example, to remove c from the specification of action p above, p must be re-defined as: proc p : a, b { /* action body */ } 3.14.2 Query The query command has the form: ? L[I] ; The query command prints the definition of the object referred to by lvalue on the stdout of the UNIX environment. The format of printing a RWV is: value RWV-name ~> [ identifier-list ] ; The format of printing a formula variable is: formula-variable-name is expression ; formula-variable-name ~> [ identifier-list ] ; The format of printing a function definition is: Łtmc function-name { function-body } function-name ~> [ identifier-list ] ; where pro˘ may be printed instead of Łuac depends on which word was used by the user. The format of printing an action specification is: pr~ action-name : identifier-list { action-body } action-name ~> [ identifier-list ] where fun˘ may be printed instead ofpr˘~ depends on which word was used by the user. The query command is an ad hoc feature in the language and contains bugs in the current implementation of EDEN interpreters; for instance, prints the definition of L instead of L [ 1 ]. 3.15 Syntax Summary 3.15.1 Expressions expression: primary-expression - expression ! expression & lvalue expression # ++ lvalue lvalue ++ -- Ivalue lvalue -- [ expression-listopt ] expression binop expression expression ? expression : expression lvalue asgnop expression lvalue: b inop : asgnop.. primary-expression: Ivalue ( expression ) primary-expression [ expression ] primary-expression ( expression-listopt ) identifier $ $number lvalue [ expression ] * expression • expression " ( lvalue ) expression-list: expression expression, expression-list The primary-expression operators () [] " have highest priority and group left-to-fight The unary operators * & - ! not # ++ -- have priority below the primary operators but higher than any binary operator, and group fight-to-left. Binary priority decreasing as indicated below. * I % + - II > < >---- <= && and II or The conditional operator 7: has priority lower than the binary operators, and groups fight-to-le~ Assignment operators all have the same priority, and all group fight-to-left. 3.15.2 Statements statement: expression ; function-definition formula-definition action-specification dependency-link query-command compound-statement insert lvalue , expression-1 , expression-2 ; append lvalue , expression ; delete lvalue , expression ; mhŁŁt lvalueopt ; switch ( if ( expression ) statement if ( expression ) statement else statement while ( expression ) statement do statement while ( expression ) ; for ( expressionopt ; expressionopt ; expressionopt ) statement expression ) statement case constant : statement default : statement break ; continue ; return expressionopt ; compound-statement: { statement-listopt } statement-list: statement statement statement-list 3.15.3 Function Definition function-definition: function-declarator function-body function-declarator: func identifier proc identifier function-body: { para-aliasopt local-var-declopt statement-listopt } para-alias : para identifier-listopt ; local-var-decl: auto identifier-listopt ; identifier-list: identifier identifier, identifier-list 3.15.4 Action Specification action-specification: function-declarator dependency-list function-body dependency-list: : identifier-list 3.15.5 Dependency Link dependency-link: identifier-> [ identifier ] ; 3.15.6 Query Command query-command: ? lvalue ; 3.16 Pre-defined Variables The following is a list of pre-defmed read/write variables. The user should not alter the value of these variables, except autocalc. stdin stdout stderr These three variables are pre-defmed file pointers equivalent to those file pointers in C. All of them are of integer type. autocalc If the value of autocalc is set to 0, the mechanism of auto-recalculate of formula definitions will be switched off. In such a situation, formulae and actions will not be recalculated or invoked when the values of those variables on which they are dependent are changed. However, when autocalc is set to a non-zero value, the auto-recalculation mechanism will be back in action. By default, autocalc is set to 1. calc-list In the earliest version of EDEN interpreter, the pointers pointing to the formula variable and action specifications are queued in the list variable talc-list. The interpreter would not update or invoke those queued formula variables and actions when autocalc is switched "off". The following action specification can update these queued formulae and actions automatically when augocalc is on again: proc -autocalc : autocalc { while(calc-list#) { *calc-list[l]; /* sh~ft calc list; /* } } update formula var's */ remove from list */ Note that this action will only be invoked when the auto-calculate mechanism is on (i.e. autocalc is non-zero). This action forces the formula variables to be updated by simply evaluating them (the interpreter always updates the formula variables if they are evaluated). Although this action does not invoke queued actions, the updating of the formula variables will cause the related actions to be invoked. In the new version of EDEN interpreter, all queued formulae and actions would be updated and invoked as soon as the auto-recalculate mechanism is switched on. The references to the formula variables and action specifications are stored in an internal format and hence the variable calc-lisg is not needed. Thus the user cannot alter the list. To inspect the formula variable and action specification queues while autocalc is "off', the user can call the pre-defmed functions, formula-list() and action-list (), which transform the internal information into EDEN's list type. See the next section for the description of these two functions. 3.17 Pre-defined Functions A number of functions are pre-defmed by the EDEN interpreter. These pre-defmed functions cannot be redefined by the users to prevent ruining their definition accidently. The following is a list of pre-defmed functions. The output of the examples are in italic courier typeface. gives int (data) str (data) 3.17.1 I/0 Functions write (...) . .... writeln (...) write and writeln print the arguments on the stdout (standard output file), writeln appends a newline \n at the end. For example, for (i = i; i <= I0; 12345678910 i++) write(i, ' '); writeln ( "sum of gives sum of 1+2 = 3 ", i, '+', 2, " = ", 1 + 2); 3.17.2 Type Conversion Functions type (data) It returns the type of data as a string: Data Type String Returned @ "undefined" "@" integer "int" character "char" string "string" floating point "float" list "list" function "func" procedure "proc" pre-defined function "bui i t in" pre-binded C function "C- func" For example, type ( 1 ) gives "int" If data is an integer, the same value will be returned. If it is a character, the code of the character will be returned. If it is a string, the string will be assumed as a stream of digits, and these digits are converted into an integer which will be returned. If it is a floating point, it will be truncated into an integer before it is returned. If it is a pointer, the address is casted to an integer which will be returned. The function returns 0 otherwise. char ( data ) If data is a character, the same character will be returned. If it is an integer, the character having the code equals to the integer will be returned. If it is a string, its first character will be returned. If it is a floating point, it will be truncated into an integer and then cast to a character before it is returned. The function returns @ otherwise. If data is a string, the same string will be returned. If it is O, the string "@" will be returned. If it is a character, the character will be converted into a string and returned. If it is an integer or a floating point, the string of digits corresponding to the value will be returned. The function returns O otherwise. float (data) If data is a floating point, the same value will be returned. If it is an integer, the floating point having the same value will be returned. If it is a character, the floating point having the same value as the code of the character will be returned. If it is a string, the string will be assumed as a stream of digits, and these digits are converted into a floating point which will gives be returned. If it is a pointer, it will be casted into an integer and then converted to a floating point before it is returned. The function returns 0 otherwise. 3.17.3 String Functions substr ( string string, int from, int to) substr returns a substring of string, from and to are the starting and ending positions respectively; they are integer values. If from is greater than to, the null string will be returned, from must not be small than 1. If to is greater than the number of characters of string, spaces will be added. For example, substr("1234567890", 4, 8) will evaluated to "45 678 ". Examples Results substr("1234567890", 4, 8) "45678" substr("1234567890", 5, 3) "" substr("1234567890", 7, 12) "7890 strcat (string string .... ) strcat returns the string of the concatenation of its arguments. All arguments must be of string or character types. Characters are considered as a string of length 1. If it is called with no arguments, the null string will be returned. For example, s = strcat("Garden", " of ", "Eden", ' '); s has the value "Garden of Eden. " nameof (pointer pointer ) nameof returns the name of object to which the first argument (of pointer type) points. The function returns a string. This function is an ad hoe function. Bugs: If the pointer points to a character of a string object or an item of a list object, the string or the list object name is returned. For examples, Examples nameof(&0bject) nameof(&A-String[l]) nameof(&A-List[5]) Results "Object" "A-String" "A-List" 3.17.4 List Functions sublist (list list int from, int to) sublist returns a sublist of list. from and to are the starting and ending item positions respectively; they are integer values. If from is greater than to, the null list [] will be returned, from must not be small than 1. If to is greater than the number of items in list, @ will be added. For example, Examples Results sublist([l,2,3,4,5], 2, 4) [2,3,4] sublist([l,2,3,4,5], 5, 3) [] sublist([l,2,3,4,5], 4, 7) [4,5,@,@] 1 i stcat ( list list .... ) 1 is t cat returns the list of the concatenation of its arguments. For example, listcat([l,2,3], [4,5, 6], [7,8,9]) [1,2,3,4,5,6,7,8,9] time() ftime() array ( int n, data) array returns a list consists of n items that each is data. If data is omitted, @ will be assumed, data can have any type. For examgle, Examples Results array(3, O) [0, 0,0] array (4) [@, @, @, @] 3.17.5 Time Functions time returns the current time in seconds since Jan 1, 1970 ftime returns the current time in terms of the time elapsed since Jan 1, 1970. ftime is accurate up to milli-seconds. The return value is a list [second, milh'] where second = the number of seconds and milli = the number of milli-seconds in addition to the time elapsed since Jan 1, 1970. gettime() gettime returns the current time. The return value is a list of seven integers, the meaning of these integers are respectively: second 0-59 minute 0-59 hour 0-23 day of month 1-31 month of year 1-12 (different from that of gmtime(3)) year year- 1900 day of week 0-6 (0 = Sunday) 3.17.6 Script Functions apply (func function , list list) apply calls the function specified in the first argument with the second argument as its actual argument (i.e. $). Thus the first and second arguments must be of type function and list respectively. For example, apply(writeln, [1,2,3]); is equivalent to call writeln(l,2,3); execute (string string) execute executes a string as Eden statements. It returns 0 if no errors occurred in the execution of the string, non-zero otherwise. The execution terminates as soon as it encounters a syntax error or run-time error. Note that it is not a macro; the string must be some valid and complete EDEN statements. Also this function can be executed within a function body. For example, proc reset-F { execute("F is A+B;"); } Every time reset F is called, F will be defined to be a formula variable contains the expression "A+B" unless there are run-time errors, such as the circular definition conflict. eager() todo (string string) There is inside EDEN two todo lists, one current one and the other is a store for statements to be executed when the current todo list exhausted, todo is similar to execute except that the todo-statements will only be executed after the current thread of control terminates. For example, in one session you enter: todo("writeln(1);"); writeln("Hello world"); The result will be Hello world 1 include (stringfilename) include works in the same way as execute except it takes in a file instead of a string; the file is specified by the first argument which is a string expression. If the execution is successful, include returns 0. The execution terminates as soon as it encounters a syntax error or run-time error. The file name and line number where the error occurs will be reported by the interpreter. For example, include("utility"); executes the file called u t i i i ty. exit ( int status) exit terminates the program and returns status, an integer, to the parent process or operating system. If status is omitted, 0 is assumed. forget (string name) forget remove the entry of variable whose name is indicated by the string name. forget returns 0 if success, 1 if it can't find the variable that matches the name, 2 if it refuses to remove the entry because there are some definitions or actions depends on this variable. Note: it is very dangerous to remove a variable since there may be some references on the variable that cannot be detected (e.g. a reference within a function body). In such case, the system may produce tmpredicable result. (This function may be removed or replaced in future releases). eager eagerly evaluate/execute all queued defmifion~actions de~ite of the status autocalc. touch (pointer vpl, vp2, vp3 .... ) vpl are pointers to variables, touch puts the targets of variables, pointed to by vpl, vp2, vp3, to the evaluation queue. formula-list ( ) formula-list returns the current list of addresses of formula variables queued. This function is useful to inspect the internal queue of formulae when the autocalc is 0 (off). See §3.16 for the description of autocalc. This function is implemented in the later versions of the EDEN interpreter. action-list ( ) action-list returns the current list of addresses of action specifications queued. This function is useful to inspect the internal queue of actions when the autocalc is "off". See §3.16 for the description of autocalc. This function is implemented in the later versions of the EDEN interpreter. where name type and, symboltable ( ) symboltable returns the current internal symbol table of the interpreter as a list. Each symbol entry is of the form: , [ name, type, text, targets, sources ] text targets sources the object name (a string) a string indicates the type of the variable: "var", "formula", "proc ", "func", "builtin", "Real-func", "C-func". Note that these types are different from the output from type. if the object is a function definition, formula variable, or action specification, then text stores the definitions in textual form. a list of variable names (strings) by which this object is referenced directly. a list of variable names (strings) to which this object refers directly. For example, if the following definitions are deemed .°, aisb+ c; b = 3; f is b * g; s = symboltable() ) ; then, S is: [ [ "a", "formula", "b + c;" [], ["b" "c"] ], "" i I [ "b ...., var , , ["a"], [ ], ... ] This function is useful to inspect the internal queue of actions when the autocalc is "off'. See §3.16 for the description of autocalc. This function is implemented in the new versions of the EDEN interpreter. symbols (string type) symbols returns a list of symbol names which are of the type required. The type may be: "@", "int", "char", "string", "float", "list", "var" - any read/write variable (deemed by assignments), "formula" - any definitive variable (deemed by definitions), "func ", "proc" - user-deemed functions and procedures, "bui 1 t in" - EDEN functions or procedures, "Real- func" - Real C functions, "C-func" - other C functions, "any" - any symbols symboldetail (string name) symboldetail (pointer symbol) symboldetai 1 returns the information of a particular symbol. The information is in the same format as that of symboltable. 3.17.7 Unix Functions getenv (string env ) getenv returns the string of the environment variable env. See getenv(3). put env (string env) putenv sets the environment variable, env should have the form: putenv(3 ). "name=value". See error ( string err-msg ) error generates an EDEN error & print the error message err msg. error-no ( ) error-no returns the last system (not EDEN) error number (an integer). backffnd (string path, cmd, argl, arg2 .... ) backgnd executes a process at background named bypath, backgnd returns the process id (-1 if fail). Bug: backgnd may returns a +ve id even if it can't execute the command. pipe (stringpath, cmd, argl, arg2 .... ) pipe pipes output (stdout) to process named by path. pipe returns the process id (-1 if fail). Bug: pipe may returns a +ve id even if it can't execute the command. get msgq (rot key, flag) get-rasgq gets a message queue using key. flag denotes the permission and options (see msgget(2)). It returns the message queue id (integer), -1 if fail. r~ove-rasgq ( int msqid) remove-rasgq removes a message queue whose id is msg/d. It returns -1 if fail. (see msgctl(2)) send-msg (int msqid, [int msg-type, string msg-text ], int flag) send-msg sends a message (text string) to message queue msqid, msg-type denotes the message type. msg-text is a string (terminated by \0, i.e. at least having length 1. flag = 0 for wait 04000 (octal) for no wait (see msgsnd(2)). It returns -1 if fail. receive..-msg ( int msqid, msg--type, flag) receive-msg receives message of msg--type from message queue msqid, flag: c.f. send-msg and msgsnd(2). It returns O if fail, else [m--type, mtext] where mjype is the actual message type received and mJext is the text string received. 3.17.8 Predefined C Functions The following list of functions are pre-binded C functions. Users should refer to their own reference manual. Funcdon fclose fgetc fgets fopen fprintf fputc fscanf gets pclose popen putw setbuf sprintf sscanf Description close a file get a character from an opened file get a string from a file open a file print to a file print a charactoer to a file formated read data from a file get a string from stdin close a pipe open a pipe put a machine word to a file set the buffer size of a file print to a string formated read data from a string pack ( ... ) system ungetc srand rand execute command in a sub-shell unget a character random num~i~'g~tor~ random number generator 3.17.9 Functions Defined in "math.h" Math L~: sin cos tan asin acos atan atan2 sqrt pow log log2 logl0 exp exp 2 exp i 0 Different releases of the EDEN interpreter support different C function packages, such as CURSES (standard UNIX windowing package) and SunCore (standard SUN workstation graphics package). The user should consult to the release notes and the packages' reference manuals for the descriptions of the functions. 3.17.10 Miscellaneous Functions debug ( int status) If status is a non-zero integer, debug turns on the debugging mode. When the debugging mode is on the interpreter prints out some information showing the status of the machine. Default: off (0). For example, debug(l); /* turn on debugging mode */ It is a subroutine for developing the interpreter. It is available to the user only if the interpreter was compiled with the flag -DDEBUG. pack allocates a continuous memory space on the heap and stores the data in this memory block. The values are packed in a machine-dependent fashion, so the function is not portable. Data is packed as C type:- integer is considered as int (4 bytes) real float (4) character char (1) string/pointer/function/etc char * (4) pack returns the address of the beginning of memory block as an integer. Example: x = [0.0, i00.0, I00.0, 0.0]; /* x-coordinates */ y = [0.0, 0.0, i00.0, i00.0]; /* y-coordinates */ polygon-abs-2(pack(x), pack(y), 4); /* will draw a filled square */ /* in fact, you could do: polygon-abs-2(pack([0.0, i00.0, I00.0, 0.0]), pack([0.0, 0.0, i00.0, i00.0]), 4); */ move-abs-2(0.0, 0.0); polyline-abs-2(pack(x), pack(y), 4); /* will draw a hollow square */ /* c.f. SunCore Reference Manual */ Bugs: It is very ad hoc, and not universal, i.e. cannot handle all struct. It converts real numbers into "float"s instead of"double"s (so we can call Suncore polyline & polygon primitives). There should be an "unpack" function. o 4. Advanced Topics 4.1 Adding C library functions to EDEN interpreter Although the EDEN interpreter has a number of pre-defined functions, such as substr or writeln, and allows the user to define his own functions within the EDEN environment, the user may find restrictions in special purpose applications, such as a graphics application. The user may find some useful functions which are already written in C. Unfortunately, with the current implementation, there are no ways for the EDEN user to call C (or other languages) functions without recompiling the interpreter. Some efforts had been done to minimise the difficulties of interfacing with C functions so that the user can call a C function directly from the EDEN environment. A built-in EDEN/C interface can bind a C function to an EDEN name. However, due to some technical problems and the differences in the method of argument passing, not all C functions can be called by the EDEN interpreter through the EDEN/C interface. The limitations are: The EDEN/C interface assumes that all C functions return values of integer type. So, the C functions must return integers or characters since C automatically casts characters to integers. Functions which return pointers, including character pointers, are accepted only if the pointers are compatible with integers (i.e. they occupy the same amount of memory). Functions which return floating points or structures are not allowed by the current implementation. o The interface cannot handle macros (e.g. getchar0) unless they are rewritten as true C functions. Arguments of integer, character, string and pointer type can be passed to the C functions. If the argument is of string type, the character pointer which points to the first character of the string is passed. Arguments cannot contain values of list type. All arguments are cast to integers before they are passed to the C functions. If data types are integer-incompatible, i.e. require a different amount of storage, then they may cause problems. Since the compatibility of data types varies from compiler to compiler, the user should check it out themselves. o Only the fast 10 arguments can be passed. (floating point number counts two) Despite these limitations, the EDEN/C interface can handle a large number of existing C functions. Sometimes, the user can write a simple C function to "bridge" with a C function excluded by the limitations. For instance, the user can write a C function, e.g. BridgeFunc, to interface with the actual C function, e.g. ActualFunc, which requires a struetttre as its argument: struet point { int x; hit y; }; ActualFunc(p) struet point p; {...} BridgeFunc(x, y) int x, y; { struet point p; p.x = x; p.y = y; return ActualFunc(p); Then the bridge function is bound instead of the actual function. (3) (1) The user must edit the header fi~e,:~.:$~;~:~!:~i~l~eden/Misc/customlib:~ .h, to add the C function declarations which must be enclosed in the two lines: #if INCLUDE ~ 'H' #endif For example, #if INCLUDE ~ 'H' extern system0; extern fprintf0; extern char * fgets0; extern double sin(); extern double cosO; #endif (2) In the same file, customlib.h, add the binding information entry of the form: { "EDENname", (int(*)O)C-function--name }, where EDENname is a valid EDEN identifier and C.functionname is the C function name declared in (1). The two names can be different. "(int(*)0)" is used to cast the C function to an integer function. Don't forget the commas at the end. The binding information must enclosed in #if INCLUDE ~ 'T' #endif For example: #if INCLUDE == 'T' { "system", system }, { "fprintf', fprintf ), { "fgets", (int(*)0)fgets }, /* ... etc ... */ #endif If the function returns a double precision floating point, the binding information shall be enclosed in #if INCLUDE == 'R' #endif For example: #if INCLUDE ~ 'R' { "sin", sin }, { "COS", COS }, /* ... etc ... */ #endif Recompile EDEN following instructions in the file $PUBLIC/tkeden/INSTALL. Don't forget to add the links in Imakefile (or Makefile as appropriate) if the new functions added require links to object libraries other than what have already included. A set of macros is defined, at the beginning of customlib.h, to reduce the complexity of binding C functions to EDEN names. These macros are SameFunc, Function, SameReal, RealFunc, and SpecialF: SameFunc(Name) Bind a C function named as Name to the EDEN name identical to Name. The value returned by the C function must be of integer type. Function(Ename, Cname) Bind a C function named as Cname to the EDEN name Ename. The value returned by the C function must be of integer type. SpecialF ( Ename, Type, Cname) Bind a C function named as Cname to the EDEN name Ename. Type specifies the type of value returned by the C function. RealFunc( Ename, Cname) Bind a C function named as Cname to the EDEN name Ename. The C function must be a double precision floating point function. SameReal(Name) Bind a C function named as Name to the EDEN name identical to Name. The C function must be a double precision floating point function. These macros reduces the two-part information into a single macro. To specify a binding, just append the macros at the end of customlib.h. For example: /* customlib.h */ SameFunc(system) SameFunc(fprintf) SpecialF(fopen,FILE *,fopen) Function(getchar,my-getchar) RealFunc(sin,sin) SameReal(cos) ... Warning: No white space is allowed in the macros (except in Type), and no other punctuations (including commas and semi-colons, but excluding comments) can follow these macros. The listing of the macros is given in Listing 4-2. Note that some C preprocessors may not work properly with these macros. Listing 4-2: Macros definitions for binding C functions to EDEN names Note that some C preprocessors may not work properly with these macros #if INCLUDE == 'H' #define SameFunc(Name) extern Name(); #define Function(Ename,Cname) extern Cname(); #define SpecialF(Ename,Type,Cname) extern Type Cname(); #define RealFunc(Ename,Cname) extern double Cname(); #define SameReal(Name) extern double Name(); #endif #if INCLUDE == 'T' #define SameFunc(Name) {"Name",Name), #define Function(Ename,Cname) {"Ename",Cname), #define SpecialF(Ename,Type,Cname)("Ename",(int(*)())Cname), #define RealFunc(Ename,Cname) #define SameReal(Name) #endif #if INCLUDE == 'R' #define SameFunc (Name) #define Funct ion (Ename, Cname) ~: :i, #define SpecialF (Ename, Type, Cname) #define RealFunc (Ename, Cname) { "Ename" , Cname }, #define SameReal (Name) { "Name", Name}, # endi f 4.2 Programming Notes 4.2.1 Memory Allocation Memory allocation is automatically handled by the EDEN interpreter. The string and list data types are assigned by copy (i.e. each string has its own copy instead of sharing the same string through pointers as in C). For example, S 1 = "1234567890"; $2 = $1; $2 is now the string "1234567890". If we now do $2[1] = "X"; then $2 becomes "X234567890", but S 1 is still "1234567890". Also the EDEN interpreter allocates the exact amount of memory for the string. Reference beyond the string's or list's memory block (probably through C function caUs) may have unpredictable result. If we pass a string as an argument to an EDEN function, the string is copied, i.e. passed by value. But if we pass a string to a C function, only the address of the memory is passed, i.e. passed by reference. So, the user should be careful in calling C functions. =lll2131,1sl6171819101 01 A string is actually a ''"~ pointer pointing to a Change to ' X' when block ofmernory that S2 [1] = 'x'; contains the string. In EDEN, the assignment operator duplicates the content of the string instead of simply assigns the pointer to point to the string. For instance, if we want to use the C function gets to read in a string from stdin, we must first allocate some memory to a string variable before we call gets: F] 1°l'l'171'l l°l °l In C language, I O~1 s2 = sl; set the pointer ors2 to point to the content of $1 S = substr("", 1,255); gets(S); Stack stack .~ r pointer stack pointer Heap Here we call the function substr to assign 256 spaces (strings are terminated by an invisible NUL character) to S. Then we call gets to read a string into the memory allocated to S. By doing so, we not only allocate memory for gets to used, but also set the data type of S to the string type. 4.2.2 Stack, Heap and Frame Stack [ S = substr("", i, 255) creates a 256 char memory block (255 blanks and a NUL char) gets (S) passes the address of this memory block togots. gets fills the memory with text. The pseduo machine of the EDEN interpreter is a simple stack machine. A stack holds the values of computing. When the machine calls a function, the parameters are pushed on the top of the stack. Actually, the parameters are merged to form a single list datum. If the function requires any local variable, the machine allocates space for their values. When the function returns, the allocated local variables and the arguments stored on the stack are poped; then the value is returned by pushing it onto the stack. (See the figure on the right.) Since the stack is a fixed size memory block, a deep function call may cause the "stack overflow" error. iiiiiiiiiiiiiiiii!i!iiii Stack Before a function is called. Stack When calling a function, the arguments are pushed on the stack. Also each local variable consumes a space on stack. stack pointer lll iiii',iiiii',iii~;~ i Stack When the function returns, the arguments and local variables are poped and the answer is push on the stack. To reduce the frequency of copying the contents of strings and lists, we add another data structure, called the heap, to hold the contents of temporary strings or lists during the string/list operations. The overhead for allocating/deallocating memory on the heap is less than that of malloc/free. The heap is a large memory block. Memory is allocated within this block. A pointer keeps track the top of heap similar to the stack pointer of the stack. The pseduo machine instruction, freeheap, release all the memory allocated by previous computation. The heap is not affected by the return of function call. Freeheap instructions are automatically inserted by the interpreter at the points that it thinks safe to free memory. However, if a computation involves long strings, lists or too deep function called, the machine may not have the chance to free the memory and thus causes the "heap overflow" error. The user may have to modify their program to minimise the load of stack and heap to avoid memory overflow errors. For instance, an iterative function has less demand on stack and heap than its recursive equivalent. stack ptr top of heap temp. space used byŁ2 temp. space used by Ł1 do" In returning from a function call, the machine has to stored the previous machine status, e.g. stack pointer, (pseduo machine) program counter, number of local variables (since local variables were put on the stack), etc. A frame stack holds these information. When a function is called, the current machine status, a frame, is pushed on the frame stack. For example, iffl0 calls f20; the status of stack, heap and frame stack is shown in the figure on the fight. A frame is poped when a function returns. The machine status resume to the previous state. Stack Frame Heap The frame stack keeps track the stack and heap pointers./freeheap instruction (generated by the EDEN interpreter) causes the top-of-heap to return to the bottom pointed to by the top-most frame (e.gŁ2). On the other hand, the stack-pointer is lowered only if the function returns. Since the frame stack is implemented as a small stack, about 64 entries, the number of level of nested function call is 64 approx. This maximum limit is enough for general use, but if a function nested too deep, the machine will generate a "nested too deep" error. Avoid Memory Overflow Since the EDEN interpreter passes list '"oy value", i.e. copies the entire list on heap, it is very eay to overflow the heap if you pass long lists as parameters of a recursive function. To get round this problem, you may consider the use of iterative loops instead of recursive functions, or you may pass the "pointer" of that list to the function, and let the function to copy that list to its local variable. For example, instead of: func foo/* (list) */ { ... foo($1); °°° } func foo/* (list *) */ { auto list; list = *$1; /* copy the list to local variable */ list = ...;/* do the calculation here */ foo(&list); /* pass the pointer of that list */ ... } Also not that the interpreter will try to free the heap space at the end of each statement. Therefore, it would be helpful space-wise to break a long expression into smaller expressions and use temperary variables to hold intermediate computation results. For instance: a = expression; foo(a); uses less heap space than foo(expression); The machine uses malloc to allocate space for strings and lists rather than claiming space from the heap. So the use of strings and Hsts would not normally causes memory overflow. 4.2.3 Autocalc The autocalc variable controls the evaluation scheme of the EDEN interpreter. When it is 1 (default) or nonzero, the interpreter will try to evaluate all definitions (if they are evaluable, i.e. all their source variables are defined); otherwise (autocalc is zero) the interpreter puts the suspended definitions in a queue. Before the interpreter evaluates a definition, it tests whether all its source variables are defined, i.e. their UpToDate flags are true. However, the interpreter is not intelligent enough to identify variables referenced by a definition indirectly. For instance, time F { para N; return N + Z; } Y is F(1); Function F references the (global) variable Z, but the interpreter does not know about that. When Y is defined, the interpreter thinks that Y depends on F only, and proceeds to evaluate Y and thus evaluate F. However, F finds that Z is undefined. In this example, F returns the sum of Z and the first parameter which produces @ when Z is @. However, some operators and most of the statements produce an error when the expected data types are not met. Of course, if we defined Z before the introduction of Y, it may not cause an error. However, if we introduce Z after that of Y, we are in the risk. Hence, the result of computation seems to be sensitive to the order of definitions. Despite of this ordering problem, redefining Z would not cause Y to be updated. This may not be the intended behaviour sometimes. There are two ways to get around it: • Set autocalc to 0 before the introduction of all definitions. After all definitions are defined, set autocalc to 1 again. By doing so, the evaluation of definitions is delayed. For instance, autocalc = 0; /* ... other statements ... */ fune F { para N; return N + Z; } Y is F(1); Zis 10; /* ... other statements ... */ autocalc = 1; However, this method does not solve the re-evaluation problem. • Explicitly give the dependency of Z to F. For example, func F { para N; return N + Z; } proe touch-F : Z { touch(&F); } Y is F(1); Zis 10; Because now change of Z will induce a change to F, Y is indirectly dependent upon Z, and Y can't be evaluated until Z is defined. Also Y will be re-evaluated when Z is re-defined. This point is not true for the former solutiom ]a ~,[!~terpreter should generate such dependency for function specifications. Note that such attempt as: func F: Z { para N; return N + Z; } or rune F { para N; return N + Z; } z ~> IF] to impose dependency would not work because in both cases a change of Z will call the function F (as if it is an action). This is an error because a parameter is needed to calculate F. (RR124) Research report 12