Lab 9 Tools for Empirical Modelling

The design and development of good tools for EM is still a topical focus for research. The primary emphasis so far has been on tools such as EDEN that have a hybrid character. To date, EM has largely involved textual representations based on definitive scripts. Functions and actions are specified using traditional procedural code, and the management of scripts exploits the traditional file system both to make models persistent and to manage their construction. This lab introduces additional resources that highlight three key issues relating to the future development of EM tools:

What techniques can be - or indeed need to be - used in order to use definitive scripts for building more elaborate models?
What are the limitations of modelling with definitive scripts and how might these be most effectively addressed by new tools?
What external research on programming and modelling can be helpful in critiquing and improving EM tools?

The extent to which you need to explore these resources depends on your choice of modelling study for your WEB-EM-9 submission, and on which two of the four optional questions you select on the Summer 2013 examination paper. One of the optional questions will be concerned with the current status and prospects for EM tools. Some notes and links to illustrative examples to guide your study of the three key issues are given below.

1. More advanced techniques to support modelling with definitive scripts

EDEN and JS-EDEN both support Empirical Modelling as a form of "modelling with definitive scripts". There is an interesting comparison between definitive scripts in EDEN, which are based on special-purpose definitive notations such as Donald, Scout and EDDI, and the scripts in JS-EDEN, in which observables are more closely associated with attributes of JScript objects. You can get a good perspective on some of the issues that arise when modelling with definitive scripts by reviewing the more 'advanced' features of EDEN, as set out here.

Where JS-EDEN is concerned, the prospects for exploiting in model-building the rich object models that can be created in JScript are as yet little explored. It is important to be aware that JScript variables are associated with contexts that are potentially helpful in organising JS-EDEN observables, but can also be a source of confusion. For instance, a JScript variable or function declared within a ${{ ... }}$ block as 'var x' or 'function f' in the EDEN Interpreter Window is not declared in the global context, as it would be if the same JScript code were entered via the JScript console. If you intend to declare global JScript variables from within the EDEN Interpreter Window, their definitions should not be preceded by a 'var' or 'function' annotation. A new function called declare_jse() has been introduced into the emile library by Nick Pope to make it much simpler to link JScript variables to JS-EDEN observables in flexible ways. This function is specified and documented in the file http://jseden.dcs.warwick.ac.uk/emile/library/declarevar.js-e and its use is illustrated in the associated testfile http://jseden.dcs.warwick.ac.uk/emile/tests/testdeclare.js-e. (Introducing testfiles of this nature is good practice when adding new functionality to the JS-EDEN interpreter.)
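The scoping point above can be illustrated in plain JScript, independently of JS-EDEN. The following is a minimal sketch that assumes, purely for illustration, that a block of code is evaluated inside a function body; runBlock and all the variable names are hypothetical, and nothing here is a claim about how the JS-EDEN interpreter is actually implemented.

```javascript
// Hypothetical stand-in for evaluating a ${{ ... }}$ block: the source text is
// wrapped in a (non-strict) function body, so 'var' and 'function'
// declarations are local to that one evaluation.
function runBlock(source) {
  return new Function(source)();
}

// Declared with 'var' / 'function': both names disappear when the block ends.
runBlock("var blockLocalVar = 1; function blockLocalFn() { return 2; }");

// Plain assignment, no 'var' or 'function': both names become globals.
runBlock("blockGlobalVar = 3; blockGlobalFn = function () { return 4; };");

// Record what is visible in the global context afterwards.
const visible = {
  blockLocalVar: typeof globalThis.blockLocalVar,
  blockLocalFn: typeof globalThis.blockLocalFn,
  blockGlobalVar: typeof globalThis.blockGlobalVar,
  blockGlobalFn: typeof globalThis.blockGlobalFn,
};
console.log(visible);
```

Under this assumption, only the names introduced by plain assignment survive into the global context, which matches the advice given above.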

For more insight into how JScript structures and JS-EDEN structures can be linked, consult the source and the Model Readme of Hui Zhu's 'graph prototype' model. This forms part of Hui's exploratory research into modelling parsing states using definitive scripts, which we expect to simplify the implementation of additional definitive notations in JS-EDEN in due course. The graph prototype replicates most of the functionality of the EDEN graph package introduced by Charlie Care, which was deployed in making an EDEN version of the DMT in a previous lab. For instructions on loading the model, enter the command include("models/graphPrototype/README.html"); via the EDEN Interpreter Window in JS-EDEN, and follow the instructions that you then find under the 'Model Readme' tab.

2. Addressing the limitations of modelling with definitive scripts

To date, the largest EDEN-like models comprise at most a few thousand definitions, which is very modest in comparison with the millions of lines of code in large software applications. An interesting feature of EM is that relatively small scripts can be exceptionally expressive, in the sense that they can support rich exploratory environments that contain just a few objects. This is consistent with the idea that EM can expose subtlety in interaction and interpretation in what appear to be the simplest situations. If EM is to support software development, it will be essential to be able to scale up models. The challenge that this presents to modelling with definitive scripts is apparent even when constructing models with a few hundred definitions. Problems include: generating large families of similar definitions, processing large symbol tables, finding identifier names in a flat name space, avoiding name clashes when integrating scripts from different models, recording the many versions of a script as it evolves, and finding an appropriate way to organise the final script into individual files. A systematic review of these and related problems can be found in Nick Pope's doctoral thesis: see section 3.3, with particular reference to pp. 51-56 and pp. 61-69.
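Two of the problems listed above, generating families of similar definitions and avoiding name clashes in a flat name space, can be made concrete with a small JScript sketch. The function definitionFamily, the prefix convention and the observable names are all hypothetical illustrations; the sketch only produces definition text of the form 'a is b + c;' and is not a feature of EDEN or JS-EDEN.

```javascript
// Build a family of similar definitions as text, one per index.
// 'prefix' acts as a crude namespace: two models using different prefixes
// cannot clash even though all observables share one flat name space.
function definitionFamily(prefix, count) {
  const lines = [];
  for (let i = 1; i <= count; i++) {
    // Each running total depends on the previous total and one new item.
    lines.push(
      `${prefix}_total${i} is ${prefix}_total${i - 1} + ${prefix}_item${i};`
    );
  }
  return lines.join("\n");
}

const script = definitionFamily("shopA", 3);
console.log(script);
```

Even this toy generator shows why the problem is awkward: the structure that relates the definitions lives in the generating code and is lost once the script has been flattened into text.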

Object-orientation is a paradigm that has been designed in part to tackle the problems of managing data on a large scale. The challenge of introducing object-like characteristics into EM tools is the subject of ongoing research by Pope, and is to some degree addressed by the Cadence interpreter prototype. The use of Cadence is illustrated in Lab 1 in CS405, as presented in 2009-10, 2010-11 and 2011-12. Cadence represents a much more radical approach to implementing EM principles than the hybrid EDEN interpreter. It is the basis for a 'dynamic structure base' concept that supports an entirely different kind of operating system much better suited to meeting the demands and exploiting the potential of EM - for instance, where handling persistence, concurrency and distribution are concerned. By way of illustration, the calculatorEvans2011 model contains approximately 500K Cadence observables. It can be executed by running ~empublic/bin/cadence in the ~empublic/teaching/ExampleModels directory and entering the following input into the Cadence Input Window:

%include "calculatorEvans2011/run.dasm"

An alternative approach to enhancing the power of modelling with definitive scripts is to develop environments in which to create and amplify structure within scripts. For some initial experiments in this direction, see the Model Readme for the JS-EDEN prototype 'definition factory' currently being developed by WMB. A basic form of the README, which is still under construction, can be accessed by entering the command include("models/defnfactory/README.html"); via the EDEN Interpreter Window. The definition factory prototype is part of a broader vision for a script environment for EM.

3. Related thinking

There are at least two ways in which software developments can reflect EM principles: by explicitly making use of dependency, and by emphasising the need to achieve a close integration between the machine semantics of software and its 'real-world' significance.

Modelling with dependency is quite well-represented in many practical applications. Guest lectures on CS405 from Antony Harfield (2009), Charlie Care (2010) and Karl King (2011) have highlighted different ways in which EM ideas and principles are relevant to software practice. There is a journal devoted to applications of spreadsheets in education. The open-source GeoGebra package is an excellent example of how dependency can be exploited in an educational environment that has dynamic geometry at its core.
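The kind of dependency found in spreadsheets can be sketched in a few lines of JScript. This is a deliberately simplified illustration: the Observables class and its method names are hypothetical, and defined values are recomputed each time they are read, whereas practical tools typically track dependencies and propagate changes.

```javascript
// A toy store of observables. Each observable is either assigned a plain
// value or defined by a formula over other observables.
class Observables {
  constructor() {
    this.values = new Map();
    this.formulas = new Map();
  }
  // Give an observable an explicit value (like typing a number into a cell).
  assign(name, value) {
    this.formulas.delete(name);
    this.values.set(name, value);
  }
  // Define an observable by a formula (like typing '=A1*B1' into a cell).
  define(name, formula) {
    this.values.delete(name);
    this.formulas.set(name, formula);
  }
  // Read an observable, evaluating its formula on demand if it has one.
  read(name) {
    const formula = this.formulas.get(name);
    return formula ? formula((n) => this.read(n)) : this.values.get(name);
  }
}

const sheet = new Observables();
sheet.assign("price", 10);
sheet.assign("quantity", 3);
sheet.define("total", (get) => get("price") * get("quantity"));

const before = sheet.read("total"); // 10 * 3
sheet.assign("price", 12);
const after = sheet.read("total"); // 12 * 3, with no explicit update of 'total'
console.log(before, after);
```

The value of total changes when price is reassigned, without any procedural code that updates it; this is the essential character of a definition as opposed to an assignment.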

Thought-provoking materials can be found via the 'Recent Output' link at Bret Victor's website: http://worrydream.com/. It is instructive to look first at the 'My Principle' section of Victor's presentation "Inventing on Principle", then at how Victor's work is referenced by John Resig of Khan Academy in his blog, and finally at Victor's reaction as documented in 'Learnable Programming'. Victor's outlook is particularly interesting in relation to the idea of 'software development as a lived experience' discussed in my Onward! 2012 essay. It extends the kind of association between machine and human semantics that David Harel and Rami Marelly are seeking in their Play-Engine approach to software specification.