Rudolf Albrecht
Space Telescope European Coordinating Facility,
ESO, Karl Schwarzschild Str. 2,
D-86748 Garching, Germany
Notwithstanding occasional indications to the contrary, software development in astronomy cannot be an end in itself. As considerable resources are being invested in software, we must make sure that these investments are justified by demonstrating that the software we generate supports the astronomical research process.
While the same is true for any hardware development in astronomy, it is usually much easier to tie hardware developments to the astronomical research process: hardware normally supports a discrete step in this process. Telescopes, for instance, support the data acquisition step. Software, on the other hand, supports many different steps of the research process, and it is at times difficult to define the dependencies properly.
To assess correctly how the software development efforts support the astronomical research process, it is necessary to define this process clearly. It is interesting that this process, which today is accepted more or less unquestioned by most scientists, at least in the natural sciences, has only recently been defined in epistemological terms. The model of the research process developed by Sir Karl Popper (1972) comes closest to what most natural scientists do when they ``do science''; it also has the advantage of firmly rooting this process in the evolutionary history of the human species (Popper 1973), thus explaining the need to discover as a genetic disposition. Whether this will continue to be so is a question that can legitimately be asked (Albrecht 1987).
The research process starts with the input of signals, either through sensory perception or through measuring devices that register signals too faint for, or not suited to, our senses. We know this step as data acquisition.
The next step is the transformation of the input data into meaningful values, quite often literally the ``data reduction'' from a jumble of instrument dependent individual measurements to a much smaller, coherent and consistent set of parameters. We note that historically this has been the step into which most of our software development efforts have been invested.
By injecting concepts into the collection of parameters we construct models. Concepts range from the very simple, such as a linear correlation, to the very complex, like evaporating black holes. The injection of concepts happens spontaneously and associatively; it is a result of the evolution of our brain, which allowed us to introduce order into the world and to function in it. The methods are rather simple: they are similarity and analogy.
Models come in two flavors, hypotheses and theories, the difference being that a hypothesis is an as yet unsubstantiated and incomplete theory. Given that no theory is ever complete, it is more correct to say that all models are hypotheses. This agrees with the historical observation that even ``wrong'' models served well as good hypotheses in a heuristic sense.
Good models allow us to make predictions about future observations. They also allow us to add to our pool of concepts by abstraction and generalization. If a model conflicts with observations we have to discard it. Since we can never be certain that any model will forever withstand the test of future observations, Popper concludes that in science we can never demonstrably attain the ``truth''.
Asked where in this process the most progress has been made historically, we tend to think that it has been in the first step: the introduction of ever more powerful telescopes and detectors, and the opening of more spectral windows, have allowed us quite literally to include observations of the whole universe in the building of models. I would contend, however, that the most progress has been made in the application of concepts: the scientific revolution during the Enlightenment removed concepts like the supernatural, magic, and the subjective from our model-building tools, which indeed provided us with the very basis of what we today call scientific thinking.
Looking at the contributed papers presented at this meeting and mapping them into the astronomical research process as outlined above (as far as this is possible; several contributions span more than one of these steps) produces some expected and some unexpected results.
Data acquisition, i.e., areas like telescope and instrument control, signal processing, and planning and scheduling, is represented by 11 papers. Classical data analysis, including analysis systems, archives, and surveys, is represented by 12 papers. This clearly is in line with the historical development.
Moving from data and parameters to models we employ taxonomy and classification. Two papers were devoted to this. Model building itself requires visualization, simulation, and, ultimately, the building and maintenance of knowledge bases; three papers addressed at least some of these aspects.
The generation of concepts and the injection of concepts into bodies of processable data have not been addressed at all, which is hardly unexpected but somewhat unsatisfactory. While it is attractive to argue that this is the domain of human ingenuity and capacity for insight, indeed the reason that makes science so satisfying for the practicing scientist, it behooves us to remember the enormous progress that the introduction of discipline into this step during the Enlightenment allowed us to make. It is more than likely that machine-based or machine-assisted (i.e., very fast, de-individualized, and interdisciplinary) concept generation and concept matching can produce enormous advances.
On the other hand, we see the beginnings of software that assists us in the processing of models: there were four papers, plus a birds-of-a-feather session, on electronic publishing.
What has electronic publishing to do with the processing of scientific models? This is not clear even to the most avant-garde protagonists of electronic publishing. However, the explanation is relatively simple. It is derived from the answer to the question: what is the representation form of an astronomical model? In one way or another all astronomical models, or parts of models, are represented through descriptions, which we commonly call publications.
These descriptions have enormous shortcomings. Sure, we have made considerable progress, having done away with irrelevant pseudo-ancillaries like preambles praising the Lord and the emperor. And at least in astronomy we have converged on one main representation language which we call scientific English, the quality of which differs considerably between scientists, limiting their ability to convey, as an author, or to internalize, as a reader, a scientific model.
Electronic publishing has introduced the need for clearly defined terminology and for standards of style and representation, thus making human and, ultimately, machine-assisted model processing possible. Extrapolating this trend, we will arrive at model representations that are much better suited to what models really are: models are really knowledge bases, and the ultimate model representation will be through an appropriate knowledge base description mechanism, ideally a meta language that can be mapped into different representations. These representations can be optimized for machine-assisted model processing, such as the combination of models (ideally across disciplines) and the checking of conflicts between, and of inconsistencies within, models. Or, for that matter, the different representations can be different natural languages, even extending to different styles (terse or verbose) or to different depths (textbook or popular).
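The idea of one underlying model description mapped into several representations, and checked mechanically against other models, can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of any existing system: the `Assertion` and `Model` structures, the `render` and `conflicts` functions, the 5% tolerance, and the example values (which are made up and not real measurements).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Assertion:
    """One machine-readable statement of a model (hypothetical structure)."""
    subject: str    # what the statement is about
    quantity: str   # which property is asserted
    value: float    # asserted numerical value
    unit: str       # unit of the value


@dataclass
class Model:
    """A model viewed as a tiny knowledge base: a named list of assertions."""
    name: str
    assertions: List[Assertion]


def render(model: Model, style: str = "terse") -> str:
    """Map the same knowledge base into a terse or a verbose representation."""
    lines = []
    for a in model.assertions:
        if style == "terse":
            lines.append(f"{a.subject}.{a.quantity} = {a.value} {a.unit}")
        else:
            lines.append(
                f"According to {model.name}, the {a.quantity} of "
                f"{a.subject} is {a.value} {a.unit}."
            )
    return "\n".join(lines)


def conflicts(m1: Model, m2: Model,
              rel_tol: float = 0.05) -> List[Tuple[Assertion, Assertion]]:
    """Return pairs of assertions about the same quantity that disagree
    by more than rel_tol (an arbitrary, illustrative threshold)."""
    found = []
    for a in m1.assertions:
        for b in m2.assertions:
            same = (a.subject, a.quantity, a.unit) == (b.subject, b.quantity, b.unit)
            scale = max(abs(a.value), abs(b.value))
            if same and abs(a.value - b.value) > rel_tol * scale:
                found.append((a, b))
    return found


# Made-up example values, for illustration only.
model_a = Model("Model A", [
    Assertion("cluster X", "distance", 10.0, "kpc"),
    Assertion("cluster X", "age", 12.0, "Gyr"),
])
model_b = Model("Model B", [
    Assertion("cluster X", "distance", 12.0, "kpc"),
    Assertion("cluster X", "age", 12.0, "Gyr"),
])

print(render(model_a, "terse"))
print(render(model_a, "verbose"))
for a, b in conflicts(model_a, model_b):
    print(f"conflict on {a.subject}.{a.quantity}: {a.value} vs {b.value} {a.unit}")
```

In this toy form the two models agree on the age but disagree on the distance, so only the distance is reported as a conflict. A realistic description mechanism would need a far richer vocabulary than numeric assertions, but the separation between the stored content and its rendered forms is the point being illustrated.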
Electronic publishing is going to happen, with or without regard to the above considerations: the slow and ponderous procedures associated with the production of paper-based publications are reason enough for the introduction of electronic techniques. Past first steps to use computers in this process have been unsatisfactory and are quickly becoming anachronistic (like the LaTeX procedures for the generation of this manuscript).
The pioneers of electronic publishing should bear in mind that the ultimate purpose of electronically published material will not just be that it is easy and quick to produce, disseminate, search, find, reference, etc. Already electronic publications are taking on shapes that differ between readers: the inclusion of hyperlinks allows different audiences to view the material in totally different ways. The same paper on, for instance, galaxy formation in the early universe might take one reader, via the embedded hyperlinks, into cosmology, and another reader into stellar evolution. While this undoubtedly constitutes progress, the main limitation, as demonstrated above, is that in the end the information is encoded in English prose, which prevents fast, de-individualized, and interdisciplinary electronic processing. Ways to remove this obstacle need to be investigated.
The author wants to express his thanks to P. Boyce and S. Schaller with whom he had extensive discussions on the subject.
Popper, K. 1972, The Logic of Scientific Discovery (Hutchinson)
Popper, K. 1973, Objective Knowledge: An Evolutionary Approach (Oxford University Press)