

Astronomical Data Analysis Software and Systems V
ASP Conference Series, Vol. 101, 1996
George H. Jacoby and Jeannette Barnes, eds.

Design-Led Software Strategy for the 2dF Survey Spectrograph

Keith Taylor, Jeremy Bailey

Anglo-Australian Observatory, PO Box 296, Epping, NSW 2121, AUSTRALIA

Tim Wilkins

Department of Physics, University of Durham, Science Laboratories, South Road, Durham DH1 3LE, UNITED KINGDOM

Keith Shortridge, Karl Glazebrook

Anglo-Australian Observatory

Abstract:

We describe the novel approach used for the software developed for the 2dF survey spectrograph. 2dF is a 400-fiber multi-object spectrograph designed for the Anglo-Australian Telescope. One of its primary missions is a survey of 250,000 galaxy redshifts within two years; the requirement driving the software design is on-line, real-time data reduction that delivers a finished product. To achieve this the software makes extensive use of prior knowledge of the optical design of the spectrograph to minimize the number of free parameters involved.

1. Introduction: The 2dF

The 2dF (Taylor 1994) is a fiber spectroscopic system designed to exploit the wide (2° diameter) field now available on the 4m Anglo-Australian Telescope. It can handle 400 fibers, feeding two spectrographs simultaneously; moreover it is double-buffered, i.e., there is a second set of 400 fibers so that the next field can be configured automatically while the current field is being observed. The system is clearly designed for large survey projects; the primary one is a survey of 250,000 galaxies complete to B = 19.5 through a large swathe of the southern sky. This will take several years and the data rate is very large: about 15 multifiber images of 1024 x 1024 pixels per night, containing a total of 6000 spectra (not including calibration frames).

2. Software Design

The goal is to achieve fully automated data reduction within the typical 30 minute time frame of the next exposure, with none of the usual compromises associated with on-line data reduction (i.e., the data should be correctly debiased, flatfielded, optimally extracted, etc.). We wish the final product to be as good as, or better than, what is usually achieved with careful off-line reduction of fiber data. This goal is made easier to achieve by building our prior knowledge of the design and optical model of the spectrograph into the data reduction system (2dFDRS).

For example, consider the problem of optimal extraction of the fiber spectra from a multifiber image. The conventional approach is to:

  1. Trace the y position of each spectrum as a function of x; this is usually done by centroiding the peak at each x position. There may be options to average together x bins in regions of low signal-to-noise.

  2. Fit some kind of smooth function to the trace of the fiber spectrum. Generally the user is given some kind of choice over the function to be fitted (polynomial, spline, etc.) and the number of parameters fit (order, number of spline segments, etc.). In good software there would be an interactive display of the quality of the fit to allow the fit parameters to be optimized.

  3. The marginal object profile to be used for optimal extraction is then computed. Perhaps this is done by fitting a function (e.g., Gaussian) or by averaging in the x direction, or by some combination of both. Again an interactive display is useful.
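The first two conventional steps can be sketched as follows. This is a minimal illustration only, not code from any existing reduction package: the function names are hypothetical, the image is a plain list of rows, and the user-selectable fitting function is reduced to a straight line for brevity.

```python
def centroid_trace(image, y_lo, y_hi):
    """Centroid in y of the counts in rows y_lo..y_hi-1, for each column x.

    image is a list of rows, indexed as image[y][x]. Columns with no
    positive signal give None: there is nothing to centroid there.
    """
    trace = []
    for x in range(len(image[0])):
        column = [max(image[y][x], 0.0) for y in range(y_lo, y_hi)]
        total = sum(column)
        if total <= 0.0:
            trace.append(None)
        else:
            moment = sum(y * c for y, c in zip(range(y_lo, y_hi), column))
            trace.append(moment / total)
    return trace


def fit_line(trace):
    """Least-squares line y = intercept + slope * x through valid centroids."""
    pts = [(x, y) for x, y in enumerate(trace) if y is not None]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Toy image: one spectrum centred on row 2, with no counts in column 2.
image = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [2, 2, 0, 2],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
trace = centroid_trace(image, 0, 5)   # [2.0, 2.0, None, 2.0]
intercept, slope = fit_line(trace)
```

The empty column shows the weakness discussed next: where there are few counts the trace is undefined, and the fit has to be extrapolated with parameters chosen by the user.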

It is clear that this is a very ad hoc process: the user is given various parameters (fit orders, averaging bins, etc.) without much clue as to what might be sensible other than by trial and error. Moreover this is a 1D fit (y as a function of x) to what is essentially a 2D physical problem (the image as a function of both x and y). And there is no ideal solution: for example, how can one choose polynomial fit parameters to trace a spectrum into regions where there are few counts (e.g., at wavelengths where the CCD QE is low)? Yet one might well want to do that; there might be a vital emission line down there. Finally, the degree of interactivity is high. Clearly one does not want to do this by hand for 400 fibers, given the scope for user error, yet what is a good fit for one fiber might be bad for another if the degree of distortion varies across the field.

These sorts of problems have led us to a design methodology which turns this conventional approach on its head. All of these parameters are, in principle, calculable in advance. The curvatures of the fiber spectra are determined by the optics of the spectrograph and by atmospheric dispersion as a function of zenith distance; the profile is determined by the shape of the fiber head, etc. Thus the key to achieving these goals in practice is to build the optical ray-trace model of the spectrograph into the data reduction system. The optimal extraction example is then reduced to:

  1. Find the shift between the actual position of the fiber spectra and those predicted by the optical model. Since this is just determined by the placing of the CCD within its dewar this need only be done once when the instrument is mounted on the telescope.

  2. Using the known position of the spectra and the known profiles, optimally extract the spectra. Note that optimal extraction is actually a maximum likelihood estimate of the signal given a profile known a priori; the usual practice of estimating the profile from the data itself has always been a semi-circular hack.
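As an illustration of step 2, here is a minimal sketch of the maximum likelihood estimate for a single wavelength bin. It assumes Gaussian noise with known per-pixel variance V and a profile P normalised to unit sum, for which the estimate of the flux is sum(P*D/V) / sum(P*P/V), where D is the data. The Gaussian profile shape, the numerical values and the function names are assumptions made for the example; in the 2dFDRS the profile would come from the optical model.

```python
import math


def model_profile(center, sigma, n_pix):
    """Normalised profile across the fiber (Gaussian shape is an assumption)."""
    weights = [math.exp(-0.5 * ((y - center) / sigma) ** 2)
               for y in range(n_pix)]
    total = sum(weights)
    return [w / total for w in weights]


def extract_flux(data, profile, variance):
    """Maximum likelihood flux and its variance, given a known profile."""
    num = sum(p * d / v for p, d, v in zip(profile, data, variance))
    den = sum(p * p / v for p, v in zip(profile, variance))
    return num / den, 1.0 / den


# Hypothetical numbers: position predicted by the model, plus the one-off
# shift measured when the instrument was mounted (step 1).
predicted_center = 4.0
measured_shift = 0.3
profile = model_profile(predicted_center + measured_shift, 1.2, 9)

# Noise-free test data for a spectrum of flux 250 in this bin.
data = [250.0 * p for p in profile]
variance = [1.0] * len(data)
flux, flux_var = extract_flux(data, profile, variance)   # flux is about 250
```

No parameter is fitted to the data here apart from the flux itself, which is what keeps the estimate well defined even where the counts are low.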

Such approaches are used throughout the 2dFDRS. Another example is wavelength calibration. The optical model can predict the wavelength associated with each CCD pixel, the only uncertainty being the absolute shift due to the mounting of the fiber on the slit unit. This offset is determined by constructing a template arc spectrum and cross-correlating it with the observed one.
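The cross-correlation step might look like the following sketch, which searches a small range of integer pixel shifts for the one that best aligns a template arc with an observed arc. The function name, the search range and the toy spectra are assumptions for illustration; a real implementation would also refine the peak to a fraction of a pixel.

```python
def best_shift(template, observed, max_shift):
    """Integer shift s that maximises sum(template[i] * observed[i + s]).

    A positive result means the observed arc lines fall s pixels to the
    right of where the template predicts them.
    """
    best, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, t in enumerate(template):
            j = i + s
            if 0 <= j < len(observed):
                score += t * observed[j]
        if score > best_score:
            best, best_score = s, score
    return best


# Toy arc: two lines predicted at pixels 3 and 10, observed at 5 and 12.
template = [0.0] * 16
template[3], template[10] = 5.0, 3.0
observed = [0.0] * 16
observed[5], observed[12] = 5.0, 3.0

offset = best_shift(template, observed, max_shift=4)   # 2
```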

3. General Principles of the 2dFDRS

These ideas have led us to a set of general design guidelines which are used throughout the 2dFDRS:

  1. Invariant system configuration information (mainly telescope optics, fiber, slit and CCD format) is parameterized and built into the 2dFDRS at the design stage.

  2. Derived system constants (things which might be determined per night or per run such as CCD orientation, grating angle) are, when re-evaluated, checked against a historical record of previous values maintained in a database. Any unexpected values are flagged for attention.

  3. All algorithms are written to use as much instrumental and configurational information as possible. This information is made available through file headers. Parameterized fits to data incorporate as far as possible the physics of the process involved. In the interests of efficiency, re-calibration and re-fitting are to be avoided when a simple spatial shift of an existing fit is possible.

  4. While user interaction is to be minimized, the processing should be made visible to the user through the frequent display of key derived information, giving the user the opportunity to intercept and redirect the processing.

  5. All images are typed (LFLAT is a longslit flat, MFIMAGE is an image containing multifiber spectra, etc.). The user has the option of grouping a subset of images together on which the 2dFDRS can act, or of leaving the 2dFDRS to select the most recent to use for processing.
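Guideline 2 can be illustrated with a small sketch that compares a newly derived constant against its historical record. The three-sigma threshold, the function name and the sample grating angles are assumptions chosen for the example, not values taken from the 2dFDRS.

```python
from statistics import mean, stdev


def check_constant(name, new_value, history, n_sigma=3.0):
    """Return a warning string if new_value looks unexpected, else None."""
    if len(history) < 2:
        return None  # too little history to judge
    centre = mean(history)
    spread = stdev(history)
    if abs(new_value - centre) > n_sigma * spread:
        return (f"{name}: new value {new_value} differs from "
                f"historical mean {centre:.3f}")
    return None


# Hypothetical record of grating angles (degrees) from previous runs.
history = [25.10, 25.12, 25.08, 25.11, 25.09]
ok = check_constant("grating_angle", 25.11, history)       # None
warning = check_constant("grating_angle", 26.00, history)  # flagged
```

A flagged value would be shown to the user rather than silently accepted, in keeping with guideline 4.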

4. An Object-Oriented Approach in FORTRAN

The 2dF software is organized using the Object Oriented Programming (OOP) methodology. The OOP approach focuses on the classes of objects to be processed and the methods of processing applicable to each class. The classes of objects correspond to the different types of data frames which the 2dF will use (LFLAT, MFIMAGE, etc.), as well as various types of processed or semi-processed data, and other objects such as groups of related data frames.

Each object is represented using the Starlink HDS data format, with an extension to flag the class of the object. Additional extensions may be needed for specific classes.

The methods of a class are provided by a single Fortran subroutine for each class (the subroutine library is therefore referred to as a class library). Normally OOP techniques require an object-oriented language such as C++. What makes an object-oriented scheme possible in Fortran is the presence of an extensible data format (NDF) to represent the data of the objects, together with a standard subroutine calling sequence. This leads to a scheme which provides all the features of OOP, with the additional advantage (compared with OOP languages) of objects which are represented in HDS. The objects are thus on disk (rather than in memory), are in a portable format which can be moved between machines of different architecture, and are compatible with existing (non-OOP) data reduction software.

A class library implementing these ideas has been developed at AAO. This provides a set of general classes on which the more specialist 2dF classes can be based, and serves as an example of the desired software organization.

Here is a sample code fragment showing how a class dispatches its methods:

SUBROUTINE CLA_name(OBJECT,METHOD,ARGS,STATUS)

INTEGER OBJECT                ! Identifier of the object acted on
CHARACTER*(*) METHOD          ! Name of the method requested
INTEGER ARGS                  ! Identifier of the method arguments
INTEGER STATUS                ! Inherited status

CHARACTER*40 UMETH            ! Upper-case copy of the method name

UMETH = METHOD                ! Copy method (so we don't alter it)
CALL CHR_UCASE(UMETH)         ! Force it to upper case
IF (UMETH .EQ. 'METHOD1') THEN
   CALL method1_routine (OBJECT,ARGS,STATUS)
ELSE IF (UMETH .EQ. 'METHOD2') THEN
   CALL method2_routine (OBJECT,ARGS,STATUS)
! ...
! ...
ELSE
   CALL CLA_base(OBJECT,UMETH,ARGS,STATUS) ! Call the base class
ENDIF

END
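The same dispatch pattern is sketched here in Python, to make explicit how inheritance arises: any method a class does not recognise is passed on to the routine of its base class. The class routines, the method names and the dictionary standing in for an on-disk object are all hypothetical and are not part of the AAO class library.

```python
def cla_base(obj, method, args):
    """Base class routine: handles methods common to every class."""
    method = method.upper()
    if method == "DESCRIBE":
        return f"object of class {obj['class']}"
    raise ValueError(f"unknown method {method}")


def cla_mfimage(obj, method, args):
    """Routine for a hypothetical multifiber image class."""
    method = method.upper()
    if method == "COUNT_FIBERS":
        return len(obj["fibers"])
    # Not one of ours: defer to the base class, which gives inheritance.
    return cla_base(obj, method, args)


# A dictionary stands in for the object; its "class" entry plays the role
# of the extension that flags the class.
frame = {"class": "MFIMAGE", "fibers": [101, 102, 103]}
n_fibers = cla_mfimage(frame, "count_fibers", None)   # 3
text = cla_mfimage(frame, "describe", None)
```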

5. Summary

We have attempted to show our practical solutions to the general problems of fully automated optimum data reduction for large observing projects. The key ingredient is proper use of prior information about the hardware and optics involved, which avoids the ill-posed inverse problem of arbitrary parameter fits. We have found the Object-Oriented approach useful in organizing our algorithms; the methodology is more important than any particular language, as demonstrated by our use of it with standard FORTRAN algorithms.

References:

Taylor, K. 1994, in Wide Field Spectroscopy and the Distant Universe, The 35th Herstmonceux Conference, eds. S. J. Maddox & A. Aragón-Salamanca (World Scientific), 15

