

Astronomical Data Analysis Software and Systems V
ASP Conference Series, Vol. 101, 1996
George H. Jacoby and Jeannette Barnes, eds.

The NCSA Astronomy Digital Image Library

Raymond L. Plante,1 Richard M. Crutcher, Randall K. Sharpe

National Center for Supercomputing Applications, University of Illinois, Urbana, IL 61801

1Astronomy Department, University of Illinois, Urbana, IL 61801

Abstract:

We announce the opening of the NCSA Astronomy Digital Image Library (URL = http://imagelib.ncsa.uiuc.edu/imagelib.html). The mission of the Library is to collect fully processed astronomical images in FITS format and make them available to the research community and the general public via the World Wide Web. Users may search the Library's contents, browse preview images, and download the full FITS images. All items contained in the Library may be accessed either via the HTML interface or through unique URNs. The latter method allows for easy linking to Library items from other databases or hypertext documents on the Web. The Library is expected to provide many benefits not only to users retrieving information but also to authors who add images to the Library's collection.

1. Introduction

Images make up an important part of the data in astronomical research, and the Internet provides an excellent opportunity to make this data available to the entire research community. To meet this opportunity, we have established the Astronomy Digital Image Library (ADIL) with the ``books'' of this Library being fully-processed and research-ready images in FITS format. Users may visit the Library via the World Wide Web, search its ``card catalog'', browse images, and download them for further analysis.

Clearly, one measure of a library's effectiveness is the number of ``books'' on its shelves. The ADIL currently holds over 1800 images; however, our capacity is much higher (several terabytes right now). We are relying on the astronomical community to help build up the collection. We hope that the benefits of contributing images will be obvious and that authors will make it a routine part of the process of presenting scientific results.

2. Overview of the User Interface

A user can search the Library for images through the Query Page. This HTML form allows the user to specify a variety of search parameters, including sky position, observing frequency, object name and type, and image origin (e.g., telescope, author, title). When the search is submitted, the user gets back a Results Page which lists the images in the Library matching the search parameters.

The user may download full FITS images from the Results Page if desired; however, she/he usually would rather access the Preview Page. The purpose behind the Preview Page is to provide the user with as much information about the image as possible so that she/he can determine whether the full FITS file is desired. This information includes the title and authors, a digest of the FITS header, and a preview image in the form of an in-lined GIF image (usually subsampled for quick downloading). Since FITS files can be somewhat large and take several minutes to download, ``browsing'' images via their Preview Page is an important part of locating images.

The Preview Page also contains links to more specific information about the image, including the full FITS header. Other important links are those to the abstracts of related articles in the Astrophysics Data System (ADS) abstract database, giving the user access to the science that came out of the image. If an image has more than two axes (e.g., it is an image cube), then the Preview Page provides a link to a Movie Page which allows the user to browse all the planes within the image. Finally, the Preview Page provides links to any related data files that were deposited with the images. These other data files can be of any format; they might be table data, PostScript figures, MPEG animations, etc.
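The test for offering a Movie Page can be illustrated with a minimal sketch. It assumes the FITS header has already been read into a plain dictionary, and it relies only on the standard NAXIS and NAXISn keywords; the function names are hypothetical and are not taken from the Library's software.

```python
from math import prod

def needs_movie_page(header):
    """True when the image has more than two axes (e.g., an image cube)."""
    return int(header.get("NAXIS", 0)) > 2

def plane_count(header):
    """Number of 2-D planes: the product of the lengths of axes 3..NAXIS.

    For a plain two-dimensional image the product is empty and equals 1.
    """
    naxis = int(header.get("NAXIS", 0))
    return prod(int(header[f"NAXIS{i}"]) for i in range(3, naxis + 1))

# Hypothetical header digest for a 128 x 128 x 64 spectral-line cube.
cube = {"NAXIS": 3, "NAXIS1": 128, "NAXIS2": 128, "NAXIS3": 64}
print(needs_movie_page(cube), plane_count(cube))
```

Note that an image may declare a third axis of length one; how the Library treats such degenerate axes is not stated here, so the sketch simply follows the ``more than two axes'' rule.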

3. General Contents and Structure

Authors deposit images into the Library as collections referred to as ``projects.'' Each project in the Library is associated with at least one published article. A project is made up of: (a) one or more images in FITS format, (b) an abstract that describes the project, the images in the project, and the scientific results of the study, (c) a table of technical data describing the images and the observations made to produce them, and (d) zero or more additional data files associated with the project.
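The make-up of a project can be summarized as a small record type. The following is only a sketch of the four parts listed above plus the associated article; the class name, the field names, and the example values are hypothetical and do not reflect the Library's actual database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Project:
    articles: List[str]              # at least one published article
    images: List[str]                # (a) one or more FITS image files
    abstract: str                    # (b) description of project and results
    technical_data: List[Dict[str, str]]  # (c) table rows describing the images
    related_files: List[str] = field(default_factory=list)  # (d) zero or more

    def __post_init__(self):
        # Enforce the two "at least one" rules stated in the text.
        if not self.articles:
            raise ValueError("a project needs at least one published article")
        if not self.images:
            raise ValueError("a project needs at least one FITS image")

# Hypothetical example values, for illustration only.
example = Project(
    articles=["(reference to a published article)"],
    images=["image01.fits", "image02.fits"],
    abstract="(abstract text)",
    technical_data=[{"image": "image01.fits", "telescope": "(name)"}],
)
print(len(example.images), example.related_files)
```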

Every project, along with each image in the project, is given a unique ``codename''. An example project codename is 95.DR.01; one of the images in this project has an image codename of 95.DR.01.02. The four components of the image codename are as follows: 95 = year of the deposit into the Library; DR = initials of the first author of the project; 01 = first project by the author in that year; 02 = second image in the project.
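As an illustration of the codename layout, the sketch below splits a codename into the fields just described. The function and type names are hypothetical. The fields are kept as strings, an assumption made so that leading zeros such as ``01'' are preserved.

```python
from typing import NamedTuple, Optional

class Codename(NamedTuple):
    year: str             # year of the deposit, e.g. "95"
    initials: str         # initials of the first author, e.g. "DR"
    project: str          # project number for that author and year
    image: Optional[str]  # image number, or None for a project codename

def parse_codename(codename: str) -> Codename:
    """Split '95.DR.01' (project) or '95.DR.01.02' (image) into fields."""
    parts = codename.split(".")
    if len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"not a project or image codename: {codename!r}")
    year, initials, project = parts[:3]
    image = parts[3] if len(parts) == 4 else None
    return Codename(year, initials, project, image)

print(parse_codename("95.DR.01.02"))
```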

The purpose of the codename is to provide a concise, unique identifier that makes enough sense to users that they can pass it on to others with minimal error. There is no limit to the size of any of the fields, and it is not considered important for the initials to uniquely identify the author; the goal is only that the codename be easy to remember in the short term.

The use of unique codenames allows projects and images in the Library to be accessed through unique URNs. Such a URN is constructed by prepending http://imagelib.ncsa.uiuc.edu/project/document/ to the codename. For example, http://imagelib.ncsa.uiuc.edu/project/document/95.DR.01 will access the Project Page for our example project codename; this page summarizes all the images in the project. Appending .02 to the URN will access the Preview Page for the second image in the project. In fact, any item in the Library can be directly accessed through a unique URN of this form. For example, a related data file can be accessed by adding /filename after the codename, where filename is the related data file's name.
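The construction rule can be written out in a few lines. This is a sketch only: the helper names are hypothetical, and the data file name used in the example (movie.mpg) is invented for illustration.

```python
# Prefix quoted in the text; everything else here is illustrative.
BASE = "http://imagelib.ncsa.uiuc.edu/project/document/"

def project_urn(project_codename: str) -> str:
    """URN of the Project Page: the prefix followed by the codename."""
    return BASE + project_codename

def image_urn(project_codename: str, image_number: str) -> str:
    """URN of an image's Preview Page: append '.NN' to the project URN."""
    return f"{project_urn(project_codename)}.{image_number}"

def datafile_urn(codename: str, filename: str) -> str:
    """URN of a related data file: add '/filename' after the codename."""
    return f"{BASE}{codename}/{filename}"

print(project_urn("95.DR.01"))
print(image_urn("95.DR.01", "02"))
print(datafile_urn("95.DR.01", "movie.mpg"))  # hypothetical file name
```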

This system of unique URNs allows for easy linking to items in the Library from other resources on the Web. These other resources could be electronic articles or astronomical databases, such as ADS. We have encouraged authors to cite the URNs in their printed articles. This allows them to direct readers not only to the image itself but to representations of the data, such as MPEG animations or VRML visualizations, which are not possible to represent on the printed page.

4. Technical Overview

From a technical perspective, the Library is made up of three basic components: the user interface, the database engine, and storage. All preview information is kept permanently on disk as the primary storage for rapid retrieval by users. The secondary storage is (currently) 30 GB of hard disk space used as a cache to contain a subset of the FITS images with preference given to the smaller and most commonly accessed ones. Long-term, or tertiary, storage is provided by the NCSA Mass-Store, a system based on fast (2 MB/s) tape drives. If a user requests a FITS file that cannot be found in the disk cache, it is automatically transferred from tape. Since this happens primarily for large images, the network is expected to be the major bottleneck in the data transfer rather than the tape output.
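The retrieval path and the bottleneck argument can be sketched as follows. The dictionaries stand in for the disk cache and the Mass-Store, the function names are hypothetical, and the network rate in the example is an assumed figure chosen only to illustrate the comparison. The cache admission policy (preference for smaller and more commonly accessed images) is not modeled.

```python
TAPE_RATE_MB_PER_S = 2.0  # tape drive rate quoted in the text

def fetch_fits(name, disk_cache, tape_archive):
    """Return (data, source): try the disk cache first, then fall back to tape."""
    if name in disk_cache:
        return disk_cache[name], "cache"
    # Stand-in for the automatic transfer from the Mass-Store.
    return tape_archive[name], "tape"

def transfer_seconds(size_mb, rate_mb_per_s):
    """Time to move size_mb megabytes at a sustained rate."""
    return size_mb / rate_mb_per_s

def bottleneck(size_mb, network_rate_mb_per_s):
    """Name the slower stage for a file read from tape and sent to the user."""
    tape = transfer_seconds(size_mb, TAPE_RATE_MB_PER_S)
    net = transfer_seconds(size_mb, network_rate_mb_per_s)
    return ("network" if net > tape else "tape"), max(tape, net)

# A 100 MB cube over an assumed 0.5 MB/s network link.
print(bottleneck(100, 0.5))
```

Under these assumptions the tape read takes 50 s while the network transfer takes 200 s, which is the sense in which the network, not the tape, limits delivery of large images.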

5. Depositing Images into the Library

It is hoped that authors will make it part of their routine to deposit their processed images when the related paper goes to press. To make a contribution, authors can consult the detailed on-line instructions at the Library Web site. In general, authors fill out a submission description form, transfer the files to the Library via anonymous FTP, and then notify the Library by e-mail. When the Library receives the e-mail, the project is loaded into the Library. The unique codename for the project is then sent back to the author, allowing her/him to access the images directly without searching for them through the Query Page.

6. Current Developments and Future Plans

One important way in which the Library can be made more powerful to astronomers is in its integration with other astronomical databases on the network, such as ADS and SIMBAD. As a first step, the Library has been working with the ADS Abstract Service to set up a variety of cross-links between the two databases. One example of this is the set of links to ADS abstracts that appear on ADIL Project and Preview Pages. ADS also provides similar links from those abstracts back to the Library. In addition, ADIL abstracts themselves are entered into the ADS database as if the Library were a journal unto itself. This allows users to locate images using the ADS interface. Eventually, we intend to use this capability to support simultaneous searching of both databases, taking advantage of the strengths of each.

In the future, we will be investigating techniques for more sophisticated ways of browsing images, such as remote use of the AIPSView visualizer tools or through specialized Java applets. We are also experimenting with 3-D visualization through VRML (Virtual Reality Markup Language) files.

7. Expected Benefits

As with the many astronomical resources now available on the Internet, the basic benefit of the ADIL is that it makes astronomical research easier by providing easy access to data. ADIL differs from other data archives in that it focuses on calibrated and fully-processed images. These images are taken from the data pipeline at the point closest to the science and are ready for further analysis.

There are a number of ways in which a vast collection of research-ready images could be a powerful research tool. Observers planning a project will have access to previous observations that can aid not only in sensitivity calculations but in exploring new questions to be addressed. New data can be compared with previous observations as part of a multi-frequency study of particular objects. The availability of electronic versions of images and related figures will help astronomers when preparing figures for talks or papers.

The Library provides advantages not just to users looking for images but also to authors who contribute their images to the Library's collection. The ADIL provides a convenient way for astronomers to archive their final, processed images and related data; getting the data back is a click away without having to mess with tapes. It is also a convenient way to share data with collaborators and colleagues without locking up disk space on one's own Web or FTP site. Finally, the Library offers another way to present scientific results complementary to the printed journal. For instance, after spending a paragraph describing a complex feature in the data, one could direct the reader to a URN in the Library to view an animation or VRML visualization.

There are also obvious benefits to the general public, as demonstrated by our recent ``visitor statistics''. The Library is an easy and fun way for people to learn what is new in astronomy. With the wide variety of images in the collection, it is fairly simple to gather together a subset of images for an exhibit for non-astronomers.

Acknowledgments:

We gratefully acknowledge financial support for radio astronomy visualization, high performance computing, and digital library work from the NSF and ARPA under grant NSF ASC 92-17384, from NASA under grant NCC5-106, from the National Center for Supercomputing Applications, and from the University of Illinois Astronomy Department through its participation in the Berkeley-Illinois-Maryland Array.


Wed Jul 3 08:02:16 MST 1996