After reading my first article on CD-ROMs, I expect that all of you Rockers from the 1960s are busy accessing data and playing the Jimi Hendrix Greatest Hits CD on your lab computers. However, there may still be some lost souls without the benefit of CD-ROM, and I feel obliged to direct a few words to them. Then I will describe the Global Ecosystems Database available from the Environmental Protection Agency (EPA) and the National Oceanic and Atmospheric Administration (NOAA) in the United States.
Since my last article the price of CD-ROM drives has continued to decline. A decent unit can now be had here for less than $350.00 (Canadian). In addition, more and more multimedia systems, such as the Dell Dimension or NEC Ready, are available as bundled 486 PC and CD-ROM audiovisual packages. Aside from accessing some interesting data sets, why should you lay down your hard-earned dollars, pesos, francs, pounds, rubles etc. for a CD-ROM unit? Well, for one thing it is likely that much of the software you will buy in the next few years will be sold on CD-ROM. Microsoft's Windows NT products have already gone to this medium. The reason vendors will make this move is simple: it costs a lot less to produce, package and distribute one CD with up to 680 megabytes of software on it than it costs to produce, package and distribute the equivalent fifty to sixty floppy disks. There are several benefits for users: software transfer and installation are easier, CDs remain uncorrupted and virus free as they are read-only, and they are easy to store. Will the move to CDs bring software prices down? I doubt it. Perhaps, though, there might be a surcharge to get software on 3.5" or 5.25" disks in the near future. Storage space is another inducement to go the CD-ROM route. Once you install DOS 6.0, Windows and a few high-performance software packages you may find that 100 to 300 megabytes of space on your hard disk are filled. Keeping data on CD allows you to use the valuable space remaining on your hard disk for yet more software packages. This saving in storage space is one of the major advantages of acquiring large data sets on CD-ROM, even when the same data might be available through Anonymous FTP.
Okay, you're convinced, right? So what do you look for in a CD-ROM unit (besides a bargain-basement price)? First, you don't want to spend too much time waiting for data to be read off the CD. You should get the unit with the lowest Average Seek Time you can afford (350 ms or less). This should be coupled with a high Data Transfer Rate (at least 300 kilobytes per second). So-called Double Speed or Multi-Spin units should meet these requirements. A buffer can also be useful in data transfer, and 64 kb should be sufficient. Finally, for those thinking about the future, I suggest that you get a unit that is compatible with the Multimedia PC (MPC) standard and Kodak Photo CD. I can envision a day when we trade, or submit for publication, palynological reference photos and other imaged material on CD rather than by sending negatives, prints etc. By the way, that Jimi Hendrix CD will sound particularly awesome when you couple your CD-ROM with a 16-bit sound card! Now, on to the data.
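To put the transfer-rate advice in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the function name is made up for illustration, the 150 kilobytes per second figure for a single-speed drive is my assumption and not a number quoted above, and seek time is ignored entirely.

```python
def read_time_minutes(size_mb: float, rate_kb_per_s: float) -> float:
    """Minutes needed to read size_mb megabytes at a sustained transfer rate.

    Assumes 1 megabyte = 1024 kilobytes and ignores seek time and buffering.
    """
    if rate_kb_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    return size_mb * 1024.0 / rate_kb_per_s / 60.0


if __name__ == "__main__":
    full_disc_mb = 680.0  # capacity figure quoted in the article
    # 150 kb/s for single speed is an assumption; 300 kb/s is the double-speed
    # minimum recommended in the text.
    for label, rate in (("single speed", 150.0), ("double speed", 300.0)):
        minutes = read_time_minutes(full_disc_mb, rate)
        print(f"{label} ({rate:.0f} kb/s): about {minutes:.0f} minutes for a full disc")
```

Doubling the sustained rate halves the read time, which is why the faster units are worth the extra money if you plan to pull large data sets off the disc.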
The Global Ecosystems Database (GED) is produced through the joint efforts of the EPA and NOAA and is available from the Global Ecosystems Database Project, National Geophysical Data Center, 325 Broadway E/GC1, Boulder, Colorado, 80303, USA; PH 303-497-6125, FAX 303-497-6513, EMAIL info@mail.ngdc.noaa.gov. The cost for the CD, a copy of IDRIX GIS software, and three manuals is $100.00 (US). I have been using Version 1.0. The database and IDRIX run in an IBM-PC/DOS environment. The system must have at least 256 K of memory and support EGA graphics. An Intel 80486-based machine with 640 K and a graphics accelerator card with 8514A graphics would be ideal. A Microsoft mouse is required for IDRIX, and a math coprocessor is recommended.
The GED is an amazing tool for any paleoecologist who requires environmental data with which to calibrate (at least at the level of first approximation) transfer functions based on the modern distributions of environmental variables and pollen, diatoms, beetles, etc. The key word for these data is global. All of the data sets cover the entire globe with the exception of the very high latitude polar regions. There are 14 nested raster data sets which provide global estimates/observations of a number of environmental variables. These data sets reflect the very hard work of a number of scientists from different disciplines. The data include the Leemans and Cramer IIASA mean monthly temperature, precipitation and cloudiness values; Legates and Willmott average monthly surface air temperatures and precipitation; a seasonal albedo set; several global vegetation and land cover data sets; Holdridge life zones; soil, terrain and topographic characteristic data sets; and information on methane emissions. Some of my favourite data are available from the methane compilations by Lerner, Matthews and Fung. For example, I have long had a belief that one could develop a transfer function to retrodict the number of camels in the Sahara based on the relative abundance of Artemisia, Ficus and Poaceae pollen. The GED provides a data set for a global estimate of camel density per square kilometre that is ideal for this! I am still pondering the possibility of a similar approach using the water buffalo data set! Another interesting feature of the disk is the multi-year data sets of vegetation indices based on AVHRR observations from the NOAA-9 and NOAA-11 satellites. The spatial resolution of most of the data sets ranges from 10 minutes to 1 degree. In addition to these sets there are an experimental global elevation and bathymetry data set and an experimental soil data set with 2 minute resolution.
There are also vector data sets to provide information on the location of coastlines, islands, lakes, rivers and political boundaries. For all of you espionage fans, you might be interested to know that the vector data come by way of the U.S. Central Intelligence Agency! There is extensive documentation on data format, sources, reliability and appropriate references in the scientific literature. This documentation alone provides a wonderful source of information. Data file structure is simple and well described, so that extraction of data for further analysis is straightforward.
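As an illustration of the kind of extraction involved, the Python sketch below finds the raster cell that contains a given site. It rests on assumptions of mine rather than on the GED documentation: a grid covering the whole globe, stored row by row starting from the north-west corner (90 N, 180 W), with a whole number of cells per degree (1 for a 1 degree grid, 6 for a 10 minute grid). The function names are hypothetical, and the actual layout of each GED file should be checked in the manuals before use.

```python
def cell_index(lat: float, lon: float, cells_per_degree: int) -> tuple:
    """Return (row, col) of the grid cell containing the point (lat, lon).

    Assumed layout (not taken from the GED manuals): row 0 starts at 90 N,
    column 0 starts at 180 W, and cells are square in degrees.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    n_rows = 180 * cells_per_degree
    n_cols = 360 * cells_per_degree
    # Points exactly on the southern or eastern edge fall in the last cell.
    row = min(int((90.0 - lat) * cells_per_degree), n_rows - 1)
    col = min(int((lon + 180.0) * cells_per_degree), n_cols - 1)
    return row, col


def flat_offset(row: int, col: int, cells_per_degree: int) -> int:
    """Position of a cell in a row-by-row listing of the whole grid."""
    return row * 360 * cells_per_degree + col


if __name__ == "__main__":
    # A made-up site at 43.7 N, 79.4 W on a 1 degree grid.
    row, col = cell_index(43.7, -79.4, 1)
    print(row, col, flat_offset(row, col, 1))
```

With the row and column in hand, the value for a pollen site can be read from the exported data and paired with the modern sample for calibration.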
Although the data provided by the GED are amazing, a particularly attractive inducement for the neophyte data cruncher is the inclusion of a simple Geographic Information Systems (GIS) package with which to display the data, export them and perform some simple manipulations and analysis. The system is IDRIX, which is a subset of the IDRISI PC-based GIS developed by Clark University. Two 5.25" floppies are included with the GED. These contain IDRIX and an installation program. The installation is extremely straightforward. After a few minutes the user has available a menu-driven program with which to display colour maps or orthographic (3D) views of the GED data, including some time-series displays of monthly temperatures, to examine files, and to export data. In addition, there is on-line help and a copy of the manuals in electronic form. IDRIX allows some simple GIS manipulations such as the ability to: (1) move a cursor across the map and obtain pixel data values and geographic coordinates, (2) use the cursor as a digitizer to trace features on the map and create vector files, (3) overlay raster images with vector files of features, such as rivers or vegetation boundaries, that you have created by digitizing other images, (4) magnify portions of the global maps for detailed analysis of particular regions, and (5) alter colour thresholds and assignments. The danger of all this is that there are just so many interesting and colourful data sets that it is hard to stay away from the computer after things are up and running!
So, the GED provides a massive amount of data, much of which will be of great interest to palaeoecologists trying to calibrate modern plant and animal distributions with environmental conditions. The addition of IDRIX provides an introduction to GIS and a relatively easy way to display and extract data. What then are the problems with the GED? First, it runs extremely slowly on our Intel 80386-based machine. It is painful to wait for images to be displayed. Second, the menu for IDRIX is pretty sparse and you often end up guessing about what will get you out of a certain application etc. The IDRIX manual is also pretty brief, being a scant 25 pages. If you have no experience with GIS or IDRISI you could have some frustrations. Third, you have to have a fair amount of memory to run this. With DOS 6.0, Windows and lots of other stuff cluttering up our machine this can be a problem. We found that we had to use the DOS statement COMMAND /E:521 to get things up and running. IDRIX On-Line Help suggested using either COMMAND /E:521 /P or SHELL = COMMAND /E:521 /P. For some reason, neither of these worked properly when we invoked them on our Dell. A final caveat concerns the data themselves. It is pretty appealing to take these data at face value when you see them so elegantly displayed on the screen. The contributing scientists, the EPA and NOAA went to great lengths to ensure, as much as possible, the integrity of the data. However, it should be borne in mind that many of the data sets were put together to satisfy the requirements of GCM modelers, and the level of accuracy and detail that they called for may not be as fine as what the calibration of a modern pollen data set requires. Read the documentation very carefully before taking literally a value such as the Amount of Silt in Soil Horizon 1 for a particular pixel representing a site in central Siberia.
I use the Global Soil Particle Size Properties as an example here only because one of its authors, Robin Webb, so brutally out-skied me last year while we were conducting observations of late-spring snow conditions and discussing the GED in Colorado!