INQUA Working Group on Data-Handling Methods

Newsletter 10: July 1993

ENVIRONMENTAL DATA BASES ON CD-ROM PART I

Glen M. MacDonald
Department of Geography
McMaster University
Hamilton, Ontario
Canada L8S 4K1
E-mail: gmmacd@mcmaster.ca

Many Quaternary palynologists toil away for endless days counting thousands of pollen grains only to find their prized site reduced to one datum point on a network of sites. They gaze with awe and despair as their data are further amalgamated on maps that look like the stamps of small island nations. Worse still, their beloved data are transformed through one diabolic statistical manipulation or another into an estimate of Thalictrum biomass at 6000 BP or the number of nights in July, 3527 BP when the temperature of freshly formed dew exceeded 15.3755 C. Finally, our exhausted analysts are forced to rest in darkened rooms and ponder enviously the power of Durham, Brown and Boulder. Ah, but take heart! The North American and European Pollen Data Bases are making huge quantities of pollen data available to all. In addition, two of the great technological icons of the late 20th century, the microcomputer and the Compact Disk, can provide even the most humble pollen peeker with untold megabytes of climatic and environmental data with which to compare, manipulate and transfer functionalize all of that pollen data. Oh brave new world!

CD-ROM offers an inexpensive and robust way to package very large amounts of data (hundreds of megabytes per disc). It is likely that CD's will replace most tape units for the hard transfer and storage of large data sets. CD's are cheap to produce and can be run on very inexpensive add-on units to micros. At present microcomputer CD's are read only. However, new technologies in the world of audio CD's are being developed for cheap CD recorders. CD-ROM is becoming the standard method for the release of many large data sets. I hope that I can encourage pollen analysts to consider the potential of CD-ROM for providing vast arrays of environmental information for use in conjunction with pollen data. To this end, I will share some of my experiences as a neophyte CD-ROM user. I will not discuss CD technology except to say that it is based on laser scanning of a track on the base of the disc, and this means "DO NOT SCRATCH THE BASE OF YOUR DISC". Any disturbance of the base can lead to data destruction. In this communication I pass along some observations on using the CD-ROM data set provided by the WORLD WEATHER DISC ® Version 2.1. The WWD is produced by WeatherDisc Associates Inc. and is marketed by the Association of Wildland Fire, P.O. Box 328, Fairfield, Washington, USA, 99012-0328, PH: 509-283-2397, FAX: 509-283-2264. The cost is $295 US.

Before getting to the WWD it is worth considering what is required to access all of that wonderful data from CD-ROM. In our lab we are using a perfectly ancient DELL 333D (based on a 386 processor). If we can use this museum piece, you can assume that just about any 386- or 486-based micro will run the WWD. Your system should have at least 512 k of memory and run DOS 3.1 or higher. The graphics require an EGA display. For those of you in the benighted world of Macintosh, a Mac version was under development when we purchased our DOS version. Our CD-ROM is a Sony CDU6205-10 external unit. You can purchase one of these for $500 US or less. My students inform me that it can be wired to play musical CD's and even display the graphics available from some rock CD's such as the Jimi Hendrix Greatest Hits compilation. I have, thus far, resisted acting on this. You will also require Microsoft CD-ROM Extensions ® (MSCDEX) 2.0 or later and an appropriate device driver. These may come with your CD-ROM unit.

The WWD data are extracted from the archives of the National Climatic Data Center and the National Center for Atmospheric Research in the US. The WWD is mastered to the international ISO 9660 standard and contains approximately 600 megabytes of data. That awesome amount of information provides seventeen individual climate data sets. These data sets range from full station time-series, to mean climate values for airfields, to sea surface temperature time-series. Eleven of the data sets pertain only to the United States. Two data sets are for tropical regions. The remaining four are worldwide. However, read on before you become too enraptured with the endless possibility of pollen-climate transfer functions that these data could allow. Access software is only provided for seven of the data sets. Five of these seven data sets pertain only to the US. The easy-to-access data sets are:

1) World Monthly Surface Station Climatology: These data are time-series of monthly mean temperature, precipitation and sea-level and/or station pressure for 3265 stations throughout the world. Some records extend back to the 1700's. Many, but not all, contain data up to the mid-1980s.

2) Worldwide Airfield Summaries: This data set includes summaries of monthly and annual temperature, precipitation etc. from 5717 airports throughout the world. The length of record used to calculate the means varies from airfield to airfield, and the records run no later than 1974.

3) US Monthly Normals: These data provide the monthly mean temperature and precipitation normals for 5511 stations in the US.

4) Climatography of the US No. 20: This data set includes normals for a wide variety of climatological measures from 1862 primary, secondary and tertiary stations in the continental US. The normals are calculated for the period 1951-1980.

5) Local Climatological Data: These data are similar to those described above, but contain only 288 primary stations in the US.

6) Climatic Division Data: This data set contains monthly average temperature, precipitation and Palmer drought indices for "climatic regions" (conterminous areas of similar climate) in the US. It is interesting to consider that Pennsylvania is divided into more divisions than California (hmm). The temporal range of these time series is 1895 to the mid 1980's.

7) Daily Weather Observations: These data contain daily values for a number of climatological measures at 205 stations in the US. The time-series generally run from the 1940's to the mid 1980's.

In general the access software is very good. For example, in the case of the World Monthly Surface Station data you can select either Imperial or Metric units, then select the continent of interest, then select a specific country and finally select a station. You then choose the parameters you are interested in and view them as color graphs of either monthly values for a specific year or interannual time series of the values for a specific month. The disc access speed is pretty good even on our old beast, and this makes a nice way to scan the climate data from your region of research. Unfortunately, graphic data presentation is not available for all seven data sets for which access software is provided. You can also view the data as tables and export them as ASCII files. Of course, once you have exported the data you can combine them with pollen data to produce transfer functions, response surfaces or whatever else you can dream up. Power to the masses!
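For those who would like to see what working with an exported table might look like, here is a minimal sketch in Python. The layout is an assumption made purely for illustration and is not taken from the WWD manual: one row per year, the year followed by twelve monthly values, whitespace-delimited, with -999 marking a missing month. The function names and the sample values are likewise hypothetical. The sketch rebuilds the two views described above, the interannual series for one month and its long-term mean.

```python
import io

# Assumed layout, for illustration only (not the documented WWD export format):
# one row per year, holding the year and then twelve monthly values,
# separated by whitespace, with -999 standing in for a missing month.
MISSING = -999.0

# Made-up sample values so the sketch runs without a disc.
SAMPLE = """\
YEAR  JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC
1951 -5.0 -4.0  0.5  6.0 12.0 17.0 20.0 19.0 14.5  8.0  2.0 -3.0
1952 -6.0 -3.5  1.0  7.0 13.0 18.0 21.0 -999 15.0  9.0  1.5 -2.5
"""


def read_monthly_table(handle):
    """Return {year: [12 monthly values]}, with missing months stored as None."""
    table = {}
    for line in handle:
        fields = line.split()
        if len(fields) != 13:
            continue  # blank or malformed row
        try:
            year = int(fields[0])
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # header or other non-numeric row
        table[year] = [None if v == MISSING else v for v in values]
    return table


def month_series(table, month):
    """Interannual series for one month (1-12) as (year, value) pairs, skipping gaps."""
    pairs = []
    for year in sorted(table):
        value = table[year][month - 1]
        if value is not None:
            pairs.append((year, value))
    return pairs


def month_mean(table, month):
    """Mean of the available values for one month, or None if every year is missing."""
    values = [v for _, v in month_series(table, month)]
    return sum(values) / len(values) if values else None


table = read_monthly_table(io.StringIO(SAMPLE))
print(month_series(table, 7))  # July values, year by year
print(month_mean(table, 7))    # long-term July mean
```

To use a real export, replace io.StringIO(SAMPLE) with an open file handle and adjust the column count and the missing-value marker to whatever the export actually contains.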

The WWD manual is informative and easy to use. In addition, it provides good reference information on the data sets. This includes information on the derivation of the data, appropriate publications for further information and citation, and data structure. The information on data structure is particularly important if you are going to use data sets for which no access software is provided.

Despite such a large mass of data and good access software, there are drawbacks with the WWD. Of course, non-US users would like to see more detailed worldwide coverage. For the neophyte, it can be pretty daunting to try to extract information from data sets that are not included in the seven for which WWD provides access software. Many of the individual files in these sets are 60 megabytes or larger! You can forget about calling them up as ordinary WordPerfect® documents or reading them into many microcomputer stats packs! I hope to discuss this aspect of CD-ROM data in a later communication.
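One general way around the file-size problem is to stream a file one line at a time and keep only the records you want, so that the whole 60 megabytes never has to sit in memory. The sketch below rests on assumptions that are mine and not the WWD's: that the file is plain ASCII with one record per line, and that each record begins with a station identifier. The function name, the file names and the identifier are all hypothetical; the real record layouts are described in the WWD manual.

```python
def extract_station(source_path, dest_path, station_id):
    """Copy only the lines that begin with station_id; return the number kept.

    Assumes (hypothetically) one record per line with the station
    identifier at the start of the line.
    """
    kept = 0
    with open(source_path, "r", encoding="ascii", errors="replace") as src, \
         open(dest_path, "w", encoding="ascii", errors="replace") as dst:
        for line in src:  # the file object hands back one line at a time
            if line.startswith(station_id):
                dst.write(line)
                kept += 1
    return kept
```

A call such as extract_station("BIGFILE.DAT", "STATION.DAT", "71628") would then leave a file small enough for an ordinary editor or stats package, again assuming the record layout really does start with the station identifier.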

You may also find that the records for your favorite station are not complete. For example, for a number of Canadian stations the WWD only provides data up to 1971. Most of these stations are still in operation, and it is a shame not to provide data for the 1970's and 80's. Finally, some of the data sets, such as the Worldwide Airfield Summaries or US Monthly Normals, are great candidates for viewing as maps rather than tables and graphs of individual stations.

All in all, the WWD is a useful compilation of data. If you do not want to work too hard and are interested in the seven data sets for which access software is provided, it is a good intro to CD-ROM data appropriate for Quaternary palynologists. I suspect that if you are interested in one specific data set and do not mind writing your own access software you could get much of this info through the US government for a nominal charge. In fact, I will look at a large data set coupled with access software that NOAA has released on CD-ROM in my next instalment.

Now, where did I leave that Jimi Hendrix CD?


Copyright © 1993 Glen M. MacDonald