INQUA - COMMISSION FOR THE STUDY OF THE HOLOCENE
Working Group on Data-Handling Methods
Newsletter 4, July, 1990

NOTE FROM THE COORDINATOR

Jim Ritchie has asked me to assemble the Newsletter while he is on research leave. Enjoy the year, Jim!

In this issue Jim Ritchie and Eric Grimm deal with some of the new and restructured databases of modern/fossil pollen. Warren Kovach describes a marvelous new version of his very useful multivariate statistical package MVSP for the IBM PC (or clones). Eric Grimm announces his new programs TILIA and TILIA·GRAPH for the IBM PC/clones. Tilia is an umbrella program that allows the user to manipulate pollen data entered from the keyboard or imported in one of a half-dozen different formats used at various pollen laboratories around the world. The data are displayed and edited in a built-in spreadsheet, and the files may be saved and/or subjected to several of Grimm's statistics and zonation routines. The really good news is that Tilia·graph can then produce what I consider "publication-quality" pollen diagrams on laser or dot-matrix printers.

As an anticlimax, perhaps, I finish this issue with a discussion of my programs POLFILE, MAKE_INF, PLOTSITE, and SLOTSEE for IBM/clones that display summary pollen diagrams on the computer screen for reconnaissance work.

In future issues of the newsletter I would like to explore the techniques that are being used to record data for later retrieval. Are you storing your data in formal structures using a relational database program, keeping it in spreadsheets, or storing it as tables in a regular word processor--using its "search function" to locate key words? Let me have your comments about what you have found useful. I will be asking John Birks to keep us informed about new programs and useful items in the literature, but I would also like to hear from anyone who has found data-handling programs or techniques that work, AND ALSO THOSE THAT DO NOT WORK.
Send me your short (or long) comments, and I will try to get them in the next newsletter. This issue is heavily weighted toward palynology; you can provide the material that better treats the other disciplines.

Keith Bennett (Newsletter #3) mentioned using a PSION ORGANISER II computer for recording pollen counts. I bought a model XP without extra memory modules, and used some of Keith's tailor-made programs that store the results on IBM PC disk in my Wisconsin pollen file format. It has substantially changed my life! I need bifocals to record notes, and they interfere with my microscope view. Now I tape a taxon key to the far wall, keep the PSION far enough away to see (but still reach), and counting is [almost] fun again.

I have the mail and e-mail list on disk, and ask you to check both for accuracy. Send any corrections or suggestions to:

Louis J. Maher, Jr.
Department of Geology & Geophysics
University of Wisconsin
1215 W. Dayton Street
Madison, WI 53706 USA
Phone: (608) 262-9595
FAX: (608) 262-0693
BITNET: maher@geology.wisc.edu

MVSP: A MULTIVARIATE STATISTICAL PACKAGE

Warren L. Kovach
Palynological Research Centre
Institute of Earth Studies
University College of Wales
Aberystwyth, Wales SY23 3DB U.K.
Bitnet/Janet: WLK@ABERYSTWYTH.AC.UK

Until fairly recently the use of computers, particularly for statistical analyses, was not for the faint-hearted. In the mainframe way of doing things programs were run in batch mode, with a stack of 'cards' containing numbers in a very strict format that defined the options the user wished to invoke. Interactive use of programs involved typing these 'cards' individually at a prompt or answering a cascade of questions as to which options were desired. In much scientific computing, it was necessary for users to write their own programs. Those who felt comfortable with computers made great advances in the use of various computer-based techniques in their fields.
For many, however, the computer was a very formidable foe.

The explosion of personal computing and the drive to make the programs as easy to use as possible has led to the democratization of computing. Much emphasis is now placed on developing programs that are intuitive and simple to use, so that anyone can easily begin taking advantage of the wide variety of analytical tools that computers provide. Making the programs easy to use also means that less time is required for learning how to use a program and for running particular analyses. This leaves more time for investigating the implications of the results as well as the finer points of the methods.

MVSP (A MultiVariate Statistical Package) is a program I have been distributing for the past four years that performs a variety of ordination and cluster analyses. It was written with the basic premise in mind that multivariate methods can be useful in many areas of biology and geology and that, to promote their use in everyday research, there should be a program that is easily available and easy to use. The program should also be flexible enough so that the user is not locked into one particular form of an analysis but can choose other options that might suit his or her data better.

Based on many of the letters I have received over the past few years, MVSP seems to have succeeded in this goal. Many people have told me that the complexity of the larger mainframe programs has put them off doing numerical analyses, since so much effort is often required to perform simple tasks. Some of the options provided in MVSP that are not found in most other programs, such as uncentered PCA, have also proved very useful in some people's studies.

There is always the danger that by making a numerical program too easy to use many users will take a 'black box' approach to the analyses, feeding numbers in and getting numbers out without understanding what it is all about.
Throughout the manual for MVSP I strongly urge users to sit down and read about the methods before using them, and I provide references to a number of books and papers that I think explain the methods clearly at differing levels of mathematical sophistication.

I have been adding new features to MVSP, slowly but surely, and will soon be releasing a new version. MVSP ver. 2.0 most importantly addresses the main criticism of the earlier version, the limited size of the data matrices. The new version uses a virtual memory scheme in which any data that cannot fit in memory are temporarily dumped to disk, so that matrix size is limited only by available disk space.

The user interface has been improved to allow for even easier running of analyses. When an analysis such as PCA is chosen and an input file selected, a menu with all possible options and their default values is presented. These can easily be changed if necessary and then saved to a configuration file, so that next time you run the program those new default options will be reinstated. In this way an analysis can be run with as few as a half dozen keystrokes. The user may also define a number of defaults relating to the output format, such as column width and the number of decimal places to display on the printouts. There are a number of options for data manipulation, including a spreadsheet-like data editor and transformations by logarithms, square roots, or Aitchison's logratio formula for proportional data. There is a context-sensitive help system, so that pressing F1 will provide a help screen on the currently highlighted option.

Of course the numerical procedures have been enhanced as well. The program performs three eigenanalysis ordinations: principal components analysis (PCA), principal coordinates analysis (PCO), and correspondence analysis (CA).
The trade-off between accuracy and speed may now be controlled by the user, and other options let the user tailor the analyses to their needs (standardization and centering of the PCA, different weighting in the CA, etc.). There are 18 different distance or similarity measures, including Gower's general similarity measure and four binary coefficients. Seven clustering strategies are available, and the option to perform stratigraphically constrained clustering on any of these is provided. Diversity indices may also be calculated on ecological data; these include Simpson's, Shannon's, and Brillouin's indices.

There is now the option to have scattergrams of the ordination results either plotted using text characters or drawn on the screen in graphics mode, with CGA, EGA, VGA, Hercules, and AT&T 6300 graphics modes supported. I am also including a copy of Chris Meacham's excellent PLOTGRAM program. This was developed for drawing cladograms but can be used for plotting dendrograms from MVSP. These can be plotted on the screen, pen plotters, laser printers, or dot matrix printers (the latter thanks to improvements by Joe Felsenstein).

I am still in the process of rewriting the manual and doing further testing of the program, but I hope to begin distributing the program at the end of the summer. As before, MVSP will be distributed as shareware, so that the basic version may be freely copied and given to colleagues and students. There will be an enhanced version available for those who make a voluntary monetary contribution to the programming effort. The enhanced version will differ in three ways: the matrix size will be unlimited (the shareware version will be limited to 100x100 matrices), a special version compiled to take advantage of the 80x87 math coprocessor will be available, and a printed manual will be provided (the shareware version will have a somewhat abbreviated manual on disk).
The level of the contribution hasn't been set yet but will be well below $100, depending on the cost of producing the manual.

I will be sending notification of the release of the new version to all those who have contacted me directly about MVSP. Anyone who wishes to receive notification may send me their name and address and I will place them on my mailing list. Happy computing! [Address at head of article, Ed.]

POLLEN DATA BASES, GENERAL AND PARTICULAR

Jim Ritchie

General. The capacity of modern computers and remote sensing techniques to store, analyse, and display large sets of data on all aspects of the geosphere and biosphere is causing rapid changes in scientific perspective. However, we seem to be culturally and institutionally ill-equipped to deal with these new tools, primarily because their effective use requires collaborative research at unfamiliar levels of interaction. By contrast, our remote colleagues in nuclear physics have long become accustomed to working in consortia. The most serious hazard, in my view, is that the motivation and morale of the individual investigator, not to mention his or her ability to secure research funding, can be undermined by the emergence of large, multidisciplinary enterprises. One apprehension is that the individual investigator's data will be absorbed into a large "bank" to be used in large, often international projects, to the benefit of others. We must find constructive means of both protecting the interests of the individual and taking full advantage of the exciting opportunities provided through the new technologies. One glimmer of hope is that some granting agencies (e.g. ours in Canada) are edging towards both recognition and encouragement of collaborative research, and we should be helping them to put in place acceptable means of both promoting and evaluating such endeavours.
The challenges are these: how do we ensure the maximum involvement and scope for innovation and significant intellectual satisfaction, without creating unmanageable bureaucratic monsters; and how can funding be arranged to bring all participants together in workshops to ensure complete involvement? The COHMAP-I example was notably successful, thanks to the astute guidance of the principals, and a happy mix of personalities.

Specifics. Pollen data bases grow apace. In North America, a northern group (Pat Anderson and Linda Brubaker, Seattle; Pat Bartlein, Eugene; Konrad Gajewski, Quebec; and Jim Ritchie, Toronto) is working with a set of modern pollen and climate data that includes Tom Webb's Eastern North America data, together with a rich array of sites from Alaska, all of Canada except the western montane areas, and, thanks to Bent Fredskild's generosity, from Greenland. Another regional North American group is assembling a similar data base for the southwestern region, and when that base is in place and available, the prospect of an entire North American pollen and climate data bank will be imminent. [See also next article, Ed.]

Pollen-climate networks have been assembled for parts of North Africa: in the eastern sector, centered on Ethiopia, by Raymonde Bonnefille and colleagues at Marseille, and for the central and western region by collaboration between Henry Lamb, Anne-Marie Lézine, Jim Ritchie, and Tom Webb.

Active discussion of aims and types of data base is in train in Europe, following the initiative of Bjorn Berglund and George Jacobson, and a workshop in Lund last year resulted in some tentative proposals. The central issues being discussed are: will the data base be the traditional, archival type, or will it be a relational base that provides flexible access to many combinations of data? And should the first aim be for a continental base, or a number of regional bases that might later be merged?
And what about the questions of proprietary rights to data and "intellectual property"? We need thorough discussion of these difficult issues now.

THE NORTH AMERICAN POLLEN DATABASE

Eric C. Grimm

Increased awareness of the possibility of significant climate change during the next few decades has instilled a sense of urgency in the international scientific, political, and lay communities for research into the causes and potential effects of rapid climate change. Accordingly, the National Oceanic and Atmospheric Administration has established the NOAA Climate and Global Change Program. Under this umbrella, the National Geophysical Data Center (NGDC), a NOAA agency, has established the NOAA Program in Paleoclimatology. A critical aspect of this program is the establishment of paleoclimate databases.

The National Geophysical Data Center has awarded Eric C. Grimm of the Illinois State Museum a grant to establish a North American Pollen Database. Some salient features of the planned database are:

* The database will be freely available for legitimate scientific research projects.

* The database will be based on IBM-PC compatible microcomputers, as they are far more widespread than any other micro-, mini-, or mainframe computer. Facilities will exist to transfer it to other platforms.

* An "Advisory Panel" will be established and will set protocols for data accessibility, data transfer, and dating standards. Establishment of the database and its Advisory Panel will be announced in relevant publications and newsletters, and solicitations for data and collaboration will be sent to palynologists throughout North America. The goal is to make the database the natural depository for pollen data. Attainment of this objective will help ensure the timely incorporation of all data into the database as it becomes available.

* The database will contain Quaternary-age pollen data from North America.
Pollen from other continents will not be excluded, but initially North America will be emphasized. In addition to the basic pollen counts, the database will include site information (latitude, longitude, elevation, site type, sediment type, etc.), pollen analyst, and radiocarbon dates and other age data (e.g. varves, volcanic tephras). Chronologies will be assigned based on the best dating information available.

* The database will be relational, enabling easy extraction of data for selected taxa for particular geographic areas and time periods. Commercial relational database software that runs across a variety of operating system and processor platforms will be used, probably INGRES from Relational Technology. INGRES programs can be compiled and distributed in .exe format, thereby facilitating distribution of the database and software to query it. For maximum portability, the database will also be available in the form of simple ASCII text files.

* User-friendly spreadsheet and graphics software will be distributed for entry, analysis, and display of pollen data. The pollen spreadsheet program Tilia and companion graphics program Tilia·graph are currently in beta testing and will be ready for distribution by commencement of the project. These programs will be highly useful to palynologists and will encourage standard formatting of data, facilitating entry into the database. Charges for the programs will be minimal, covering licensing fees for the graphics software.

* The North American Pollen Database is envisioned as part of a global pollen database. As a result of a workshop held last summer in Sweden, a parallel European Pollen Database is being established at Marseille, France, under the directorship of Prof. Armand Pons, who has successfully obtained funding. Efforts will be made to develop compatibility between the North American and European pollen databases.
* A site register will be developed similar to that used for archeological sites in North America. Such a register will provide information about sites existing for given geographic regions and time spans and will provide data about palynological work carried out at any given site. Palynologists will be encouraged to register sites early in their investigations. Thus, the site register will provide information about studies in progress and unpublished sites. Such information can facilitate fruitful collaboration and in some cases prevent duplication.

The objective of year 1 is to incorporate the COHMAP pollen database, now residing at Brown University, into the North American Pollen Database.

The North American Pollen Database will be an invaluable resource not only for paleoclimatologists, but also for palynologists and other Quaternary scientists. It will be a valuable research tool as well as an archive ensuring the preservation of pollen data.

TILIA AND TILIA·GRAPH: PC SPREADSHEET AND GRAPHICS SOFTWARE FOR POLLEN DATA

Eric C. Grimm

Tilia is a spreadsheet program designed for stratigraphic data, especially pollen data. It runs on an IBM PC, XT, AT, PS/2 or compatible. The program is user-friendly and menu-driven. Data are entered in a spreadsheet that displays the matrix of variables (pollen types or other stratigraphic variables) and samples (stratigraphic levels or other samples). The entire matrix can be accessed with the cursor keys, and the value of any cell can be entered or changed. Rows and columns can be moved, sorted, added, and deleted. Tilia stores data in an efficient binary format (the .til file), but can read and write a variety of ASCII (standard text) files.

The program calculates sums and percentages, concentrations, and accumulation rates with simple menu selections. Up to 26 sums and subsums can be calculated. Percentages can be based on any sum.
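
As an illustration of what "percentages based on any sum" means, the following is a minimal Python sketch, not Tilia's own code. The function name `percentages`, the taxon names, and the counts are all hypothetical; the only idea taken from the text is that the user chooses which taxa make up the sum and every taxon is then expressed relative to it.

```python
def percentages(counts, sum_taxa):
    """Express every count as a percentage of a user-chosen pollen sum.

    counts   -- dict mapping taxon name to raw count for one sample
    sum_taxa -- names of the taxa that make up the chosen sum
    """
    base = sum(counts[t] for t in sum_taxa)
    if base == 0:
        raise ValueError("pollen sum is zero; percentages are undefined")
    return {taxon: 100.0 * n / base for taxon, n in counts.items()}


# One hypothetical sample (illustrative numbers, not real data).
sample = {"Picea": 120, "Pinus": 60, "Quercus": 20, "Cyperaceae": 50}

# A sum that leaves Cyperaceae out; it is still reported relative to that sum.
upland = ["Picea", "Pinus", "Quercus"]
pct = percentages(sample, upland)
```

Because Cyperaceae is outside the chosen sum, the percentages of all four taxa add up to more than 100, which is the usual behaviour when a taxon is expressed against a sum that excludes it.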
From a series of radiocarbon or other dates, an age for each level can be calculated with linear interpolation, cubic spline interpolation, or by fitting a polynomial.

The program also carries out cluster analysis and ordination. Cluster analysis is constrained incremental sums of squares (CONISS), known also as minimum variance, error sum of squares, and Ward's method. A number of data transformations are possible, including a square-root transformation, which results in the chord-distance dissimilarity coefficient. The analysis can be either stratigraphically constrained, for quantitative zonation, or unconstrained, which is appropriate for surface samples. Ordination procedures are correspondence analysis or detrended correspondence analysis (DCA). In addition to pollen data, this program will be useful for other ecological data, particularly in the PC environment. The program allocates memory at run time, and therefore is more flexible than the FORTRAN versions of DCA.

Tilia·graph is a companion program for producing pollen diagrams. The program can be run independently or accessed from Tilia, allowing smooth transition from spreadsheet to graphics. Tilia·graph is also user-friendly and menu-driven. Optional features include silhouette or histogram style graphs, exaggerated curves, plotting against age or depth, secondary age or depth axes, plotting positions of radiocarbon dates, zones, zone labels, ecological groups, and a CONISS dendrogram.

Tilia·graph uses the Graphical Kernel System (GKS), a device-independent, ANSI-standard, 2-D graphics package. Device independence implies that the graphics image is independent of the output device (display, printer, or plotter). A computer program called a "device driver" translates the GKS graphics image into a format appropriate for any particular output device. Tilia·graph uses a GKS graphics library and device drivers licensed from Graphic Software Systems (GSS), Inc., Beaverton, Oregon, USA.
Drivers are available for most displays (Hercules monochrome and color, CGA, EGA, VGA, Super VGA, 8514/A, and others) and most printers and plotters, including dot-matrix printers, HP LaserJet, HP plotters, PostScript printers, and more. The diagram can be viewed on the display and changed before being sent to the plotter or printer. The graphics output for most hardcopy devices can be optionally written to file, which can then be sent to the printer/plotter with the DOS copy command. For example, the output for an HP LaserJet can be written to a file, and then that file can be transported via floppy disk or network to a computer that is connected to a LaserJet, where it is output.

Minimum computer requirements for Tilia are DOS 2.0 or greater and 2 floppy drives. Greater functionality is possible with DOS 3.x and a hard drive. Tilia·graph requires DOS 3.0 or greater, 640 KB RAM, a hard drive, and a graphics card.

[Wolsfeld Lake diagram (original p. 6) not included.]

Both Tilia and Tilia·graph are available in "Beta" versions, which have most of the planned functionality. Fully functional versions should be available by the end of 1990. Tilia is available for $5 to cover mailing and floppy disks. The cost for Tilia·graph (including Tilia) is $200, which includes two GSS device drivers. Additional device drivers are available for $25 each. The GSS device drivers are licensed for a single computer. Users will require at least two device drivers, one for the display and one for a printer or plotter.

Programs are available from Eric C. Grimm, Illinois State Museum, Research and Collections Center, 1920 South 10½ Street, Springfield, IL, 62703, USA. [grimm@denr1.igis.uiuc.edu]

PROGRAMS USEFUL IN THE POLLEN LAB

Louis J. Maher, Jr.

Several years ago I was awarded a University/IBM grant to develop microcomputer image-analysis software for geology students.
I sought programs that could manipulate multi-band digital data like those used in Landsat images. Processing and translating data formats were not a problem, but producing useful images on the EGA and VGA screen proved more difficult. It required either a graphics board capable of storing more information per pixel or more memory to keep the data elsewhere. I took delivery of the new PS/2 computers just when the price for memory in the U.S. was artificially high and before special-purpose image-analysis boards were available. While waiting for memory prices to drop, I redirected my software development to use the computational and imaging ability of the computer for examining the visual patterns in palynology data.

My palynology students first learn about pollen and how to identify it with a microscope. The class collects a core and jointly processes and identifies its pollen. Students working with these data traditionally spent hours of tedious labor merely converting their counts to a pollen diagram, and then tried to interpret its meaning. It is here that the computer changes the course. Previously students looked at diagrams from pollen journals and tried individually to tie their core into the regional picture. Now the student can see the names and locations of 100 midwestern pollen sites on a computer-generated index map. After selecting sites thought to be usefully situated, the student creates data files from the master files of raw data, plots the results on the screen, and carries out the necessary statistical correlations to relate the new site to its neighbors.

The students are not simply using the computer to look at canned data; rather they now have a tool to interact with the data without being overwhelmed and lost in the morass of numbers and the tedium once needed to build up a regional perspective of the changes in vegetation that occurred since the last glaciation.
The students actually are able to see details in some sites that were not available to the original authors.

The major difference I note is the change in the recent students' perspectives. Rather than putting all the emphasis on the individual class project, they now want to see how that site compares with others in the Midwest. The computer allows a quick visual reconnaissance as they develop and test hypotheses about the causes of the similarities and differences. The students also quickly pick up the relative strengths and weaknesses of pollen evidence. They can recreate in the classroom a sense of the excitement felt by the principals of Tom Webb's highly successful COHMAP Project--while still finding areas that need more work.

Introducing the computer to my pollen lab required the concomitant development of an infrastructure of auxiliary programs. My POLFILE program allows the user 1) to store detailed and independent raw data from any number of sites, and 2) to select specific taxa or groups of taxa from the raw files to make comparable data files for the sites. Following on some of John Birks' early work, I make a distinction between the original master RAW file--say BLUELAKE.RAW, which is an inviolate, ASCII, read-only file containing the original counts of pollen, spores, exotic markers, sample volumes, weights (and to which I use a word-processor to append field notes, processing details, carbon dates, etc.)--and derived DATA files--say BLUELAKE.DAT, which are assembled to contain just the sub-sample of taxa that are of immediate interest.

Both file types have a title as the first line; programs using the files read the whole line with its punctuation into a string variable. The second line contains the information on the size of the data array so that the computer can frugally allocate memory space when the file is read. I generally include 15 to 20 taxa in my DATa files, but any number can be used.
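
The title line and size line described above suggest a simple reading routine. The Python sketch below is only an illustration of that idea and is not the actual Wisconsin file specification: it assumes (hypothetically) that the second line holds two whitespace-separated integers, the number of samples and the number of taxa, and that each following line holds one sample's values. The function name `read_dat` and the demonstration text are likewise made up.

```python
import io


def read_dat(stream):
    """Read a title line, a size line, and a samples-by-taxa array.

    Assumed layout (hypothetical, for illustration only):
      line 1: free-text title, kept whole including punctuation
      line 2: two integers, number of samples and number of taxa
      then  : one line per sample with that many numeric values
    """
    title = stream.readline().rstrip("\n")
    n_samples, n_taxa = (int(x) for x in stream.readline().split())
    rows = []
    for i in range(n_samples):
        values = [float(x) for x in stream.readline().split()]
        if len(values) != n_taxa:
            raise ValueError(f"sample {i + 1}: expected {n_taxa} values")
        rows.append(values)
    return title, rows


# In-memory stand-in for a small file; the contents are invented.
demo = io.StringIO("Blue Lake, hypothetical example\n2 3\n10 20 30\n5 15 25\n")
title, rows = read_dat(demo)
```

Knowing the array size before reading the data is what lets a program reserve exactly the memory it needs, and it also lets the reader reject a file whose rows do not match the declared dimensions.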
After a DATa file's data array, I include information that can be used to enumerate the individual samples in the core, such as "Depth in Cm" or "C14 Age." In some statistical work only the order of the samples is needed rather than the actual spacing. In these cases, the ordinal data are simply ignored; if that information is needed, the using program "knows" the list will appear after the pollen data array.

My program PLOTSITE uses the DATa file to plot a generalized pollen diagram on the computer screen. PLOTSITE needs to know what taxa to display and in what order. (The taxa not displayed are summed and shown as the last plotted column called "Other(x)" taxa; I insist the students display the "Other" category because it gives them warning when significant changes are occurring in taxa they thought less important.) PLOTSITE gets its screen-control INFormation from PLOT.INF, an ASCII file it expects to find in the default directory. This transparent information file allows the user to do creative palynology without programming.

Knowing the number and order of the taxa in the DATa file, the student uses the MAKE_INF program to produce PLOT.INF. The MAKE_INF utility asks the user for information it needs, such as the taxa to plot, which to sum together in the "Other(x)" category, and the maximum percentage value any taxon needs. MAKE_INF takes some effort, but it is only done once for most projects, and it allows the user to make incremental adjustments in an initial less-than-perfect PLOT.INF file.

As an example, assume the student wanted to study the postglacial changes in pollen sites ranging from the Great Plains of South Dakota, southeasterly to Indiana. Fig. 1 shows the PLOTSITE-produced summary diagrams from six well known sites in the Midwest. The taxa are all plotted to the same scale, and in the same order. All start with similar Picea zones and most end with the Ambrosia rise associated with European agriculture.
The DATa files contained 17 taxa; the "Other(9)" column includes Abies, Acer, Alnus, Carya, Juglans, Ostrya, Artemisia, Chenopodiaceae, and Cyperaceae. I leave it to the reader to characterize the pollen differences (and similarities) between Holocene sites along the transect from prairie to deciduous forest. The depth scale is shown by ticks separated by 50 cm; thus the plots show sediment depth in meters and half meters. (If sample age were available for each site, list it in decades, and PLOTSITE's wide ticks will represent 1000-year increments.)

To make such a plot, the user has to adjust the PLOT.INF file through MAKE_INF with a little trial and error to reserve space for the maximum value a taxon assumes at any site; the Pinus column, for example, needs to be wide to accommodate its early Holocene maximum at Clear Lake. Still, the basic diagram is the result of but a few minutes' work. The student sees the original plots in color, but these can be changed to shades of gray for printing. The default vertical scale adjusts the plot to the monitor screen, but it can be changed to suit special needs; the sites' aspect ratios in Fig. 1 were halved to use less space.

PLOTSITE plots only to the computer screen. When hard copy is needed I use--and recommend--the screen-grabber and editing program "Inset" (Inset Systems, Inc. 71 Commerce Drive, Brookfield, CT 06804 USA; Sales: 800-828-8088; FAX 203-775-5634; cost about US$100; ask about their educational discount. See also John Birks' Newsletter 3 software reference to PIZAZZ, Application Techniques, Inc. 10 Lomar Park Drive, Pepperell, Massachusetts 01463, USA). Several of the common word-processors have screen-grabbing utilities (for example, WordPerfect's GRAB.COM) that save a screen image to a proprietary file which can then be printed. And all the foregoing include printer drivers to match almost any graphics printer you are likely to own.
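
The "Other(x)" column that PLOTSITE draws, described earlier in this article, amounts to a small bookkeeping step: keep the chosen taxa in the chosen order and sum whatever is left. The Python sketch below shows that step under stated assumptions; it is not PLOTSITE's code, and the function name `collapse_other`, the taxon list, and the percentages are hypothetical.

```python
def collapse_other(taxa, row, shown):
    """Keep the taxa in `shown` (in that order) and sum the rest.

    taxa  -- taxon names in file order
    row   -- one sample's values, in the same order as `taxa`
    shown -- taxa to plot individually, in plotting order
    Returns column labels and values, with "Other(n)" as the last column.
    """
    by_name = dict(zip(taxa, row))
    kept = [by_name[t] for t in shown]
    rest = [t for t in taxa if t not in shown]
    other = sum(by_name[t] for t in rest)
    labels = list(shown) + [f"Other({len(rest)})"]
    return labels, kept + [other]


# Invented percentages for one sample, for illustration only.
taxa = ["Picea", "Pinus", "Quercus", "Abies", "Acer"]
row = [40.0, 30.0, 20.0, 6.0, 4.0]
labels, values = collapse_other(taxa, row, ["Picea", "Quercus"])
```

Carrying the count of summed taxa in the label, as in "Other(9)" above, reminds the viewer how many taxa are hidden in that column, and a large "Other" value is the warning sign that one of them deserves a column of its own.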
When pollen diagrams are available from two or more sites it is natural to wonder how they correspond--how they "correlate" or "slot" together--in time. This can be complicated; if the data were obtained by different individuals, the diagrams are almost always drawn to different scales, and the taxa are never in the same order.

In the early 1970's Allan Gordon devised SLOTSEQ, a Fortran mainframe program that takes two pollen SEQuences and SLOTs them together. (See reference to Birks, 1979.) The program converts the raw counts to proportions and calculates the degree to which the pollen spectrum of each sample in one site differs from that of each sample in the other site. A dynamic-programming algorithm sorts through the resultant matrix of dissimilarity coefficients, mapping a route through the matrix that yields minimum total dissimilarity. The mainframe program is text based; it produces pages of numbers that must be plotted to see whether the slotting worked--or even makes any sense. I wanted to adapt SLOTSEQ to a microcomputer and at the same time build in graphics that would help the user evaluate the suggested correlation.

SLOTSEE, a play on the name of the original slotting program, was produced to work with POLFILE, MAKE_INF, and PLOTSITE. (The information file made by MAKE_INF has the same purpose and format for either PLOTSITE or SLOTSEE and differs only in name; if you have a PLOT.INF file that works, simply copy it to a file called SLOTSEE.INF or vice versa. In fact, one should copy a successful *.INF file to one with a distinct name, such as FILE0012.INF; then that file can be copied over to PLOT.INF or SLOTSEE.INF whenever it is needed.) The student selects one of several possible dissimilarity coefficients and loads the DATa files from the two sites of interest. When the statistics are done, a menu provides a series of choices. The user can view pollen diagrams of the two sites, showing the taxa drawn to the same scale and placed vertically on the screen.
(These pseudo pollen diagrams simply list the samples in order from top to bottom rather than by actual depth or age as does PLOTSITE.) Or the user can show the two sites' color-coded data actually combined into one composite diagram ordered according to the algorithm's solution. One of the most useful graphic displays is a color-coded "map" of the dissimilarity matrix. The brain recognizes patterns in such a display that are missed completely when the coefficients are presented in tables. The user can also obtain the standard printout offered by the original mainframe version. SLOTSEE may not always produce useful correlations, but it is rare that the user does not learn something new about the sites.

I have found that SLOTSEE's matrix map is also useful for comparing a site with itself. Fig. 2 is a series of autocorrelation matrix maps of my core from Wisconsin's Devils Lake. The core tops are at the upper left in each map, and the ticks occur at every tenth sample. The individual plots show chord DC values in the matrix that are less than or equal to the number under the map. The square patterns show depths in the core where the pollen spectra are similar; that is, they show pollen zones. These squares merge with others as the dissimilarity is increased. The matrix map displays the information often implied with a dendrogram. However, I never trust a dendrogram unless I can see the same pattern in the matrix map. Then I look at the pattern when a different DC is used. But that would be a long story....

I would be pleased to supply free the latest compiled versions of any or all of these programs if the requester provides me with empty formatted disks. (Or send me the programs you find useful, and I will return the disks with mine.) All fit on one 3.5-inch or 5.25-inch HD disk; two double-density disks are required. You should also tell me whether your IBM/clone has a monochrome or color monitor and what graphics capability your computer has; the choices are EGA, VGA, or Hercules.
With the exception of SLOTSEE, the programs are also available in low-resolution CGA, although you should avoid this if you can. Because these programs do a lot of calculations they work fastest if you have a math co-processor. They slow down without one, but still beat doing it by hand!

I believe many labs will adopt Eric Grimm's Tilia program and keep pollen files in that format. Eric has generously added a "Wisconsin" format to the file types Tilia will read and write. The "Wisconsin" format is my RAW file. If you have a Tilia file, save it in "Wisconsin" format with an extension ".RAW". You can then read that exported file with POLFILE and make any needed ".DAT" file for PLOTSITE or SLOTSEE.

References

Birks, H.J.B. 1979. Numerical methods for the zonation and correlation of biostratigraphical data. pp. 99-123 + Appendix 2 (15 pp.; the SLOTSEQ.FOR listing appears on pp. 13-15 of Appendix 2). In Bjorn E. Berglund, Ed. Vol I. General Project Descriptions. Subproject B: Lake and Mire Environments. Project 158: Palaeohydrological Changes in the Temperate Zone in the Last 15,000 Years. International Geological Correlation Programme. Lund, Sweden. 143 pp + 2 Appendices.

(SLOTSEE.EXE is based on the FORTRAN IV program 'SLOTSEQ' by A.D. Gordon. See for example: Gordon, A.D. 1973. A sequence-comparison statistic and algorithm. Biometrika 60, 197-200; Gordon, A.D. 1980. SLOTSEQ: a FORTRAN IV program for comparing two sequences of observations. Computers and Geosciences 6, 7-20; Delcoigne, A. and Hansen, P. 1975. Sequence comparison by dynamic programming. Biometrika 62, 661-664. The version of SLOTSEQ in Gordon (1980) differs somewhat from the version listed in Birks (1979), which was used in SLOTSEE.EXE.)
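For readers who want to experiment with the slotting idea described in the SLOTSEE discussion above, the steps (raw counts to proportions, a matrix of dissimilarity coefficients, and a minimum-dissimilarity route through that matrix) can be sketched in Python. This is only a simplified illustration under stated assumptions, not a port of SLOTSEQ or SLOTSEE: the function names are hypothetical, the chord distance is written in one common form, the route is restricted to down, right, and diagonal steps, and the counts are made up.

```python
import math


def proportions(counts):
    """Convert one sample's raw counts to proportions of the pollen sum."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no counts")
    return [c / total for c in counts]


def chord_distance(p, q):
    """Chord distance between two spectra of proportions (assumed form)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def dissimilarity_matrix(site_a, site_b):
    """Rows are samples of site A, columns are samples of site B."""
    props_a = [proportions(s) for s in site_a]
    props_b = [proportions(s) for s in site_b]
    return [[chord_distance(p, q) for q in props_b] for p in props_a]


def min_cost_route(matrix):
    """Dynamic programming: cheapest route from top-left to bottom-right.

    Each step moves down, right, or diagonally, so sample order is kept
    within both sites. Returns (total dissimilarity, list of (row, col)).
    """
    n, m = len(matrix), len(matrix[0])
    cost = [[math.inf] * m for _ in range(n)]
    step = [[None] * m for _ in range(n)]
    cost[0][0] = matrix[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = best + matrix[i][j]
            step[i][j] = prev
    # Walk the stored steps back from the bottom-right corner.
    route = [(n - 1, m - 1)]
    while step[route[-1][0]][route[-1][1]] is not None:
        i, j = route[-1]
        route.append(step[i][j])
    route.reverse()
    return cost[n - 1][m - 1], route


# Made-up counts for two taxa; top of each core first.
site_a = [[10, 0], [5, 5], [0, 10]]
site_b = [[10, 0], [0, 10]]
matrix = dissimilarity_matrix(site_a, site_b)
total, route = min_cost_route(matrix)
```

Passing the same site as both arguments to `dissimilarity_matrix` gives the kind of self-comparison matrix that the matrix maps display; thresholding its values at successive levels is left to the reader.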