INQUA - COMMISSION FOR THE STUDY OF THE HOLOCENE
Working Group on Data-Handling Methods
Newsletter 5, January 1991

NOTE FROM THE COORDINATOR

As I was assembling the materials for this newsletter, the Bulletin of the Ecological Society of America arrived with the announcement that the Society's 1990 recipient of the William S. Cooper Award is Dr. James C. Ritchie for his 1987 book Postglacial Vegetation of Canada (Cambridge University Press). Congratulations, Jim.

In the same mail was issue No. 4, 1990, of the Newsletter - Committee on Quantitative Stratigraphy (CQS), an organ of the International Commission on Stratigraphy (ICS). It was an interesting issue which, among other things, announced that version 2.0 of Warren L. Kovach's Multivariate Statistical Package (MVSP) is now available. Warren described the pre-release package in our last issue. I noted a number of familiar names on the CQS mailing list. If you are not on it, you may wish to contact the CQS Chairman, Dr. Felix M. Gradstein, Atlantic Geoscience Centre, Bedford Institute of Oceanography, Dartmouth, N.S., Canada B2Y 4A2.

For this issue I have asked Dr. Magdalena Ralska-Jasiewiczowa and Dr. Adam Walanus to describe POLPAL, the Polish Palynology Database that they are developing. Dr. Craig A. Chumbley has agreed to describe his very handy program PALYPLOT, which produces a publication-quality pollen diagram using one of the less expensive CAD programs. Dr. H.J.B. Birks has provided another of his valuable summaries of the literature. And I am including some material I have found useful.

I ask the readers of this Newsletter to send me information on any data-handling techniques you have used that could be helpful to others. I have the mail and e-mail list on disk, and ask you to check both for accuracy. Send any corrections or suggestions to:

    Louis J. Maher, Jr.
    Department of Geology & Geophysics
    University of Wisconsin
    1215 W. Dayton Street
    Madison, WI 53706 USA
    Phone: (608) 262-9595
    FAX: (608) 262-0693
    BITNET: maher@geology.wisc.edu

POLISH PALYNOLOGICAL DATABASE (POLPAL) IN COURSE OF BUILDING

Magdalena Ralska-Jasiewiczowa, Institute of Botany, Kraków
and Adam Walanus, Institute of Physics, Gliwice

In spite of the extremely difficult economic situation in Polish science, the majority of Polish palynologists have access to IBM-PC-compatible computers. So we started to write programs with the main goal of moving tables of pollen data from sheets of paper to magnetic media. To encourage the cooperation of pollen analysts, we decided that the database should be built from the bottom up. The software should offer the user immediate advantages in handling pollen data, not just the promise of benefits that might come sometime in the future through participation in a European Database.

At the moment the Polish Palynological Database (POLPAL) software contains programs for the input of data, for numerical and graphical data reviewing and printing, and, additionally, some numerical methods for the analysis of a single profile. There is also one short relational-database type of program that allows searching for a given taxon in multiple profiles.

Input of data may be performed at three levels. The first is the direct entry of counts into the computer while working at the microscope. The second is input of the sample protocols after the slide has been counted, and the third is input of tables from sites already completed. In the input of tables (with a capacity of up to 300 taxa and up to 300 samples) two simplifications are available, one for low-abundance taxa with grain numbers in the single digits, and the second for taxa with long runs of zeros. At every stage errors may be corrected.

The program for graphical data presentation shows up to 50 taxa, or sums of taxa, either as percentage or concentration diagrams.
The sequence of taxa may be rearranged, and the vertical and horizontal scales may be chosen arbitrarily. The numerical analysis program contains principal components analysis, correspondence analysis, and constrained single-link zonation.

The taxonomic list now used in the database contains about 600 taxa; the list can be corrected and supplemented. It seems necessary to propose a standard list of taxon names to palynologists.

The special statistical features of pollen tables are exploited not only in data entry but also in data storage. As a result, on average, only half a byte is needed to store one pollen count. In other words, one normal-density diskette may contain up to 100 large (200 taxa, 100 samples) pollen tables (Walanus, 1989).

The POLPAL program is in operation at three Polish laboratories: the Institute of Botany of the Polish Academy of Sciences, Kraków, and at Gdańsk and Lublin Universities. There are about 30 pollen tables stored in the database. The POLPAL program is free to anyone interested in it. The author of the computer code (A.W.) would be happy to cooperate with prospective users in improving the program.

Reference

Walanus, A. 1989. Saving computer memory in storing tables of pollen counts. Pollen et Spores, v. XXXI/1-2, p. 161-164.

    Adam Walanus
    Institute of Physics
    Silesian Technical University
    Krzywoustego 2
    PL-44-100 Gliwice, POLAND

(In a letter to the newsletter coordinator concerning POLPAL and how interested parties might obtain copies, Dr. Adam Walanus wrote "...my university is not yet connected to the 'BITNET.' I believe it will be in the near future. If some BITNET user would be interested in POLPAL please mail it, if it is neither expensive nor troublesome." With that in mind, if persons interested in obtaining a copy of POLPAL can reach maher@geology.wisc.edu by e-mail, I will send a copy on a single high-density floppy and notify Adam Walanus that I have done so.
But I urge you to deal with him directly; programmers benefit from direct contact with those using their programs.)

PALYPLOT: A PC-BASED PROGRAM FOR PLOTTING POLLEN AND PLANT MACROFOSSIL STRATIGRAPHIC DATA

    Craig A. Chumbley
    New York State Museum Biological Survey
    Albany, NY 12230
    craigc@mts.rpi.edu

PALYPLOT is a program that provides a quick, simple method for producing publication-quality pollen and plant macrofossil diagrams with an IBM or compatible desktop computer. The program is menu-driven, simple to use, and allows the user considerable flexibility in the types of drawings produced.

PALYPLOT options include percentage, concentration, and pollen accumulation rate diagrams; exaggerated curves; columns showing the positions of sediment types, pollen zones, and radiocarbon dates; and the ability to plot dated sequences in terms of either absolute sample depth or estimated sample age (Fig. 1). PALYPLOT calculates estimated ages by linear interpolation between dated points in the sequence. Diagrams can be plotted using any of several silhouette and histogram variants. Some options can be specified interactively from within the program, whereas others must be specified in a separate control file.

[Indian Creek Nature Center pollen diagram not reproduced.]

Stratigraphic pollen data are furnished to PALYPLOT using Wisconsin .DAT files (Maher, 1990, p. 7). Files in this format are easily created using Maher's program POLFILE. Another way of creating .DAT files is to use the program TILIA (Grimm, 1990, p. 5) for data entry and then export the data in Wisconsin .RAW format. POLFILE can then create a .DAT file from the .RAW file. Dates, zones, and sediment types are read from an ASCII control file.

PALYPLOT writes computer-aided-drafting (CAD) instructions to an ASCII text file that can be imported into Generic CADD Level 3 or Generic CADD 5.0.
The advantages of this approach are that 1) the diagram can be previewed on a graphics monitor and, if necessary, modified manually before producing a hard copy, and 2) the CAD program contains drivers for many displays, plotters, and printers, including the HP LaserJet and PostScript printers. CADD 5.0 allows hardcopy graphics device output to be written to a file.

PALYPLOT can be run on any IBM PC/XT, AT, or PS/2 or compatible running DOS 2.1 or later. The CAD program requires a hard disk drive and a graphics adapter; 640K of RAM is recommended (more for large drawings). Generic CADD 5.0 is currently available from mail-order vendors for about $225-$250. PALYPLOT will be available on February 1, 1991 for $5 to cover the cost of disks and mailing.

References

Grimm, E. C. 1990. TILIA and TILIA·GRAPH: PC spreadsheet and graphics software for pollen data. INQUA - Commission for the Study of the Holocene, Working Group on Data-Handling Methods Newsletter 4: 5-7.

Maher, L. J., Jr. 1990. Programs useful in the pollen lab. INQUA - Commission for the Study of the Holocene, Working Group on Data-Handling Methods Newsletter 4: 7-10.

RE-CONFIGURING YOUR COMPUTER MADE EASY

Louis J. Maher, Jr.

Palynologists have an embarrassment of riches: two programs capable of producing publication-quality pollen diagrams. While trying them out I encountered a problem. Eric Grimm's TILIA·GRAPH should be run without any "terminate-and-stay-resident" programs like SIDEKICK or PCTOOLS, and its drivers are loaded in the AUTOEXEC.BAT file. Craig Chumbley's PALYPLOT makes use of Generic CADD Level 3 or 5.0, which should be run with a minimum of other programs loaded, to free as much memory as possible for building and editing the resultant drawing (.DWG) file. I started out with separate "boot" disks to supplant my computer's AUTOEXEC.BAT and CONFIG.SYS files, which are stored in the root directory of its hard disk.
I found a better solution to this problem in a local computer newsletter; the sources are listed below. The scheme involves having several batch files, stored along with your DOS files, which act as switches to load the desired AUTOEXEC and CONFIG files and then cause the computer to do a "warm boot."

Assume you wish to have the choice of three systems: one for running TILIA, a second for running PALYPLOT, and a third for normal operations. Make three ".BAT" files modelled on the following, but changed to fit your specific drive and system. MINSYS.BAT, TILSYS.BAT, and REGSYS.BAT should be saved as ASCII text and put where your DOS files are stored. The term "WARMBOOT" refers to a short file that reboots the computer; it is described later.

    REM: MINSYS.BAT MINIMUM SYSTEM
    ECHO OFF
    C:
    CD C:\
    COPY \DOS\AUTOEXEC.MIN \AUTOEXEC.BAT
    COPY \DOS\CONFIG.MIN \CONFIG.SYS
    WARMBOOT

    REM: TILSYS.BAT TILIA SYSTEM
    ECHO OFF
    C:
    CD \
    COPY \DOS\AUTOEXEC.TIL \AUTOEXEC.BAT
    COPY \DOS\CONFIG.TIL \CONFIG.SYS
    WARMBOOT

    REM: REGSYS.BAT REGULAR SYSTEM
    ECHO OFF
    C:
    CD C:\
    COPY \DOS\AUTOEXEC.REG \AUTOEXEC.BAT
    COPY \DOS\CONFIG.REG \CONFIG.SYS
    WARMBOOT

The following are possible AUTOEXEC.MIN and CONFIG.MIN files that should be stored as ASCII text with your DOS files; modify them to fit your system. Do NOT type the title lines in square brackets.
    [AUTOEXEC.MIN]
    ECHO OFF
    REM: AUTOEXEC.MIN FREES MEMORY
    PATH C:\;C:\DOS;C:\LEVEL3;
    KEYB US 437 C:\DOS\KEYBOARD.SYS
    MOUSE
    VER
    PROMPT $P$G

    [CONFIG.MIN]
    COUNTRY=001,437 C:\DOS\COUNTRY.SYS
    FILES=20
    BUFFERS=20
    SHELL C:\COMMAND.COM /E:512 /P/F

The following AUTOEXEC.TIL and CONFIG.TIL files should also be modified to fit your system:

    [AUTOEXEC.TIL]
    ECHO OFF
    REM: AUTOEXEC.TIL FOR TILIA
    PATH C:\;C:\DOS;C:\BIN;C:\TILIA;
    KEYB US 437 C:\DOS\KEYBOARD.SYS
    MOUSE
    VER
    PROMPT $P$G
    SET CGIPATH=C:\TILIA\DRIVERS
    SET KERNEL=C:\TILIA
    SET CURSORMODE=TRUE
    C:\TILIA\DRIVERS

    [CONFIG.TIL]
    COUNTRY=001,437 C:\DOS\COUNTRY.SYS
    FILES=25
    BUFFERS=20
    DEVICE=C:\DOS\VDISK.SYS 350 512 128/E
    SHELL C:\COMMAND.COM /E:512 /P/F

Then COPY your usual AUTOEXEC.BAT file to AUTOEXEC.REG and your usual CONFIG.SYS file to CONFIG.REG, and put these .REG files with your DOS files.

The file that reboots your computer can be made from the following ASCII script. Type the following eleven lines very carefully (NOTE: line 6 should be blank), and save them as an ASCII text file in your default directory with the name WARMBOOT.TXT.

    a 100
    mov ax,40
    mov ds,ax
    mov word ptr [72], 1234
    jmp ffff:0
                                [* leave line blank]
    r cx
    10
    n warmboot.com
    w
    q

After WARMBOOT.TXT is made and saved, type the following line from DOS:

    DEBUG < WARMBOOT.TXT

and press "Enter." DEBUG.COM (one of your DOS files) then automatically reads the WARMBOOT.TXT file and makes a 16-byte file called WARMBOOT.COM. WARMBOOT.COM should be copied to the directory with your DOS files.

When you want to use TILIA, type TILSYS at the DOS prompt and press "Enter." Your computer will exchange its AUTOEXEC.BAT and CONFIG.SYS files for those suitable for TILIA, and it will reboot itself, coming up with the new configuration. The switch will be permanent (even after the computer is shut down) until you type REGSYS or MINSYS to change to another configuration.

References

Manninger, A. 1990.
The Quick-Switch Trick (reprinted from Vancouver PC Users Society Newsletter, April 1989). Bits & PC's, Madison PC User's Group, v. 9, n. 10 (October), p. 13.

Janda, M. 1990. Boot Program. Bits & PC's, Madison PC User's Group, v. 9, n. 10 (October), p. 13.

NEW BOOKSHELF 2

H. J. B. Birks

The following recently published books may be of interest to readers of this Newsletter.

Aitkin, M., Anderson, D., Francis, B., and Hinde, J. 1989. Statistical modelling in GLIM. Clarendon Press, Oxford, 374 pp. (paperback).

Anderson, A. J. B. 1989. Interpreting data. Chapman and Hall, London, 223 pp. (paperback).

Battarbee, R. W., Mason, J., Renberg, I., and Talling, J. F. (eds.) 1990. Palaeolimnology and lake acidification. Royal Society, London, 219 pp.

Chatterjee, S. and Hadi, A. S. 1988. Sensitivity analysis in linear regression. Wiley, New York, 315 pp.

Crowder, M. J. and Hand, D. J. 1990. Analysis of repeated measures. Chapman and Hall, London, 257 pp.

Diggle, P. J. 1990. Time series. A biostatistical introduction. Clarendon Press, Oxford, 257 pp. (paperback).

Dobson, A. J. 1990. An introduction to generalized linear models. Chapman and Hall, London, 174 pp. (paperback).

Dunteman, G. H. 1989. Principal components analysis. Sage Publications, Newbury Park, 96 pp. (paperback).

Fox, J. and Long, J. S. (eds.) 1990. Modern methods of data analysis. Sage Publications, Newbury Park, 446 pp.

Gifi, A. 1990. Nonlinear multivariate analysis. Wiley, Chichester, 579 pp.

Griffith, D. A. 1987. Spatial autocorrelation: A primer. Resource Publications in Geography, Association of American Geographers, Washington, D.C., 86 pp. (paperback).

Griffith, D. A. 1988. Advanced spatial statistics. Special topics in the exploration of quantitative spatial data series. Kluwer, Dordrecht, 273 pp.

Griffith, D. A. 1989. Spatial regression analysis on the PC. Institute of Mathematical Geography Discussion Paper 1, 84 pp. (paperback).

Hengeveld, R. 1990.
Dynamic biogeography. Cambridge University Press, Cambridge, 249 pp.

Hosmer, D. W. and Lemeshow, S. 1989. Applied logistic regression. Wiley, Chichester, 307 pp.

Hughes, N. F. 1989. Fossils as information. New recording and stratal correlation techniques. Cambridge University Press, Cambridge, 136 pp.

Jaccard, J., Turrisi, R., and Wan, C. K. 1990. Interaction effects in multiple regression. Sage Publications, Newbury Park, 95 pp. (paperback).

Jain, A. K. and Dubes, R. C. 1988. Algorithms for clustering data. Prentice Hall, New Jersey, 320 pp.

Kaufman, L. and Rousseeuw, P. J. 1990. Finding groups in data. Wiley, Chichester, 342 pp.

Lindsey, J. K. 1989. The analysis of categorical data using GLIM. Springer-Verlag, New York, 168 pp. (paperback).

McCullagh, P. and Nelder, J. A. 1989. Generalized linear models (Second edition). Chapman and Hall, London, 511 pp.

Madansky, A. 1988. Prescriptions for working statisticians. Springer-Verlag, New York, 295 pp.

Mohr, L. B. 1990. Understanding significance testing. Sage Publications, Newbury Park, 76 pp. (paperback).

Nash, J. C. 1990. Compact numerical methods for computers. Linear algebra and function minimisation (Second edition). Adam Hilger, Bristol and New York, 278 pp. (paperback with computer diskette available on request for IBM PC's).

Nitecki, M. H. and Hoffman, A. 1987. Neutral models in biology. Oxford University Press, Oxford, 166 pp.

Noreen, E. W. 1989. Computer intensive methods for testing hypotheses. An introduction. Wiley, New York, 229 pp. (paperback).

Ostrom, C. W., Jr. 1990. Time series analysis - Regression techniques (Second edition). Sage Publications, Newbury Park, 95 pp. (paperback).

Seber, G. A. F. and Wild, C. J. 1989. Nonlinear regression. Wiley, Chichester, 768 pp.

Simon, J. L. 1990. Resampling: probability and statistics a radically different way. University of Maryland, 185 pp. (paperback).

Sprent, P. 1989. Applied nonparametric statistical methods.
Chapman and Hall, London, 259 pp. (paperback).

van Rijckevorsel, J. L. and de Leeuw, J. 1988. Component and correspondence analysis. Wiley, Chichester, 146 pp.

Whittaker, J. 1990. Graphical models in applied multivariate statistics. Wiley, Chichester, 448 pp.

Økland, R. H. 1990. Vegetation ecology: theory, methods and applications with reference to Fennoscandia. Sommerfeltia Supplement 1, Oslo, 233 pp.

USEFUL PC SOFTWARE 2

H.J.B. Birks

CANOCO 3.1

This is a considerably updated and very powerful ordination program written by Cajo ter Braak, Wageningen, The Netherlands. The new version includes forward selection of environmental variables, new permutation testing procedures, regression and ordination diagnostics, variance adjustments for variables in principal components analysis, and much more. CANOCO is, without doubt, the finest and most useful ordination program available, allowing one to do principal components, redundancy, correspondence, detrended correspondence, canonical correspondence, detrended canonical correspondence, partial redundancy, partial canonical correspondence, and canonical variates analyses. In Europe it is available from Campus Software, Vadaring 29, 6702 EA Wageningen, The Netherlands, and outside Europe from Microcomputer Power, 111 Clover Lane, Ithaca, New York 14850, USA (telephone (607) 272-2188). Additional utility programs available from these suppliers include CANOPLOT and CEDIT (Onno F.R. van Tongeren) and CANODRAW (Petr Smilauer).

WACALIB 2.1

This program reconstructs environmental variables from fossil assemblages by weighted averaging and maximum likelihood regression and calibration.
It formed the basis for all lake-water pH reconstructions from fossil diatom assemblages in the Surface Waters Acidification Programme, and has also been used to reconstruct lake salinities, Al, and dissolved organic carbon from algal assemblages; past mire chemistry and moisture from bryophyte assemblages; past sea-surface temperatures from marine foraminiferal assemblages; and past land-use and soil variables from pollen assemblages. See J.M. Line and H.J.B. Birks (1990), Journal of Paleolimnology, 3, 170-173, for further details. WACALIB requires data input in the same format as CANOCO. Use of the two programs thus allows computation of a wide range of weighted averaging regression, calibration, and ordination procedures that are of potential value to quantitative palaeoecologists. Available for $50 (£35) from H.J.B. Birks, Botanical Institute, University of Bergen, Allegt. 41, N-5007 Bergen, Norway. Documentation, relevant publications, test data sets, and software utilities are included.

R-DOC/X

This is a software package for MS-DOS computers for converting text files between 32 different word-processing programs (e.g. WordStar, Word, WordPerfect, DisplayWrite, PC-Write, etc.). Very easy to use, cheap, and extremely useful. Available for $149 from Quicksoft Inc., 219 First Ave. N #224, Seattle, Washington 98109, USA.

C2D

This is an interactive program for spatial autocorrelation analysis in two dimensions using Moran's I and Geary's c statistics. It calculates directional correlograms and allows one to view and plot the results. Written by Geoffrey M. Jacquez, it considerably extends spatial autocorrelation analysis to allow for anisotropic spatial data. Available for $75 from Exeter Software, Building B, 100 North Country Road, Setauket, NY 11733, USA.

AN ANNOTATED BIBLIOGRAPHY OF NUMERICAL METHODS IN QUATERNARY POLLEN ANALYSIS 1985-1989

H.J.B. BIRKS, HAZEL JUGGINS AND MAGNE SÆTERSDAL

This bibliography, mentioned in Newsletter 3, is now available either as a printed copy or as a WordPerfect 5.1 file on diskette (cost £3, $5). The 76-page bibliography contains 660 entries grouped into 61 topics. Please contact H.J.B. Birks, Botanical Institute, University of Bergen, Allegt. 41, N-5007 Bergen, Norway (e-mail birks@cc.uib.no) for a copy.

THOUGHTS ON PRACTICAL DATA EXCHANGE

Louis J. Maher, Jr.

There are many ways of exchanging data. Tables of numbers on paper are easily handed or mailed to a colleague and kept for a lifetime in a file folder. For many purposes paper records are still the easiest and most permanent means of storing and sharing data. FAX has recently become popular by moving information at near-light speed and letting the recipient supply the paper. But this Newsletter's readers are especially aware that there are not many "degrees of freedom" in a table of numbers on paper.

A practical data file should record information about a topic, a site, or sites, so that the information is secure but at the same time readily available for use. To be readily usable, the information should be in the form of a digital file using standard and well-proven formats and media. At present the files are normally stored on magnetic disks and archived on either magnetic or optical disks and tape. (Paper and microfilm are still useful for archival purposes because technologies change. Have you ever tried to find a machine to play back a spool of magnetic WIRE RECORDINGS from the 1950's?)

Today, microcomputers and floppy magnetic disks give Quaternary scientists incredibly effective ways of sharing information. The developing European and North American pollen databases (see Newsletter 4) come to mind as examples involving the handling and storage of large masses of data.

It is not my purpose here to deal with the management of large databases.
Rather, I would make some observations on means by which an individual can share pollen data files with others. The files should have a format that is accessible internationally and sharable with the widest range of potential users. On the basis of the number of installed units and reasonable cost, the IBM PC (and its many clones) would seem the standard to adopt for data exchanged by magnetic disk. Loyalists of other types of computer (Apple, etc.) are likely able to translate between the IBM protocol and their own.

There are many word-processor and database programs to choose among, and we often develop strong feelings about which is best. There is no reason to demand conformity among individuals in how the data are handled in their own facilities. However, some file formats offer more generality for successful transfer than others. Although there are specialized programs for converting one proprietary file format to another, practical considerations suggest that data in standard ASCII text format are the easiest to handle on an international basis. All major data-manipulating programs have an option for importing or exporting ASCII text. Data in ASCII text format can also be viewed easily, and the receiver gets immediate assurance that the file survived the trip. ASCII text generally can be converted to another, individually preferred style with a simple conversion program written with the BASIC interpreter supplied with most PC's.

One may hear that ASCII text is "old technology" and that there are more efficient storage techniques: that it is wasteful to use zeros for unrecorded taxa, or to spend a whole byte on a decimal "1", say, when the same byte could record 256 separate integers. Efficient utilization of space is undoubtedly important when the volume of data is truly huge and when storage media are being newly developed and especially expensive.
But it becomes trivial when one considers that all the pollen data in a good-sized country can be kept on a few floppy disks.

I once did a study on how the size of a file depends on the storage format; I give a summary here because I found it interesting. I used an array of pollen data from my Devils Lake site. It consisted of 80 taxon categories over 134 stratigraphic samples; that makes 10,720 items of data, though about a third were zeros. To that we must add the two array dimensions, an alpha-string title, and 80 alpha strings for the taxon names. I stored these data in seven different formats to determine the file size required by each; the results are shown in Fig. 2.

[Figure 2. Formats and File Size. (not reproduced)]

The shortest ASCII file was one in which each number was converted to a string value with the leading blank removed (the blank reserves room for the sign in negative numbers) and terminated with hexadecimal 0D (carriage return). When the 28,600-byte file is examined with an editor, it is one thin line of numbers along the left margin of the screen: very easy for a machine to read, but impossible for a human to comprehend. The next format was an ASCII array in which each of the 134 rows consisted of 80 numbers separated by a single blank space. There was not much improvement because most of the carriage returns were replaced by spaces, and the long rows extended off the screen. The third file structure is what I call a "Concentrated Wisconsin" format. Here the array rows are broken with a carriage return after every ten numbers, and the numbers are each separated by a single space; this file format is easier to interpret when viewed on the screen, perhaps because of our ten fingers. The "Regular Wisconsin" format has two spaces between numbers; it is easier to read, but a bit more bulky.
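The folded layouts just described can be sketched in a few lines of Python. This is only an illustration of the row-folding idea, not the POLFILE code: the function names, the CR+LF line ending, and the sample counts are all assumptions made for the example, and the title, array dimensions, and taxon names that a real Wisconsin file carries are left out.

```python
# Sketch only: a real Wisconsin file also holds a title, the array
# dimensions, and the taxon names, none of which are modelled here.

def one_per_line(rows):
    """Each count on its own line, ended by a carriage return (hex 0D)."""
    return "".join(f"{n}\r" for row in rows for n in row)


def folded(rows, per_line=10, sep=" ", eol="\r\n"):
    """Break every sample row after `per_line` counts.

    sep=" " mimics the "Concentrated" layout, sep="  " the "Regular" one.
    The CR+LF line ending is an assumption (the usual DOS convention).
    """
    lines = []
    for row in rows:
        for start in range(0, len(row), per_line):
            chunk = row[start:start + per_line]
            lines.append(sep.join(str(n) for n in chunk))
    return "".join(line + eol for line in lines)


def unfold(text, n_taxa):
    """Rebuild the sample rows from any whitespace-separated layout."""
    values = [int(token) for token in text.split()]
    if len(values) % n_taxa:
        raise ValueError("number of counts is not a multiple of n_taxa")
    return [values[i:i + n_taxa] for i in range(0, len(values), n_taxa)]


if __name__ == "__main__":
    # Two invented samples of 12 taxa each (not the Devils Lake data).
    samples = [[0, 12, 3, 0, 0, 141, 7, 0, 25, 1, 0, 2],
               [1, 9, 0, 0, 4, 133, 0, 0, 31, 0, 0, 5]]
    for name, text in [("one per line", one_per_line(samples)),
                       ("concentrated", folded(samples)),
                       ("regular", folded(samples, sep="  "))]:
        print(f"{name:>12}: {len(text)} bytes")
    # Folding loses nothing: the rows come back unchanged.
    assert unfold(folded(samples), n_taxa=12) == samples
```

Because the reader only needs the number of taxa to rebuild the rows, the choice between one and two spaces affects legibility and file size but not the data.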
When the data are stored in Eric Grimm's TILIA file structure, the file uses 48,100 bytes, partly because Grimm builds in space for additional information about the pollen taxa. Storing the data in Borland's PARADOX database results in three files that total over 100,000 bytes. And if the data are saved as a Borland QUATTRO PRO spreadsheet, the file is 126,500 bytes long. The various files can differ markedly in size; powerful programs pay a certain size penalty in the overhead it takes to provide that power.

File size is probably less important than it was just a few years ago; disk storage is relatively cheap. In addition, there are some excellent, inexpensive compression utilities (e.g. PKZIP: PKWare, Inc., 7545 North Port Washington Road, Glendale, WI 53217 USA; version 1.1 is a shareware program available from many electronic bulletin boards) that can drastically shrink the size of stored files, as is shown in the lower part of Fig. 2.

For the immediate future, data files can be exchanged in the mail or in person using the standard 5¼- and 3½-inch floppy magnetic disks. For international distribution the low-density (360 Kb 5¼-inch and 720 Kb 3½-inch) disks are compatible with the widest range of equipment. The higher-density (1.2 Mb 5¼-inch and 1.4 Mb 3½-inch) disks transport data more efficiently and can be used when both sender and receiver agree to their use.

Computers equipped with inexpensive modems can exchange limited data sets over the commercial telephone network. When the files are large, disks sent by mail are much more cost effective. Quaternary workers would be well advised to make a concerted effort to use the various governmental e-mail networks (Bitnet, NSFnet, UseNet, etc.) for the international exchange of both data and ideas. The system is essentially free for many academic and governmental users. It is incredibly fast, and the file comes in digital form that can be recorded on disk when the e-mail is read.
Thus the receiver can store the data directly on his or her own equipment, in a form that can be manipulated at will. Contrast this with a page of figures transmitted by FAX; the data are very difficult to process further.

The "folded" structure (ten items per row followed by a carriage return) of my "Wisconsin" .RAW and .DAT files was devised in part for use with e-mail. The files always fit into a normal-width screen, and a standard text editor can then strip away the address and other extraneous comments from the e-mail, leaving a usable data file. While ASCII files wider than 80 characters will easily travel by e-mail, there is always the possibility that extra-long lines may become fragmented during the process, and the recipient of a "trashed" file may spend hours trying to find out why it does not work.

An individual can enjoy many of the computational advantages of a large research center by utilizing readily available commercial programs (word processors and spreadsheets) and tying them into specialized programs through simple translation utilities. I have just completed v. 1.15 of my POLFILE program. Like the earlier versions, it can read and write my .RAW file format, which Grimm's TILIA program recognizes as a "Wisconsin" file. TILIA's binary files would require special handling on e-mail; the .RAW file simply travels as text. As Craig Chumbley mentions in his discussion of PALYPLOT in this newsletter, a TILIA file converted to the .RAW format can be changed to a .DAT file, which is used by PALYPLOT. POLFILE v. 1.15 can convert a .DAT file to one that can be read directly by Warren L. Kovach's Multivariate Statistical Package MVSP Plus (see Newsletter #4). POLFILE can now also change its .DAT file structure so that it too can be read into TILIA as a Wisconsin file. And the new POLFILE can convert either a .RAW file or a .DAT file so that it can be imported directly into LOTUS 1-2-3 or QUATTRO PRO.
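A simple translation utility of the kind mentioned above might look like the following Python sketch. It is not POLFILE and does not reproduce any of its file formats: the function names are hypothetical, the comma-separated output is just one layout that spreadsheets commonly import, and the sketch assumes that the counts are whole numbers and that no header or signature line in the message consists purely of digits.

```python
import csv
import io


def numeric_rows(message):
    """Return the lines of `message` that consist only of whole numbers.

    Mail headers, greetings, and signatures contain other characters and
    are dropped. A stray line of pure digits would slip through.
    """
    rows = []
    for line in message.splitlines():
        tokens = line.split()
        if tokens and all(token.isdecimal() for token in tokens):
            rows.append([int(token) for token in tokens])
    return rows


def message_to_csv(message, n_taxa):
    """Unfold the counts and write one sample per comma-separated row."""
    values = [n for row in numeric_rows(message) for n in row]
    if not values or len(values) % n_taxa:
        # A fragmented or truncated file shows up here at once.
        raise ValueError("counts do not fill whole samples; file may be damaged")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for start in range(0, len(values), n_taxa):
        writer.writerow(values[start:start + n_taxa])
    return out.getvalue()
```

Checking that the counts divide evenly into samples gives the recipient an early warning of a damaged file, rather than hours of searching later.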
Some of the proprietary statistics packages should accept output from these ubiquitous spreadsheets. (QUATTRO PRO files can be read directly by the Borland PARADOX database program, which I understand will be used in both the European and the North American Pollen Databases that were mentioned in Newsletter #4.)

POLFILE v. 1.15 is free for the asking. It is a useful bridge between some very powerful programs, and its ASCII format allows you to exchange your data by disk or by e-mail.

[Email addresses (not reproduced) extend onto p. 12.]