INQUA - COMMISSION FOR THE STUDY OF THE HOLOCENE
Working Group on Data-Handling Methods
Newsletter 6, July, 1991

NOTE FROM THE COORDINATOR

The Newsletter is a little longer than usual this time because a number of articles came in that should be of general interest. For this issue I have asked André Lotter and Steve Juggins to describe some of the statistical and plotting programs used in Europe. I had been hoping the Macintosh users would tell us what they are up to, and so I was delighted when Ed Cushing agreed to discuss SARI, his Macintosh-based program for describing and identifying spores and pollen grains. Eric Grimm relates what has been happening to the North American Pollen Database. And because I have been hearing a number of familiar names associated with Aberystwyth lately, I asked the Institute of Earth Studies to describe their operation. John Birks has again provided his valuable summary of available books and software.

I have been somewhat distressed to realize how often palynology seems to dominate these newsletters. So I will take the opportunity to discuss some of the options open to all the disciplines for sending picture files, working spreadsheets, compiled computer programs, or ANY binary file by e-mail. This facility can truly allow INQUA Data-Handlers to share their ideas and tools.

I ask the readers of the Newsletter to send me information on any of the data-handling techniques that you have used which could be helpful to others. I have the mail and e-mail list on disk, and ask you to check both for accuracy. Send any corrections or suggestions to:

Louis J. Maher, Jr.
Department of Geology & Geophysics
University of Wisconsin
1215 W. Dayton Street
Madison, WI 53706 USA
Phone: (608) 262-9595
FAX: (608) 262-0693
E-mail: maher@geology.wisc.edu

SARI: A HyperCard STACK FOR DESCRIBING AND IDENTIFYING SPORES AND POLLEN GRAINS

E. J. Cushing
Dept. Ecology, Evolution & Behavior
University of Minnesota
318 Church St.
SE Minneapolis, MN 55455
CUSHING@UMNACVX

SARI (Sporomorph Analysis, Retrieval, and Identification) was designed to facilitate the thorough morphological description of pollen and spores in a large, unfamiliar flora where published keys do not exist. Descriptions of both reference grains and unknown fossil grains are kept in the same database. A new unknown grain may be compared with the descriptions in the database by searching by key words for any number of characters in any combination or order.

Descriptions are entered by clicking buttons that correspond to common characters of pollen grains and pteridophyte spores. The buttons for the most common characters (class, basic structure and sculpture, symmetry, and size classes) are on a single card (Fig. 1), with options to go to additional Description cards for more detailed and special characters. The description thus entered is recorded, in abbreviated and standard form, on a Morphology card for the grain being described (Fig. 2). The description can be edited on the card, and new information added, from the keyboard. The character states and their abbreviations used in the stack follow those defined by Iversen & Troels-Smith (1950), with additions and modifications by Cushing. An accompanying Help stack defines and illustrates the characters and abbreviations.

Each grain in the database is given a distinctive identification number, and the data for each grain in the database are displayed on three cards. The Morphology card describes the grain. A Name card gives information about the taxonomy and location of the grain, with space for references to literature, notes on comparison with other grains, and a definition of the pollen or spore type (the Name cards for reference and unknown grains differ somewhat). Morphology and Name cards are required for all grains in the database, but the information on them may be as complete or incomplete as desired.

Fig. 1. The first card for describing grains.
Unknown #0379 is described here as a new grain; it is tricolporate, tectate, psilate; has a medium exine index; is subspheroidal and in size class media (25 - 50 µm).

Fig. 2. The Morphology card for Unknown #0379, with the description completed.

The third card, an Image card, is optional (Fig. 3). It provides space for graphic illustrations of the grain. These may be drawn directly on the screen with the graphic tools included in HyperCard, or they may be pasted in from scans of photographs or line drawings. The resolution of images so entered into the stack is limited by the resolution of the Macintosh screen (72 pixels per inch). An alternative now being planned is to display images acquired by video camera and stored on a hard disk (or ultimately a CD ROM). The images can be accessed quickly from the HyperCard stack in a similar way, but much higher resolution is achieved. The Macintosh must be fitted with a video card for this alternative.

The database can be searched to identify new grains as they are encountered at the microscope. Distinctive characters of the new grain are entered by clicking buttons on the Description cards, and a Search button starts the search. All grains in the database that agree with the description are displayed, one at a time. If no match is made, the new grain may be assigned an identification number and added to the database as an unknown. Descriptions can be expanded, and new reference grains added at any time to distinguish among similar morphological types. The database thus grows in a natural way as it is used.

Fig. 3. The Image card for Unknown #0379. At the left are equatorial and polar views; in the center are sketches of the LO-pattern.

SARI requires a Macintosh with at least 1 MB memory running System 6.0.5 or later and HyperCard 2.0. A hard disk becomes necessary as the database expands. With a basic knowledge of HyperTalk, the HyperCard scripting language, the user can modify the stack to taste.
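The search described above amounts to matching a partial description against every stored description. A minimal Python sketch of that idea follows. SARI itself is a HyperCard stack, so the data layout, the function name `search`, the character names, and Reference #0012 are all hypothetical; only the states given for Unknown #0379 come from the caption to Fig. 1.

```python
# Illustrative sketch only: SARI is written in HyperTalk, not Python.
# The dictionary layout and the second grain are assumptions for this example.
from typing import Dict, List

Description = Dict[str, str]  # character name -> character state


def search(database: Dict[str, Description], query: Description) -> List[str]:
    """Return the IDs of all grains that agree with every character in the query.

    A grain whose description lacks a queried character is treated as a
    non-match (an assumption; the article does not state SARI's rule).
    """
    return [
        grain_id
        for grain_id, description in database.items()
        if all(description.get(char) == state for char, state in query.items())
    ]


database = {
    "Unknown #0379": {
        "class": "tricolporate", "structure": "tectate",
        "sculpture": "psilate", "shape": "subspheroidal", "size": "media",
    },
    "Reference #0012": {  # hypothetical reference grain
        "class": "tricolporate", "structure": "tectate",
        "sculpture": "reticulate", "size": "media",
    },
}

# Characters may be given in any number, combination, or order.
print(search(database, {"class": "tricolporate", "size": "media"}))
print(search(database, {"sculpture": "psilate", "class": "tricolporate"}))
```

An empty result would correspond to the case where the new grain is given its own identification number and added to the database as an unknown.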
A copy of SARI on a 3.5" floppy disk may be obtained by sending a blank disk to me at the address given at the head of this article.

Reference

Iversen, Johs. and Troels-Smith, J. 1950. Pollenmorfologiske definitioner og typer. Danm. Geol. Unders. IV/3, 8, 54 p. + 16 Tables.

COMPUTER CLASS AFTER AIX PALYNOLOGY CONGRESS

L. J. Maher, Jr.

A number of you will be attending the 8th International Palynology Congress that meets from September 6 - 12, 1992, in Aix-en-Provence, France. I suggested to the organizing committee that it would be a good opportunity to offer a short course on computer data-handling as well as some training in the use of TILIA·GRAPH, PALYPLOT, and other display programs. A myriad of scheduling conflicts makes it impossible to run such a course during the actual conference, but Raymonde Bonnefille has offered the use of her laboratory (Laboratoire de Geologie du Quaternaire, Centre Universitaire de Luminy, Marseille, France) and its IBM compatible computers early in the week following the conference (Monday, Tuesday, and Wednesday, September 14 - 16, 1992). Joel Guiot, Annie Vincens, and Lou Maher are in charge of planning the workshop, and Eric Grimm has agreed to be present to instruct in the use of TILIA and TILIA·GRAPH.

The class will be free, and those attending can arrange for lodging on the Luminy campus "for about 150 FF per day." The workshop committee will be handling the registration details.

I need to get an idea of the interest for this course. If you want to attend the course or know someone who should attend, please get in touch with Lou Maher as soon as possible at the address listed on the front page of this Newsletter. The number of available computers will limit the enrollment to one or two dozen. If the interest exceeds that number, we will have to establish some sort of limiting mechanism. Potential students who could bring an IBM compatible portable (graphics >= EGA) should mention this in their original contact.
POLPROF, TRAN AND ZONE: PROGRAMS FOR PLOTTING, EDITING AND ZONING POLLEN AND DIATOM DATA

André F. Lotter, EAWAG, Swiss Federal Institute for Water Resources and Water Pollution Control, CH-8600 Dübendorf, Switzerland (LOTTER@SGI.UNIBE.CH)

Steve Juggins, Environmental Change Research Centre, Department of Geography, University College London, 26 Bedford Way, London WC1H 0AP, UK (UCFAMAR@UCL.AC.UK)

After having read about two powerful graphics programs (TILIA·GRAPH and PALYPLOT) developed in North America we decided, as an exchange of information, to describe some of the programs we use in Europe.

Many pollen labs in Austria (Innsbruck), Switzerland (Basel, Bern, Lausanne), and Germany (Hannover, Hemmenhofen) are using POLPROF, a FORTRAN 77 program for calculating and drawing pollen diagrams. The program was developed by Andreas Tranquillini (Tranquillini 1988) in Innsbruck on the initiative of Prof. Sigmar Bortenschlager in the late 1970s and originally ran on mainframes. However, it has undergone several updates in the last ten years and is now available as a PC version running under DOS. It allows saw-edged pollen curves, expressed as percentages, concentration or influx values on a depth or age axis, to be viewed on the screen or output to a dot matrix or laser printer or a pen plotter. Some special features such as a "main pollen diagram" with (Iversen style) symbol curves for the major tree taxa and the AP/NAP line, and a lithology column with modified Troels-Smith (1955) symbols meet the requirements of many Central European pollen analysts (see Fig. 1).

The raw data (sediment composition, radiocarbon dates, depths, pollen counts, counts and concentration of marker spores, location of hiatuses etc.) can be entered with an ordinary text editor. We use a customized version of IBM's Personal Editor (PE-2) with the end of each line and all the necessary tabs preset.
The taxa are entered by numerical codes which are related to a separate dictionary, with a maximum of 730 entries, making POLPROF also useful for diatomists who can define their own taxon dictionary. Within this dictionary taxa can be classified into one of up to 20 groups (e.g. AP, NAP, aquatics, spores, etc. or for diatoms acidobiontic, acidophilous etc.). Each of these groups can be in- or excluded from the basic calculation sum (=100%). If a "main diagram" is drawn, these groups are displayed as a cumulative diagram, summing up to 100%. Commands concerning the height of the plot (limited by the height of paper-- on our plotter 78 cm), the type of diagram (percentage, concentration, influx), depth or age scale, the minimum occurrence of a taxon for it to be drawn, can be entered interactively or stored in a batch file. The program also has facilities for specifying the order of pollen curves and the location and width of the lithology column, for excluding individual taxa (e.g. Cyperaceae) from the pollen sum, and for the text notation of rare taxa. Further details about the program and the conditions of use are available from Andreas Tranquillini (E-Mail: C102TA@AINUNI01.BITNET). TRAN and ZONE are C++ programs written by Steve Juggins for the editing, transformation and zonation of palaeoecological data. TRAN was written to make pollen data stored in "Tranquillini" or other "condensed" formats accessible to other programs for statistical analysis, or database input. It will read Tranquillini, Polldata, Tilia (ASCII), Cornell (condensed or full), or Paradox format files, and convert to Tilia, Cornell, Gordon (full format, by taxa) or Paradox format. The Paradox input and output forms a link to the Alpine Pollen Database which is at present in statu nascendi under the directorship of Prof. Brigitta Ammann in Bern, Switzerland. Simple editing allows the deletion of taxa by type (e.g. 
aquatics, spores, etc.), number of occurrences or maximum abundance, the deletion of samples, and transformation to percentages or proportions. Condensed format files may also be printed as a full taxa by sample table or viewed spreadsheet style for data checking. TRAN therefore provides a convenient way to read a "raw" Tranquillini or Polldata file, delete aquatics, spores, etc., transform to percentages, delete additional rare taxa, and output in Cornell Condensed format for a PCA or CA using Cajo ter Braak's program CANOCO (ter Braak 1987).

[Fig. 1, pollen diagram, not reproduced.]

Caption to Figure 1. Pollen diagram from Soppensee core SO86-14 illustrating the late-glacial pollen succession including two hiatuses. Main pollen diagram: Pinus (filled circles), Betula (empty circles), Corylus (filled rhombus), Quercetum Mixtum (= sum of Quercus + Ulmus + Tilia + Acer + Fraxinus, filled square). Cyperaceae (wide hatched area) and Gramineae (narrow hatched area) are shown to the right of the AP/NAP separation line.

ZONE is based on the FORTRAN programs ZONATION and BARRIER, written by Alan Gordon, John Birks and John Line, and CONISS, written by Eric Grimm, and brings the following methods together in a single package for zoning palaeoecological data:

CONSLINK - Constrained single link clustering
CONISS - Constrained incremental sum of squares clustering
SPLITLSQ and SPLITINF - Binary division using sum of squares and information statistic criteria
OPTIMAL PARTn. - Optimal partition using sum of squares criterion
BARRIER - Variable barriers approach

(for details see: Gordon and Birks 1972; Birks & Gordon 1985; Grimm 1987).

The program reads either Cornell or Gordon format files (usually after preprocessing by TRAN), and prints dendrograms on a line printer.

TRAN and ZONE are easy-to-use, menu-driven programs with on-line help.
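The TRAN workflow described above (delete unwanted groups, transform to percentages, then delete rare taxa) can be sketched in a few lines of Python. This is not TRAN's code, which is written in C++. The function names, the taxon-to-group mapping, and the example counts are assumptions made for illustration.

```python
# Illustrative sketch of the kind of preprocessing the article attributes to TRAN.
# All names and numbers below are hypothetical.
from typing import Dict, List, Set

Counts = Dict[str, List[float]]  # taxon -> one value per sample


def delete_groups(counts: Counts, groups: Dict[str, str], unwanted: Set[str]) -> Counts:
    """Drop every taxon whose group (e.g. 'aquatics', 'spores') is unwanted."""
    return {t: row for t, row in counts.items() if groups.get(t) not in unwanted}


def to_percentages(counts: Counts) -> Counts:
    """Express each count as a percentage of its sample total."""
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        t: [100.0 * c / tot if tot else 0.0 for c, tot in zip(row, totals)]
        for t, row in counts.items()
    }


def delete_rare(percentages: Counts, min_max_abundance: float) -> Counts:
    """Drop taxa whose maximum abundance never reaches the threshold."""
    return {t: row for t, row in percentages.items() if max(row) >= min_max_abundance}


counts = {"Pinus": [50, 30], "Betula": [45, 66], "Corylus": [5, 4], "Sphagnum": [20, 5]}
groups = {"Pinus": "AP", "Betula": "AP", "Corylus": "AP", "Sphagnum": "spores"}

kept = delete_groups(counts, groups, {"spores", "aquatics"})
percent = to_percentages(kept)
result = delete_rare(percent, min_max_abundance=10.0)
print(result)
```

In this sketch the rare taxa are removed after the percentages are calculated, so they still contribute to the sum; whether TRAN recalculates the sum at that point is not stated in the article.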
They both run on PCs and are available free, with a short manual and test data sets, from Steve Juggins at the above address (but please include disks - two low density or one high density).

Birks, H.J.B. & Gordon, A.D. (1985) Numerical methods in Quaternary Pollen Analysis. Academic Press.

Grimm, E. (1987) CONISS: A FORTRAN 77 program for stratigraphically constrained cluster analysis by the methods of incremental sum of squares. Computers & Geosciences 13, 13-15.

Gordon, A.D. & Birks, H.J.B. (1972) Numerical methods in Quaternary palaeoecology. I. Zonation of pollen diagrams. New Phytologist 71, 961-979.

ter Braak, C.J.F. (1987) CANOCO - a FORTRAN program for canonical community ordination by (partial) (detrended) (canonical) correspondence analysis, principal components analysis and redundancy analysis (Version 2.1). TNO Institute of Applied Computer Science, Report 87 ITI A 11, 95 pp.

Tranquillini, A. (1988) POLPROF ein Programm zum computergesteuerten Zeichnen von Pollenprofilen. Ber. nat.-med. Verein Innsbruck, Suppl. 2, 27-34.

Troels-Smith, J. (1955) Characterization of unconsolidated sediments. Danm. Geol. Unders. IV/3, 10, 38-73.

PALYNOLOGY AT ABERYSTWYTH

Nestled among the remote valleys of coastal mid-Wales is the town of Aberystwyth, the home of the new Institute of Earth Studies of the University of Wales. Within the Institute is the Palynological Research Centre (PRC), established in 1990 and staffed by Drs. David J. Batten, Warren L. Kovach, Henry F. Lamb, and Bruce A. Tocher. This centre was set up in the wake of a review of teaching and research in Earth Sciences at British Universities in order to establish, along with the existing micropalaeontology group, a centre of excellence in the study of plant and animal microfossils of all ages.

The research activities of the palynologists are diverse and cover the whole of the Phanerozoic Era.
David Batten's main interests are in Mesozoic palynology and palaeoenvironments, floral provinces and climate, palynofacies, organic maturation and petroleum source rocks. Warren Kovach studies the palaeoecology and systematics of Mesozoic plant megaspores as well as the application of numerical methods in palaeoenvironmental and biostratigraphic research and the use of computers in palaeontology. Henry Lamb works on Quaternary vegetation and climatic history, with recent studies focusing on lacustrine palaeoenvironments and the environmental history of the Arctic and North Africa. Bruce Tocher is investigating Mesozoic and Cainozoic dinoflagellate biostratigraphy and ecology, and is also concerned with palaeoenvironmental interpretations and palaeoceanographic modelling. In addition, Dr. Catherine Duigan, a post-doctoral researcher, is working on the palaeolimnology of lakes in the High Atlas Mountains of Morocco, as well as on the taxonomy and palaeoecology of diatoms and Cladocera.

A new M.Sc. course in palynology was also initiated last year. This aims to give students a broad training in palynology, covering all ages and a range of topics from petroleum exploration-related biostratigraphy through Quaternary climatic studies and numerical methods to forensic applications. There are currently five students on the course, including two who will shortly begin working towards Ph.D. degrees. There are also two other students within the Institute who are incorporating palynological studies into their Ph.D. work on Quaternary sediments, and three external Ph.D. students (one from Copenhagen, Denmark, and two from Plymouth, England) who are working closely with members of the PRC staff.

The PRC is housed in a purpose-built suite of offices and laboratories. This includes two large palynological preparation labs, supervised by Mrs. Lorraine Morrison.
There are a number of IBM-PC compatible computers in the centre, linked by Ethernet to the University's DEC 5820 and VAX computers. We have recently installed hardware and software to turn some of these computers into stratigraphic workstations, where palynological data can be entered into a database through a 256 key touch pad, with the resulting diagrams being automatically drawn on a large format plotter. We may be contacted at the: Palynological Research Centre Institute of Earth Studies University College of Wales Aberystwyth, Dyfed, Wales SY23 3DB U.K. Fax: +44 (970) 622659 E-Mail & phone: D.J. Batten - DJB@ABER.AC.UK; (970) 622573 W.L. Kovach - WLK@ABER.AC.UK; (970) 622626 H.F. Lamb - HFL@ABER.AC.UK; (970) 622597 B.A. Tocher - BAT@ABER.AC.UK; (970) 622611 C. Duigan - CTD@ABER.AC.UK; (970) 622626 THE NORTH AMERICAN POLLEN DATABASE Eric C. Grimm Illinois State Museum Research & Collections Center 1920 South 10 1/2 Street Springfield, IL 62706, USA grimm@denr1.igis.uiuc.edu In recognition of the importance of fossil-pollen data for paleoclimatic research, the National Geophysical Data Center, under the auspices of the NOAA Climate and Global Change Program, is funding the North American Pollen Database. Eric C. Grimm at the Illinois State Museum is organizing the database, and a full-time database programmer (John Keltner) has been employed. The database also has an advisory board of seven palynologists from the United States and Canada (K. J. Gajewski, G. L. Jacobson Jr., G. MacDonald, L. J. Maher, V. Markgraf, T. Webb III, and C. Whitlock). General objectives of the Pollen Database are to create a relational database structure for pollen data, to incorporate pollen and associated data into the database, and to provide software for querying the database. The database is intended to be not only an archive for preserving pollen data, but, importantly, a valuable tool for paleoclimatic and paleoecological studies. 
The eventual goal of the database is to include all pollen data from North America since 1960, as well as useful earlier data, and to keep the database current with newly generated data. The database will be available to all scientists.

Specific objectives of Year 1 are the creation of a relational database structure for fossil-pollen data, development of a hierarchical taxonomy, and incorporation of the public portion of the COHMAP pollen data into the database. We are now in the process of verifying the pollen data against the published record and adding missing data available in that record. The database will contain various kinds of information that the COHMAP database did not. Eventually all data will be returned to the data originators for validation and for the addition of site details not available in the published record.

We have been working with the European Pollen Database, which is also in its first year of funding, to ensure compatibility between the two databases. The result of this collaboration has been establishment of complete compatibility between the databases and development of a database structure that should accommodate pollen data worldwide. After close consultation, we have developed tables that accommodate the European pollen data, as well as the North American. Thus, we are using exactly the same table structure.

The database is based on IBM compatible microcomputers. After an investigation of database software, we selected Paradox from Borland International. The strengths of Paradox are power, adherence to the relational model, ease of use, application language support, and wide availability, often at heavily discounted prices.

A major task has been the creation of a logical and flexible database structure that adheres to the relational model. The relational model of database management ensures the flexibility of the database to meet future demands and developments in both hardware and software.
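As a rough illustration of what adherence to the relational model looks like in practice, the sketch below stores taxon names once in a look-up table and refers to them by code from a table of counts. It uses Python's built-in sqlite3 module rather than Paradox, and every table name, column name, site name, and count is hypothetical; it does not reproduce any of the actual Pollen Database tables.

```python
# Hypothetical miniature of a relational layout; not the Pollen Database schema.
import sqlite3


def build_example() -> sqlite3.Connection:
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE taxa  (taxon_code INTEGER PRIMARY KEY, taxon_name TEXT NOT NULL);
        CREATE TABLE sites (site_id INTEGER PRIMARY KEY, site_name TEXT NOT NULL);
        CREATE TABLE counts (
            site_id    INTEGER NOT NULL REFERENCES sites(site_id),
            depth_cm   REAL    NOT NULL,
            taxon_code INTEGER NOT NULL REFERENCES taxa(taxon_code),
            n_grains   INTEGER NOT NULL,
            PRIMARY KEY (site_id, depth_cm, taxon_code)
        );
    """)
    con.executemany("INSERT INTO taxa VALUES (?, ?)", [(1, "Pinus"), (2, "Betula")])
    con.execute("INSERT INTO sites VALUES (1, 'Example Lake')")
    con.executemany(
        "INSERT INTO counts VALUES (?, ?, ?, ?)",
        [(1, 10.0, 1, 120), (1, 10.0, 2, 40), (1, 20.0, 1, 95), (1, 20.0, 2, 60)],
    )
    return con


def counts_for_taxon(con: sqlite3.Connection, taxon_name: str) -> list:
    """Join the counts to both look-up tables and return rows ordered by depth."""
    return con.execute(
        """
        SELECT s.site_name, c.depth_cm, t.taxon_name, c.n_grains
        FROM counts AS c
        JOIN sites AS s ON s.site_id = c.site_id
        JOIN taxa  AS t ON t.taxon_code = c.taxon_code
        WHERE t.taxon_name = ?
        ORDER BY c.depth_cm
        """,
        (taxon_name,),
    ).fetchall()


con = build_example()
for row in counts_for_taxon(con, "Pinus"):
    print(row)
```

Because each name is stored in exactly one place, correcting a taxon name means changing one row of the look-up table rather than every count that refers to it.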
Database structure refers to the design of tables that contain all of the relevant data. Good design implies simplicity of individual table structure and elimination of redundancy. The design must facilitate the addition of potential future data that have presently unforeseen characteristics, and it must also facilitate compatibility with other kinds of stratigraphic and paleoclimatic data.

The database now consists of 49 tables, divided into three general categories: archival tables, look-up tables, and research tables. Archival tables contain actual data, including pollen counts, site locations and descriptions, geochronologic data, and publication citations. Look-up tables contain descriptions of codes and keys used in the archival tables; for example, names, pollen taxa, states, and countries corresponding to shorter codes used in the archival tables. Research tables consist of interpretive data and higher-level organization of data.

Because the database is necessarily complex, now consisting of 49 tables with various key relationships, querying the database can be complex, especially for novice database users. It is therefore important to develop application programs for data entry, update, and retrieval; and it is essential to develop a layer of software between all users and the data to maintain the integrity of the data. These applications will be developed in cooperation with the European Pollen Database in Arles, France.

NEW BOOKSHELF 3

H.J.B. Birks

The following recently published books may be of interest to readers of this Newsletter.

F.P. Agterberg 1990 Automated Stratigraphic Correlation. Elsevier, Amsterdam, 424 pp.

U. Bayer 1985 Pattern recognition problems in Geology and Paleontology. Lecture Notes in Earth Sciences 2, Springer-Verlag, Berlin, 229 pp.

N.G. Becker 1989 Analysis of infectious disease data. Chapman and Hall, London, 224 pp.

S.S. Bell, E.D. McCoy, and H.R.
Mushinsky 1990 Habitat Structure - the physical arrangement of objects in space. Chapman and Hall, London, 438 pp.

T.J. Hastie and R.J. Tibshirani 1990 Generalized additive models. Chapman and Hall, London, 335 pp.

E.H. Isaaks and R.M. Srivastava 1989 An introduction to applied geostatistics. Oxford University Press, New York, 561 pp. (paperback)

P.A.W. Lewis and E.J. Orav 1989 Simulation methodology for statisticians, operation analysts, and engineers 1. Wadsworth & Brooks / Cole Advanced Books, Pacific Grove, California.

B.J.F. Manly 1991 Randomization and Monte Carlo Methods in Biology. Chapman and Hall, London, 281 pp.

H. Martens and T. Næs 1989 Multivariate calibration. Wiley, Chichester, 419 pp.

A.J. Miller 1990 Subset selection in regression. Chapman and Hall, London, 229 pp.

F.J. Rohlf and F.L. Bookstein 1990 Proceedings of the Michigan Morphometrics Workshop. University of Michigan Museum of Zoology, Ann Arbor Special Publication 2, 390 pp + 9 diskettes.

G.S. Ross 1990 Nonlinear estimation. Springer-Verlag, Berlin, 189 pp.

J.H. Tallis 1990 Plant Community History - Long-term changes in plant distribution and diversity. Chapman and Hall, London, 398 pp.

R. Webster and M.A. Oliver 1990 Statistical methods in soil and land resource survey. Oxford University Press, Oxford, 316 pp. (paperback)

S.C. Weller and A.K. Romney 1990 Metric scaling - correspondence analysis. Sage Publications, Newbury Park, 96 pp. (paperback)

Readers may find the review of 12 technical graphics packages in PC Magazine 10, number 6 (March 26, 1991) interesting; it covers packages such as AXUM, Graftool, Sigma Plot, Grapher, and Surfer.

USEFUL PC SOFTWARE 3

H.J.B. Birks

CEDIT

Many programs require as input so-called Cornell condensed format data files (e.g. TWINSPAN, DECORANA, CANOCO, WACALIB). The editing, appending, merging, sorting, transforming, and other manipulations of condensed data files can be difficult, especially with large data matrices.
CEDIT, written by Onno van Tongeren (Limnological Institute, Rijksstraatweg 6, 3631 AC Nieuwersluis, The Netherlands), makes the editing and manipulation of condensed data matrices easy and painless. CEDIT can also be used to input data into a spreadsheet and to output the data in condensed format, as well as 101 other manipulations of condensed files. If you use condensed format files much, CEDIT is an invaluable and indispensable aid. It is a must for anyone regularly using programs such as DECORANA, TWINSPAN, or CANOCO.

CANODRAW

Graphical display of ordination results is an essential step in the interpretation of the results. CANODRAW, written by Petr Smilauer, Section of Plant Ecology, Botanical Institute of Czechoslovak Academy of Sciences, Dukeska 145, Trebon, Czechoslovakia 379 82, is a very powerful graphics program designed specifically to plot results from Cajo ter Braak's fantastic CANOCO 3.10 ordination program. CANODRAW produces dot-matrix-printer, laser-printer, or graph-plotter hard copies of all types of scatter plots, biplots, triplots, and joint plots of ordination results. It will also plot the abundances of individual variables in different samples in ordination space, the diversity, richness, and evenness of each sample, the relative values of groups of selected variables in each sample, etc. It is, in many ways, the TILIA·GRAPH of the CANOCO-ordination world! It is an invaluable adjunct for anyone using CANOCO 3.10 or later. Details about its availability can be obtained from Campus Software, Vadaring 29, 6702 EA Wageningen, The Netherlands and outside Europe from Microcomputer Power, 111 Clover Lane, New York 14850, USA (telephone 607 272-2188).

NTSYS-pc Version 1.60

This is a substantial upgrade of F.J. Rohlf's NTSYS-pc 1.50 for multivariate data analysis (see Newsletter 3, January 1990).
The new version is even more user friendly, with pull-down menus and on-line help. There is automatic capture of screen output, the program automatically uses expanded memory, extended memory, and even disc paging, and the graphical procedures have been improved. NTSYS remains one of the most versatile and friendly PC packages available for multivariate analysis, particularly for topics in numerical taxonomy, classification, and scaling. It costs $175 ($140 for educational institutions) and $75 for an upgrade. It is available from Exeter Software, 100 North Country Road, Bldg. B, Setauket, New York 11733, USA (telephone 800 842-5892, 516 689-7838).

GS+

This program provides biologists and ecologists with an easy-to-use introduction to geostatistical techniques such as kriging, autocorrelation, semivariance analysis, fractal dimensional analysis, and mapping. It produces output for Epson printers, HP Laserjet, and IBM Proprinters. It is extremely fast and easy to run, and is an excellent introduction to the field of spatial analysis and geostatistics. It costs $275 and is available from Gamma Design Software, Box 201, 457 East Bridge Street, Plainwell, Michigan 49080, USA (telephone 616 685-9011).

SENDING BINARY FILES BY E-MAIL

L. J. Maher, Jr.

Sending text material by e-mail using Bitnet or Internet is quick, inexpensive, useful, and fun. But often you would like to be able to send a condensed format data file, a picture, a functioning spreadsheet, or a compiled computer program. E-mail will not normally handle binary files because many of the byte values in a binary file have no ASCII text equivalents--or worse, are used to control the computers trying to handle the mail!

Given the necessary privileges and passwords, those on Internet can log on to the remote facility using File Transfer Protocol by typing:

    ftp [facility name|number]
Once logged on the remote computer with the "ftp>" prompt showing, simply type "binary" to set the program so that an exact "image" of the file will be moved. Then typing "put [filename]" will transfer your file to the foreign computer; typing "get [filename]" will get the file from the remote computer. Do not forget to specify "binary" before using put or get. If you are on Internet, you might want to "man ftp" to read the screen manual on how it works; it is extremely fast, but, alas, it is much more restrictive in its requirements than normal e-mail.

When I was going to send a disk with a compiled program to Warren Kovach (Aberystwyth), he mentioned the unix programs "uuencode" and "uudecode," and suggested I try to encode the file and send it e-mail as well. I have learned enough about the unix system to realize one can spend a lifetime learning all its procedures, so I tend to use what I need and ignore the rest. But I should not have ignored unix-to-unix encode and decode! Because it is done on the unix system, it is available to anyone using a microcomputer as a terminal; it can be used to move any file whether you use an IBM, an Apple, or what-have-you. (Of course, if you have a DOS machine, an Apple program will not do you much good unless you also have an Apple available.)

One can use uuencode without knowing anything about how it works, but it is useful to know a little about it to see its implications. It encodes the binary file by concatenating three 8-bit bytes into 24 bits. Then the 24 bits are broken into four units of six bits each. Now six bits can express decimal numbers from 0 to 63. If 32 (the ASCII value for a space) is added to each of those numbers, the possible sums can range from 32 through 95, which happen to be the decimal values for the ASCII sequence:

     !"#$%&'()*+,-./0123456789:;<=>?
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_

(The first item in the code list above is ASCII character 32, a space, which does not print.
Because some e-mail systems ignore multiple spaces for economy, uuencode substitutes ASCII 96 whenever it generates a 32; ASCII 96 prints as `, a reversed apostrophe; uudecode treats the ` as 32). All of these characters fall in the range used by normal e-mail, so the file created by uuencode can be simply e-mailed like any text message. The person at the other end gets the original binary file by typing: "uudecode [filename]".

I give this background to suggest that the uuencoded files are going to seem like a random assemblage from the above character list, and look like equal-length lines of gibberish. And the file will be longer than the original. The encoded file is expanded by about 35% (3 bytes become 4 plus control information), and it takes longer to transmit. But it will travel by e-mail, and that is the point.

Let me summarize the procedures for sending a binary file from your microcomputer to a colleague in another country. I tie into my department's unix computer either by an unshielded two-strand 9600-baud Ethernet cable from the office or by a 2400-baud modem by telephone. The communications program on my IBM microcomputer is an old shareware version of PROCOMM. Your entry to the e-mail network will likely be similar although it will undoubtedly differ in detail.

Generic CADD5, which I purchased to use Chumbley's PALYPLOT, is a nice program, and I used it to draft plans for a "Russian sampler" from photographs I had taken in Sweden. If one wishes to share the plans with a colleague who has Generic CADD5, then it is better to send the actual CADD5 RUSSIAN.DWG file rather than a paper copy because the recipient can alter the plans if he/she wishes and print a hard copy to any desired scale. I was using a non-standard font in the drawing, so that should be sent to the user as well. It is convenient and more efficient to bundle the two binary files together and compress them.
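Returning briefly to the encoding arithmetic described earlier, the three-bytes-into-four-characters step can be written out in a few lines of Python. This is a minimal sketch of that single step only, not the unix uuencode program: the function names `encode_triple` and `decode_quad` are invented for the example, and the per-line control information mentioned above is left out.

```python
# Sketch of the core uuencode step: 3 bytes -> 24 bits -> four 6-bit values,
# each offset by 32, with ASCII 96 (`) standing in for a space.
# Function names are hypothetical; header, trailer and line-length
# characters written by the real program are omitted.


def encode_triple(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four printable ASCII characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    a, b, c = three_bytes
    bits24 = (a << 16) | (b << 8) | c
    sextets = [(bits24 >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # A value of 0 would map to a space (32), so ` (96) is used instead.
    return "".join("`" if s == 0 else chr(s + 32) for s in sextets)


def decode_quad(four_chars: str) -> bytes:
    """Reverse encode_triple; both ` and a space decode to 0."""
    if len(four_chars) != 4:
        raise ValueError("expected exactly four characters")
    bits24 = 0
    for ch in four_chars:
        bits24 = (bits24 << 6) | ((ord(ch) - 32) & 0x3F)
    return bits24.to_bytes(3, "big")


print(encode_triple(b"Cat"))   # 0V%T
print(decode_quad("0V%T"))     # b'Cat'
```

Every three bytes of the original become four characters of text, which is where most of the growth in file size comes from.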
To bundle and compress the files, I use the shareware "ZIP" utility from PKWare, which is available on most computer bulletin boards. The DOS command:

    PKZIP /EX RUSSIAN RUSSIAN.DWG NEW.FNT

compresses the two files RUSSIAN.DWG and NEW.FNT as much as possible into the file RUSSIAN.ZIP. Using another PKWare utility, ZIP2EXE.EXE, with the command:

    ZIP2EXE RUSSIAN

results in a new file RUSSIAN.EXE which is self-extracting; the file will decompress itself into its component files when you type its name.

You now have the binary file in your own computer. To e-mail it to a friend, you must get it to the network computer. Log on to the unix system and move to the directory from which you send mail. In my normal e-mail messages, I use kermit or ckermit to up-load text files written on my DOS editor. When up-loading binary files, I find the unix "xmodem" more fool-proof. Xmodem has four switches: s, r, t, and b, which stand for send, receive, text, and binary. Because I am logged on the unix system and I want to receive a binary file from my IBM, the command is:

    xmodem -rb russian.exe

and that copies my russian.exe file to a unix binary file of the same name. It will not run on the unix system because it is a DOS file, but it is an exact copy of my DOS file. To make it suitable for e-mail I must use unix's uuencode:

    uuencode russian.exe russian.exe > russian.uue

This instructs uuencode to 1) take the file russian.exe, 2) encode it to a file that will uudecode to a file called russian.exe, and 3) put the encoded binary file into a text file called russian.uue. Now one can simply type:

    mail [e-mail address] < russian.uue

and, with luck, the colleague will get the encoded message. It is a good idea to send your colleague a message saying the encoded binary file is following as a second message. Then she/he can put the first note into the "mbox" or delete it, whereas the important mail can be saved in its own file.
Assuming the mail item number 2 is the important one:

    s 2 russian.uue

will save the second mail item as a text file named russian.uue. Then the friend can recover the encoded binary file by typing:

    uudecode russian.uue

whereupon russian.exe will be found in the same directory. The recipient then down-loads the file to his/her DOS computer with xmodem by typing:

    xmodem -sb russian.exe

followed by the communications program's down-load and xmodem commands, and the file is delivered. The recipient then logs out and, from the DOS command line, types RUSSIAN. If all is well, the file will self-extract to yield RUSSIAN.DWG and NEW.FNT, which can be loaded into CADD5 to be printed or further edited.

I sent an encoded RUSSIAN.DWG file to my son at the University of Nebraska. He returned the file by e-mail, and it was then reconstituted. A hard copy, reproduced at page size, is shown in Fig. 1. You would not want to give such a small drawing to a machinist, but having the CADD5 file allows you to print all or any part of the plans to full scale or to modify them for your own needs.

[*p.11 / p.12*]

[Fig. 1 at top of page 12 not reproduced here.]

Some e-mail addresses limit messages to 64 K bytes, and binary files may easily exceed the limit. There are two ways around this problem. Because the uuencoded file is text, you can load the file into vi, the unix text editor, and cut it between lines into two or more pieces. Save each of the pieces with a short name (part1, part2, etc.), and mail the pieces as separate messages. The recipient can use vi to paste them together in the correct sequence before uudecoding.

Perhaps a more satisfactory procedure for DOS users is to get a copy of the program by Richard Marks, 931 Sulgrave Lane, Bryn Mawr, PA 19010. Warren Kovach e-mailed me an encoded copy of this fine freeware program, and I found later that it is available at many unix facilities.
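For readers who would rather not cut the file by hand in vi, the first approach (splitting the encoded text between lines into pieces that each fit under the mail limit) can be sketched in Python as below. This is a hypothetical helper, not part of any of the programs mentioned: the function name and the default limit are assumptions, and in practice you would want to leave some room under 64 K bytes for the mail headers.

```python
from typing import List


def split_on_lines(text: str, max_bytes: int = 64 * 1024) -> List[str]:
    """Split encoded text into pieces of at most max_bytes, cutting only between lines."""
    parts: List[str] = []
    current: List[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        n = len(line.encode("ascii"))          # uuencoded text is plain ASCII
        if n > max_bytes:
            raise ValueError("a single line is longer than the size limit")
        if current and size + n > max_bytes:   # this line would overflow the piece
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        parts.append("".join(current))
    return parts


if __name__ == "__main__":
    # Stand-in for the contents of an encoded file: 3000 lines of 62 bytes each.
    encoded = "".join("M" + "x" * 60 + "\n" for _ in range(3000))
    pieces = split_on_lines(encoded)
    for number, piece in enumerate(pieces, start=1):
        print("part%d: %d bytes" % (number, len(piece)))
```

Because the cuts fall only at line ends, pasting the pieces back together in order reproduces the encoded file exactly, which is all uudecode needs.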
Marks' package comes in the self-extracting file UUEXE402.EXE, which generates UUENCODE.EXE, UUDECODE.EXE, and UUDECODE.DOC. The UUDECODE.DOC file provides background and useful instructions. The program allows you to UUENCODE and UUDECODE binary files in your own DOS computer. It also automatically breaks the file into individual numbered pieces if it exceeds 64 K bytes. You can then move the files to unix as text (xmodem -rt), and e-mail the segments. Take, for example, a 150 K-byte file named BIG.EXE. The DOS command would be:

    UUENCODE BIG.EXE

This would automatically result in the three encoded text files BIG1.UUE, BIG2.UUE, and BIG3.UUE. The recipient would use his copy of Marks' program to decode them with the command:

    UUDECODE BIG

[*p.12 / p.13*]

The numbered pieces would be read by UUDECODE (it can tell when it reaches the last piece), and BIG.EXE would appear in the directory.

If you cannot find a copy of UUEXE402.EXE locally, I would be happy to e-mail you a UUENCODED version. You will have to use the unix uudecode program the first time to restore it. Then send (xmodem -sb) the resulting binary file to your machine. Once you have it, you can do encoding and decoding in the privacy of your own DOS machine!

FIRST AID FOR TILIA AND PALYPLOT USERS
by Dr. Triage

The Newsletter Coordinator has heard that users of TILIA and TILIA·GRAPH (Grimm, see Newsletter 4, July 1990) and PALYPLOT (Chumbley, see Newsletter 5, January 1991) feel lost when first trying to print their pollen diagrams. Sometimes it is easier to ask questions when you are not dealing with the expert who wrote the code. Therefore, if you mail or e-mail questions to Lou Maher (address on page 1), he will refer them to me for a quick answer. If I do not know the answer, I will refer the problem to Grimm or Chumbley for a definitive response. A few questions thought to be of general interest to the palynologist readers will appear in this column in future issues.
But be warned, if you do not send your questions to Lou Maher, then he said he would "chop" my column without a moment's regret.

TILIA and TILIA·GRAPH

Dr. Triage: The TILIA spreadsheet capacity is 100 samples (columns) x 250 taxon variables (rows). I NEVER have that many taxa. How can I trade taxa for samples? HELP!

Dear Help: That is Eric's way of shaming us folks who lump taxa. The solution is to select [I] Options from the main menu and then pick [B] Spreadsheet size from the next menu. You will be asked if you really want to change, and that if you do change dimensions you will lose your data. Say "yes" because at this stage you will not have loaded any data. You can then change the number of rows and columns. I use 90 rows and 135 columns because I am a lumper. You will be asked if you want to save the new settings. Say yes, and you will never see those 250 taxa again! D.T.

Dr. Triage: It would be nice to be able to access *.TIL files in directories other than the \TILIA directory. Can that be done? A floppy disk Addict.

Dear Addict: Check the [I] Options category on the main menu. The [A] Working directory category of the next menu allows you to change directories. If the correct path and directory are shown, simply use the key to back out. If you hit alone, the directory you presently are in will be selected. Or you can call for a new disk and/or path. You will be asked if you want to SAVE the new path. If you say YES, it will then become the permanent one UNTIL you change it again. If you say NO, you will be put in the new directory, but it will not be remembered the next time you run TILIA. I generally use the NO choice if I want simply to deal with another directory temporarily. You can change to a directory to read a file, and after you work with it, you can change the working directory again to save it somewhere else.
As you have probably found out, when you are asked for a *.TIL file name, if you simply press ENTER, you will see the *.TIL files in the directory you have chosen. So far this only works with a *.TIL file; you should jot down the exact names of data and dictionary files that you may need to load because you will not be able to call for a directory listing of non-*.TIL files. And if you think you can get around this restriction by calling up a "terminate-and-stay-resident" program like PCTOOLS to "go external" and use DOS commands, I would be very careful. Some TSR and cache programs can cause TILIA to trip up, act in unexpected ways, and sometimes DO BAD THINGS to your data. See Maher (Newsletter 5, January 1991, p.4) about setting up a separate environment when using TILIA. And by the way, if your computer locks up when you call for TILIA·GRAPH, you probably forgot that while TILIA will run without the GSS-CGI drivers being loaded by AUTOEXEC.BAT, TILIA·GRAPH will not. D.T.

Dr. Triage: The two-letter taxon codes used by TILIA which appear along the left side of its spreadsheet are too short to be meaningful. How can I remember what they mean? Forgetful.

Dear FO: I NEVER pay any attention to the two-character codes. Just think of them as a SHORT label at the left of the spreadsheet. One can always see what they are, because if I am in row FO, I can see on the LOWER LEFT CORNER [*p.13 / p.14*] of my screen that the taxon name is Forgetful. Did you know that you can move easily around the margins of the spreadsheet by pressing the key? A little square with four arrows will appear. Pressing the desired arrow key will move you to that margin. D.T.

Dr. Triage: When I run TILIA·GRAPH, "make" the diagram and "view" it, I often find I need to do some additional editing. After I make the changes and view the diagram again on the screen, nothing has been changed. Why?
Secondly, when I "view" the diagram on the screen, it is often too small to see fine details; I notice these too late after I have spent time printing the diagram. Worried.

Dear Worried: You must press [A] Make Diagram on the menu again after you do any editing. When you press [B] View Diagram and it starts to appear on the screen, touch the <+> key, and the diagram will be drawn larger. You can repeat the key after a moment and it will plot larger still. You can also touch the arrow keys to "move" around the diagram as it plots. Although you will now see the small details, you will no longer see the whole diagram; it is that "forest and trees" thing. After you check the details, press the <-> key one or more times until the whole diagram fits the screen again. D.T.

PALYPLOT

Dr. Triage: When PALYPLOT makes the batch file that I load into CADD5 with the Load Batch (LB) command, I sometimes find that there are small things I would like to edit; for example, carbon dates too close together will be overwritten. When I try to Window Erase (WE), I find it difficult to erase the details without removing some nearby item that I wanted to keep. Is there a way? Fastidious.

Dear Fastid.: Be sure you know about CADD5's UE, UU and OO commands to UnErase, Redo or UNDO what you just did! When you are working in tight places, use the "Layers" menu, and you will see PALYPLOT puts each taxon, title, y-axis label, etc. on a different layer. You can turn off the layers of the drawing near where corrections are being made, and they will not be affected. CADD5 has a very handy feature which allows you to MatcH existing styles. Type MH and you will be told to point at the item you wish to match. Put the cursor on the text, and press . You will then be able to Place Text (TP) letters that will match the original's size, font, line width, and angle on the page!
I write the new text near the original, Window Erase (WE) the original, and Window Move (WM) the new version to where the original was. Then turn all the layers back on. D.T.

Dr. Triage: I am hopping mad! I tried to plot an influx diagram with PALYPLOT. I find it uses my carbon dates literally and gets the necessary sedimentation rate information by interpolating linearly between the dates. I have a lot of closely-spaced dates, and the laws of chance mean some younger dates will occur under older dates. PALYPLOT calculates negative sedimentation rates for these intervals, and you can imagine what this does to the plot! Worse yet, the program automatically labels the diagram "Pollen Accumulation Rates" when any reasonable person would say "Pollen Influx!"

Dear Hopping: Relax. If you have a pet curve or function that you want to use to calculate a non-linear and positive sedimentation rate (centimeters per year) for each sample, then you can easily trick PALYPLOT into doing your dirty work. PALYPLOT's *.PDT file for figuring Pollen Concentration and Influx has Sediment Volume, Marker Grains Added and Marker Grains Counted as the first three "taxa." Load the file into a text editor, and for the numbers in the Marker Grains Added category, multiply each value by the cm/yr you feel represents the sedimentation rate for that sample interval. Change the category title to "Markers Added times cm/yr" to remind yourself of your skulduggery. If you want the y-axis to plot its scale in your version of years, you might also replace the Depth (Cm) data with each sample's estimated age, and rename the category Years BP. (Note: Do not tell PALYPLOT you are doing this in its *.CTL file. Leave YUNITS = DEPTH rather than changing to AGE. If you use AGE, it will try to make linear fits to the carbon dates.) Save the file as ASCII text. Load it into PALYPLOT and tell it you want to make a POLLEN CONCENTRATION plot. It will not know you are lying to it.
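The arithmetic behind this trick, and behind the negative rates that provoked the question, can be checked with a few lines of Python. This is a sketch under stated assumptions: the concentration formula is the usual exotic-marker calculation (taxon count times markers added, divided by markers counted times sediment volume), which is assumed here to match what PALYPLOT does internally; the function names and all the numbers are invented for illustration.

```python
def concentration(count, markers_added, markers_counted, volume_cm3):
    """Grains per cm3 from the usual marker-grain calculation (assumed formula)."""
    return count * markers_added / (markers_counted * volume_cm3)


def interval_rate(depth_top_cm, age_top_yr, depth_bottom_cm, age_bottom_yr):
    """Sedimentation rate (cm/yr) from a straight line between two dated levels."""
    return (depth_bottom_cm - depth_top_cm) / (age_bottom_yr - age_top_yr)


if __name__ == "__main__":
    # Invented sample: 120 grains of one taxon, 10000 markers added,
    # 200 markers counted, 1.0 cm3 of sediment, and a chosen rate of 0.05 cm/yr.
    count, added, counted, volume, rate = 120, 10000, 200, 1.0, 0.05

    conc = concentration(count, added, counted, volume)             # grains/cm3
    influx = conc * rate                                            # grains/cm2/yr
    tricked = concentration(count, added * rate, counted, volume)   # same number

    print("concentration:", conc)
    print("influx:", influx)
    print("concentration with markers added times cm/yr:", tricked)

    # A younger date lying below an older one gives a negative interpolated rate.
    print("rate across a reversed pair of dates:",
          interval_rate(100.0, 5000.0, 110.0, 4950.0))
```

Because the markers-added value appears only in the numerator, scaling it by cm/yr scales the result by the same factor, so a "concentration" computed from the edited file is numerically the influx.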
After PALYPLOT plots out the batch file and you get it into CADD5, use the editing tricks mentioned above to change the label "Grains/cm3" to "Grains/cm2/yr" in two places: along the base of the x-axis and in the label for the "TOTAL POLLEN IN THE SUM" curve. Finally, take the opportunity to replace the subtitle "Pollen concentration" - not with "Pollen Accumulation Rate" - but with the preferred "Pollen Influx." D.T.

[*p.14 / p.15*]

[ Email addresses (not reproduced) extend onto p. 16. ]