INQUA Sub-Commission on Data-Handling Methods

Newsletter 17: February 1998

PROGRAMS FOR SITE SELECTION, TABULAR DISPLAY, AND INTERPOLATION OF DATA FROM PARADOX-BASED POLLEN DATABASES

Leduc, Phillip L., Williams, John W., and Webb III, Thompson

Department of Geological Sciences
324 Brook Street
Brown University
Providence, RI 02912

phillip_leduc@brown.edu
jww@brown.edu
thompson_webb_iii@brown.edu

The North American Pollen Database (NAPD) and other pollen databases make pollen data readily available over the Internet, where they can be accessed at such places as the World Data Center A for Paleoclimate at the National Geophysical Data Center, run by NOAA (the National Oceanic and Atmospheric Administration) in Boulder, CO. These databases provide pollen counts, radiocarbon dates, latitudes, longitudes and other information from hundreds of sites with pollen data in North America, Latin America, Europe, and elsewhere around the globe. Access to the data is excellent.

These relational databases are continually maintained and updated, and they offer data stored in standardized tables. The databases are relational because the tables are linked by key fields, such as entity numbers that are unique for each site or variable numbers that are unique for each name of a pollen taxon. The tables can be downloaded as Paradox, Access or ASCII files (check the NOAA web site for more information). Borland Paradox and Microsoft Access are relational database management packages that allow the examination and manipulation of data within these database files, but only if the user understands the relationships or links between tables. The problem, then, is how to make use of the data in the many linked tables once they are downloaded and available on a personal computer or server. At Brown University, we have worked extensively with data from the NAPD in Paradox tables and have developed a tool kit, called Pollen Analysis Tools (PATools), to aid the display and analysis of the data. Because much of our research centers on the mapping of the pollen data, the focus of PATools is to permit analysis of data from many sites in a region or continent.
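To make the idea of tables linked by key fields concrete, the following is a minimal sketch in Python using an in-memory SQLite database. It is not the NAPD schema: the table names (entity, variable, counts), the column names, the variable numbers and the counts are hypothetical and chosen only for illustration. Entity number 193 is the one shown for the example site in Tables 3-4.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory database with three linked tables.

    All table and column names here are hypothetical; they only mimic
    the idea of entity numbers and variable numbers acting as keys.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE entity   (entity_no INTEGER PRIMARY KEY, site_name TEXT);
        CREATE TABLE variable (var_no INTEGER PRIMARY KEY, taxon TEXT);
        CREATE TABLE counts   (entity_no INTEGER, sample_no INTEGER,
                               var_no INTEGER, n INTEGER);
        INSERT INTO entity   VALUES (193, 'Paradise Lake');
        INSERT INTO variable VALUES (1, 'Picea');
        INSERT INTO variable VALUES (2, 'Betula');
        -- Illustrative counts, not values taken from the database.
        INSERT INTO counts   VALUES (193, 1, 1, 463);
        INSERT INTO counts   VALUES (193, 1, 2, 54);
    """)
    return con

def counts_with_names(con):
    """Follow the key fields to replace numbers with site and taxon names."""
    return con.execute("""
        SELECT e.site_name, c.sample_no, v.taxon, c.n
        FROM counts c
        JOIN entity   e ON e.entity_no = c.entity_no
        JOIN variable v ON v.var_no    = c.var_no
        ORDER BY c.sample_no, v.var_no
    """).fetchall()

if __name__ == "__main__":
    for row in counts_with_names(build_demo_db()):
        print(row)
```

The counts table alone holds only numbers; the site and taxon names become visible only after the joins, which is the kind of bookkeeping a user would otherwise have to do by hand.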

PATools is a collection of Paradox 7 programs that eliminates the need to analyze the NAPD data structure and enables analysts to extract information from downloaded files through a graphical user interface of pulldown menus (see figure and Table 1). Selections from the menus lead to other interfaces with sets of pushbuttons for selecting a pollen sum and the pollen taxa to display, or for performing other analytical tasks with the data. The names of the pulldown options are listed in italics and their functions briefly described in the following paragraphs. Although PATools has been developed for use with the NAPD, it can be and has been adapted to other pollen databases of similar structure.


Table             Select           Percent                Chron                   Analog

Compare Tables    Sum Components   Verify Site Entities   View Chronologies       Find Analogs
Convert Degrees   Display Pollen   Pcounts and Percents   Test Chronologies
Number Records    Display Sites    Interpolate Percents   Rank All Chronologies
Number Samples                     Build Percent Report   Convert Ages

Table 1. Pulldown Menus and Selections in PATools


PATools was developed to allow for calculation of pollen percentages from raw counts from one or multiple sites and generation of percentage reports (Pcounts and Percents, Build Percent Report), for the checking and potential modification of the chronologies used at each site (View Chronologies, Test Chronologies), for the interpolation of pollen percentages to selected dates for mapping (Interpolate Percents), and for the calculation of dissimilarity values between selected pollen spectra to identify closely similar samples and analogs (Find Analogs).

The main interface in PATools allows a user to choose the pollen taxa, pollen sum and sites for study (Display Pollen, Sum Components, Display Sites). Sites can be selected from the master site list (Figure 1), or by latitude and longitude, political division, age, and/or type of record (core vs. section vs. surface sample). When these selections have been made, the user can check which sites have all the data needed to generate complete percentage reports and find out what is missing at the sites where information is incomplete (Verify Site Entities).

For those sites with complete sets of tables, PATools will run Pcounts and Percents to generate the percentage reports (Table 2), which list for each site the depth of each sample, its estimated radiocarbon and calendar year age (Convert Ages generates calendar year ages back to 50,000 radiocarbon years by standard algorithms), its total pollen count for the selected pollen sum, and the pollen percentages for selected pollen taxa. As an example, we have used an abbreviated taxon list to display data for the uppermost 15 samples from Paradise Lake in Labrador (Table 2). Following this table are two other tables listing the chronological information used to construct the age model at this site (Tables 3-4). These show the interpolation method and dates used, and can be checked using Test Chronologies and View Chronologies. These three tables (Tables 2-4) are presented in the standardized percentage reports produced by Build Percent Report.

If the pollen percentages and the associated chronology pass inspection, then an Interpolated Percentage Table (Table 5) can be produced (Interpolate Percents). It contains percentages interpolated to selected radiocarbon (or, if selected, calendar) dates at regular intervals, e.g. every 1000 years, for mapping.


Depth  AgeBP  CalBP  Sum   Picea  Abies  Pinus  Betula  Other  Other Trees
cm.    yr.    yr.                                       Herbs  & Shrubs

205      0       0   540   85.74   0.93   0.37  10.00    0.19   2.78
225    778     887   569   86.29   0.70   0.53   8.26    0.88   3.34
252   1829    2085   325   84.00   2.46   0.62  10.77    0.00   2.15
272   2608    2973   371   78.44   2.16   0.54  15.09    0.54   3.23
292   3386    3860   352   58.24   2.84   0.28  30.68    0.57   7.39
302   3776    4304   316   16.14   6.96   1.90  34.49    4.11  36.39
315   4282    4881   322   18.01   1.86   0.62  18.94    3.11  57.45
322   4554    5192   305   24.92   0.00   0.66  27.54    2.30  44.59
332   4943    5635   491    9.16   0.41   3.46  30.35   15.07  41.55
351   5683    6478   323    7.43   1.86   3.41  50.15    8.98  28.17
358   5956    6789   366    5.19   0.00   1.37  55.74    7.10  30.60
378   6727    7632   331    5.44   0.00   6.04  54.38   10.57  23.56
397   7459    8432   252    1.19   0.00   2.78  67.46    8.73  19.84
418   8268    9315   197    0.51   0.00   5.58  57.87   10.66  25.38
438   9039   10156   134    1.49   0.00   3.73  55.22   17.91  21.64

Table 2. Pollen Percentage Table with Depths, Estimated Dates, Pollen Sum and Percents
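The percentage calculation behind a table of this kind can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the PATools code. The raw counts below are not taken from the database: they are values back-calculated by us so that they are consistent with the first row of Table 2 (pollen sum 540), and the function name is hypothetical.

```python
def pollen_percentages(counts, sum_taxa):
    """Return (pollen sum, percentages) for one sample.

    counts   -- dict mapping taxon name to raw count
    sum_taxa -- taxa that make up the pollen sum (the denominator)
    """
    total = sum(counts[t] for t in sum_taxa)
    if total == 0:
        raise ValueError("pollen sum is zero")
    return total, {t: round(100.0 * n / total, 2) for t, n in counts.items()}

# Hypothetical counts, back-calculated to agree with the 205 cm row of Table 2.
COUNTS = {
    "Picea": 463,
    "Abies": 5,
    "Pinus": 2,
    "Betula": 54,
    "Other Herbs": 1,
    "Other Trees & Shrubs": 15,
}

if __name__ == "__main__":
    total, pct = pollen_percentages(COUNTS, list(COUNTS))
    print(total, pct)
```

Here every taxon is included in the pollen sum. Passing a shorter list as sum_taxa would mimic choosing a different pollen sum, in which case taxa outside the sum are still expressed as a percentage of that sum.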





Entity  Chron  Default  Chron           Prepared    Date       Model
No.     No.    Chron    Name            By          Prepared

193     1      Y        COHMAP chron# 4 COHMAP      0000-00-00 linear interpolation

Table 3. Chronology Table



Entity  Chron  Sample  Depth Thickness   Age   Age Up   Age Lo   R
No.     No.    No.     CM                yr.   yr.      yr.      Code

193     1      1       205                 0     50      -50     TOP
193     1      2       359              5995   6050     5940     C14
193     1      3       458              9810   9930     9690     C14

Table 4. Age Basis Table with Radiocarbon Dates and their Depths and Core Top Date
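As a worked illustration of the "linear interpolation" model named in Table 3, the sketch below estimates sample ages from the three depth-age control points in Table 4. It is a minimal Python sketch, not the PATools implementation, and the function name is hypothetical. The ages listed in Table 2 appear to agree with the interpolated values truncated to whole years; that truncation is our inference from the numbers, not something stated in the text.

```python
def age_at_depth(depth, age_basis):
    """Linearly interpolate an age (yr BP) for a depth (cm).

    age_basis -- list of (depth_cm, age_yr) control points
    """
    pts = sorted(age_basis)
    for (d0, a0), (d1, a1) in zip(pts, pts[1:]):
        if d0 <= depth <= d1:
            return a0 + (depth - d0) * (a1 - a0) / (d1 - d0)
    raise ValueError("depth lies outside the dated interval")

# Control points from Table 4: core top and two radiocarbon dates.
AGE_BASIS = [(205, 0), (359, 5995), (458, 9810)]

if __name__ == "__main__":
    for depth in (225, 252, 358, 438):
        # int() truncates toward zero, which matches the AgeBP column of Table 2.
        print(depth, int(age_at_depth(depth, AGE_BASIS)))
```

Adding, removing or changing a control point in AGE_BASIS and rerunning the loop gives a rough idea of what an interactive chronology test does: every sample age between the affected control points shifts at once.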


Entity  Depth   AgeBP   CalBP  Sum  Picea  Abies  Pinus  Betula  Other  Other Trees
No.     CM      yr.     yr.                                      Herbs  & Shrubs

193     205.00     0       0   100  85.74   0.93   0.37  10.00    0.19   2.78
193     230.70  1000    1140   100  85.81   1.07   0.55   8.79    0.69   3.09
193     256.39  2000    2280   100  82.78   2.39   0.60  11.72    0.12   2.39
193     282.08  3000    3420   100  68.26   2.50   0.41  22.95    0.55   5.33
193     307.75  4000    4559   100  16.97   4.70   1.33  27.61    3.67  45.72
193     333.46  5000    5700   100   9.03   0.52   3.46  31.87   14.60  40.52
193     359.14  6000    6837   100   5.21   0.00   1.63  55.66    7.30  30.20
193     385.09  7000    7930   100   3.85   0.00   4.82  59.26    9.89  22.18
193     411.04  8000    9022   100   0.73   0.00   4.65  61.05   10.02  23.55
193     436.99  9000   10113   100   1.44   0.00   3.83  55.36   17.54  21.83

Table 5. Interpolated Percentage Table
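The interpolation to a target date can be illustrated with a short Python sketch that interpolates depth and percentages linearly between the two samples bracketing the target age. It is offered as an illustration under that assumption rather than as the PATools algorithm; the function name and data layout are hypothetical. With the 225 cm and 252 cm samples from Table 2, it gives values for 1000 yr BP that agree with the corresponding row of Table 5 (depth 230.70 cm, Picea 85.81, Betula 8.79).

```python
def interpolate_at_age(target_age, samples):
    """Interpolate depth and percentages at target_age.

    samples -- list of (age_yr, depth_cm, {taxon: percent}), sorted by age
    """
    for (a0, d0, p0), (a1, d1, p1) in zip(samples, samples[1:]):
        if a0 <= target_age <= a1:
            w = (target_age - a0) / (a1 - a0)  # weight of the older sample
            depth = d0 + w * (d1 - d0)
            pct = {t: p0[t] + w * (p1[t] - p0[t]) for t in p0}
            return depth, pct
    raise ValueError("target age is not bracketed by two samples")

# Two rows of Table 2 (only two taxa shown to keep the example short).
SAMPLES = [
    (778, 225.0, {"Picea": 86.29, "Betula": 8.26}),
    (1829, 252.0, {"Picea": 84.00, "Betula": 10.77}),
]

if __name__ == "__main__":
    depth, pct = interpolate_at_age(1000, SAMPLES)
    print(round(depth, 2), {t: round(v, 2) for t, v in pct.items()})
```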

Part of PATools is dedicated to the study of chronologies. The user can analyze the chronology at a site by viewing the chronology and age basis tables (View Chronologies; Tables 3-4), by ranking the quality of the chronology for a given time slice according to the number and proximity of bracketing dates (Rank All Chronologies), and by testing and modifying chronologies through an interface (Test Chronologies) that displays simple age-depth plots and instantly shows how adding, removing or modifying dates affects the estimated ages for pollen samples.

PATools also uses dissimilarity measures to compare pollen samples and to find potential analogs (Find Analogs). Given pollen samples in two tables, a master "analog" table and a sample table, for which pollen percentages have been calculated using the same pollen sum, PATools will use a selected dissimilarity measure and threshold dissimilarity value to find the samples in the master table that closely match each sample in the sample table. The user can select from eight measures of dissimilarity (Canberra metric, information statistic, Manhattan metric, squared chi squared coefficient, squared chord distance, squared cosine-theta distance, squared Euclidean distance and squared standardized Euclidean distance) and can accept a default threshold value or choose a new one.
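One of the listed measures, the squared chord distance, can be sketched in Python as follows, together with a simple threshold search over a master table. This is an illustration, not the PATools code: the function names are hypothetical, the spectra are expressed as proportions (percentages divided by 100), which is an assumption on our part, and the threshold of 0.15 is an arbitrary value chosen for the example rather than the PATools default. The three spectra are rows of Table 2.

```python
import math

def squared_chord_distance(p, q):
    """Sum over taxa of (sqrt(p_i) - sqrt(q_i))**2 for two spectra of proportions."""
    taxa = set(p) | set(q)
    return sum((math.sqrt(p.get(t, 0.0)) - math.sqrt(q.get(t, 0.0))) ** 2
               for t in taxa)

def find_analogs(sample, master, threshold):
    """Return (name, distance) for master spectra within threshold, closest first."""
    hits = []
    for name, spectrum in master.items():
        d = squared_chord_distance(sample, spectrum)
        if d <= threshold:
            hits.append((name, d))
    return sorted(hits, key=lambda hit: hit[1])

def to_proportions(percents):
    """Convert a dict of percentages to proportions."""
    return {t: v / 100.0 for t, v in percents.items()}

TAXA = ["Picea", "Abies", "Pinus", "Betula", "Other Herbs", "Other Trees & Shrubs"]

# Rows of Table 2: 225 cm as the sample, 205 cm and 302 cm as the "master" table.
SAMPLE = to_proportions(dict(zip(TAXA, [86.29, 0.70, 0.53, 8.26, 0.88, 3.34])))
MASTER = {
    "205 cm": to_proportions(dict(zip(TAXA, [85.74, 0.93, 0.37, 10.00, 0.19, 2.78]))),
    "302 cm": to_proportions(dict(zip(TAXA, [16.14, 6.96, 1.90, 34.49, 4.11, 36.39]))),
}

if __name__ == "__main__":
    print(find_analogs(SAMPLE, MASTER, threshold=0.15))
```

With these spectra, the spruce-dominated 205 cm sample falls well within the threshold, while the 302 cm sample, in which Picea is much reduced, does not.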

In order to check the impact of recent updates to tables in NAPD, PATools has a utility program to compare tables (Compare Tables). These comparisons will reveal differences between the new and old tables for a site if new data have been added, if old data have been deleted, or if tables have been corrected or modified. PATools can also convert between pseudo-decimal and decimal latitude/longitude formats (Convert Degrees), and it can number records and number samples (groups of records) within a table for easier subsequent access to the information (Number Records, Number Samples).
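The kind of comparison described for Compare Tables can be sketched as a keyed difference between an old and a new version of a table. The Python below is a minimal illustration under the assumption that each record is identified by its key fields; the function name, the (entity number, sample number) key and the record contents are hypothetical and do not come from the NAPD.

```python
def compare_tables(old, new):
    """Report keys that were added, deleted or modified between two tables.

    old, new -- dicts mapping a key (tuple of key-field values) to a record
    """
    added = sorted(k for k in new if k not in old)
    deleted = sorted(k for k in old if k not in new)
    modified = sorted(k for k in old if k in new and old[k] != new[k])
    return {"added": added, "deleted": deleted, "modified": modified}

# Hypothetical records keyed by (entity number, sample number).
OLD = {(193, 1): {"depth": 205}, (193, 2): {"depth": 225}, (193, 3): {"depth": 250}}
NEW = {(193, 1): {"depth": 205}, (193, 3): {"depth": 252}, (193, 4): {"depth": 272}}

if __name__ == "__main__":
    print(compare_tables(OLD, NEW))
```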

PATools works within the Borland Paradox 7 database management application program for Windows 95 and Windows NT on IBM-compatible computers, and it requires only rudimentary knowledge of Paradox. The tool kit is not fully documented and requires a short period of experimentation. (A user guide is being written.) Requests for copies of PATools and questions concerning its use can be sent to Phillip_Leduc@Brown.edu.

For those who do not own a copy of Paradox 7, the Borland international web site, at www.borland.com/bww, suggests where to buy Borland products. In the United States, Surplus Direct, based in Hood River, Oregon, sells Paradox 7 at an academic discount for about $90. To inquire, visit their web site at www.surplusdirect.com, then select "Academic" and then "Database", or call (+)1-800-753-7877.


Copyright © 1998 Leduc, Phillip L., Williams, John W., and Webb III, Thompson