The problem
Natural history museum collection databases typically contain geospatial coordinates (latitude-longitude or UTM) for fewer than five percent of their specimens. Instead, most specimen localities, especially historical ones, are recorded as text, for example, "2 mi S of Bakersfield." Without spatial coordinates these data are effectively unavailable to geospatial analyses, whether simple mapping or distribution modeling.



The process of deriving spatial coordinates from textual locality descriptions can be called retrospective georeferencing. Retrospective georeferencing can be quite laborious, because antiquated and often imprecise textual locality descriptions must be interpreted and converted into latitude and longitude coordinates. Reducing a vague locality description to a single, subjectively placed point creates the problem of false precision, often to a scientifically unacceptable degree. On the positive side, it is a process that only needs to be done once.

To date, different methods have been used for retrospective georeferencing, including automated gazetteer lookups and software packages that identify latitude and longitude on basemaps (such as MapTech’s Terrain Navigator). Since these methods involve the subjective placement of a single (x,y) location point by the user, they entail differing degrees of overgeneralization, false precision, and time spent. The California Academy of Sciences (CAS) has developed a georeferencing tool that facilitates the same process, using desktop ArcView GIS software.

The project
This site describes the California Academy of Sciences' ongoing retrospective georeferencing project, and provides a downloadable version of our georeferencing tool designed as an extension to ArcView 3.2 software.

The tool provides an interface for the retrospective georeferencing process. Within ArcView 3.2, it lets a user display a filtered and sorted list of textual localities in a table floating over a digital base map. The user can then select a single locality from the displayed list and draw its "footprint" as a spatial object. When the user is satisfied with the footprint's shape and position, it is saved as a spatial object in a separate locality (GIS) database.



These shapes, stored in ArcView’s shapefile format, are used to derive two values: the centroid of the shape expressed as latitude and longitude in decimal degrees, and the shape’s span. The span provides a quantitative expression of the textual locality’s vagueness; imprecise localities will tend to be drawn with large shapes having large spans, and precise localities will tend to be drawn with smaller shapes having smaller spans. This relative measure improves on methods that express coordinates’ precision levels as subjective categories.
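
As a rough illustration of the two derived values, the following Python sketch computes a centroid and a span for a footprint given as a list of (longitude, latitude) vertices. It is not the tool's own code, which runs inside ArcView 3.2. The function names are hypothetical, and the definition of span used here (the greatest great-circle distance between any two vertices, in kilometers) is an assumption made for the example; the tool may define span differently. The centroid calculation treats degrees as planar coordinates, which is only reasonable for small footprints.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as (lon, lat) pairs.

    Degrees are treated as planar coordinates, so this is only a reasonable
    approximation for small footprints away from the poles.
    """
    # Shift coordinates so the first vertex is the origin; this reduces
    # floating-point cancellation when coordinates are large and the
    # polygon is small.
    ox, oy = vertices[0]
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i][0] - ox, vertices[i][1] - oy
        x1, y1 = vertices[(i + 1) % n][0] - ox, vertices[(i + 1) % n][1] - oy
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon (zero area)")
    return ox + cx / (3.0 * area2), oy + cy / (3.0 * area2)


def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1 = map(math.radians, p)
    lon2, lat2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def footprint_span_km(vertices):
    """Assumed span: the largest distance between any two vertices."""
    return max(haversine_km(p, q) for p, q in combinations(vertices, 2))


# Hypothetical footprint: a 0.1-degree square (illustrative coordinates only).
footprint = [(-119.05, 35.30), (-118.95, 35.30),
             (-118.95, 35.40), (-119.05, 35.40)]
lon, lat = polygon_centroid(footprint)
print(f"centroid: {lat:.4f}, {lon:.4f}")
print(f"span: {footprint_span_km(footprint):.1f} km")
```

Under this assumed definition, a vague locality drawn as a county-sized footprint would yield a span of tens of kilometers, while a precisely described locality drawn as a small shape would yield a span of a few hundred meters, so the two can be compared numerically rather than by category.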

Overall, this tool improves on previous retrospective georeferencing methods by a) increasing speed, b) maximizing consistency between users, c) allowing incorporation of interpretation standards established by collection managers, and d) quantifying textual localities’ vagueness.

Two downloads of the tool are available. One is a project set up with sample base maps and sample locality data, allowing exploration of the tool. The other contains the tool without base maps, allowing users to set up customized projects for geographic areas of interest.

We are interested in feedback regarding this georeferencing tool. Please contact Stan Blum at the California Academy of Sciences for more information or with comments.




©2001 California Academy of Sciences