Oregon State University

In the biology curriculum, species classification and identification are two important topics that help students understand ecology and evolution. Classification is the process of defining and naming classes of organisms based on the similarity of their attributes, while identification is the process of assigning a specimen to a predefined class. Both classification and identification rely on comparative descriptions framed in terms of a list of characters. Each character is a set of states that distinguish among the organisms. The states may be discrete (e.g., number of antennae) or continuous (e.g., width of head). A series of characters with specified state values is organized into identification rules, also called diagnostic keys, which have been in use for more than 200 years. In practice, using a diagnostic key often requires that users fully understand both the species characteristics and the state values, so it is usually a difficult and frustrating task for non-experts. Traditional diagnostic keys are designed to mimic the thinking of professional taxonomists, who can easily carry out identifications because they are already familiar with the characteristics used in the keys.
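To make the terminology concrete, the relationship between characters, states, and a key can be sketched as a small data structure. This is only an illustration: the taxon names, character names, and value ranges below are hypothetical, and real diagnostic keys are usually organized as a sequence of branching questions rather than the flat lookup assumed here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A state is either a discrete value (e.g. 2 antennae) or a closed
# numeric range for a continuous character (e.g. head width in mm).
State = Union[int, str, Tuple[float, float]]


@dataclass
class Taxon:
    name: str
    states: Dict[str, State]  # character name -> expected state


def matches(expected: State, observed) -> bool:
    """Check one observed value against one expected state."""
    if isinstance(expected, tuple):  # continuous character: range test
        low, high = expected
        return low <= observed <= high
    return expected == observed  # discrete character: exact match


def identify(key: List[Taxon], specimen: Dict[str, object]) -> List[str]:
    """Return the taxa consistent with every character observed so far."""
    return [
        taxon.name
        for taxon in key
        if all(
            matches(taxon.states[character], value)
            for character, value in specimen.items()
            if character in taxon.states
        )
    ]


# Hypothetical key, invented purely for illustration.
KEY = [
    Taxon("taxon A", {"antennae": 2, "head_width_mm": (1.0, 2.0)}),
    Taxon("taxon B", {"antennae": 2, "head_width_mm": (2.5, 4.0)}),
    Taxon("taxon C", {"antennae": 0, "head_width_mm": (1.0, 2.0)}),
]

candidates = identify(KEY, {"antennae": 2, "head_width_mm": 1.5})
```

Each additional observed character narrows the candidate list. The sketch also shows why such keys are hard for non-experts: the user must already know how to recognize and measure every character that the key asks about.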

With the increasing use of computer technology, many traditional diagnostic keys have been digitized and made available online (e.g., BugBytes - http://ipmnet.org/ent3/bugbytes/ and Lichen Synoptic Key - http://ocid.nacse.org/lichenland/synopticKey/index.php). Although a computer-based tool makes identification more effective when used interactively, the information used to build the key system is still a text-based description. Some identification web sites attach an image hyperlink to each attribute so users can compare the description with a real image, but the core search engine is still based on text matching against a diagnostic key. Image analysis and image pattern recognition have not been implemented in any computer-based biological identification system.

The goal of this project is to develop an online biological identification system using a method based on signature pattern matching. A signature pattern is obtained by detecting line edges within a rectangular box specified by the user who wants to identify a specimen. Querying image data by finding an object inside a target image is desirable, yet it remains a challenge.
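The text does not specify how the signature is computed, so the following is only a rough sketch of one possible reading. It assumes a grayscale image stored as a list of rows, a simple forward-difference edge detector, a signature defined as the count of edge pixels per column inside the user's box, and a sum-of-absolute-differences comparison. All function names, the `threshold` parameter, and the `(top, left, height, width)` box layout are illustrative assumptions, not the project's actual method.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, pixel values 0..255
Box = Tuple[int, int, int, int]  # (top, left, height, width), assumed layout


def crop(image: Image, box: Box) -> Image:
    """Cut the user-specified rectangle out of the image."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def edge_map(region: Image, threshold: int = 50) -> Image:
    """Mark pixels whose forward-difference gradient exceeds the threshold."""
    h, w = len(region), len(region[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = region[y][x + 1] - region[y][x]  # horizontal change
            gy = region[y + 1][x] - region[y][x]  # vertical change
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges


def signature(image: Image, box: Box, threshold: int = 50) -> List[int]:
    """Assumed signature: number of edge pixels in each column of the box."""
    edges = edge_map(crop(image, box), threshold)
    return [sum(column) for column in zip(*edges)]


def distance(sig_a: List[int], sig_b: List[int]) -> int:
    """Sum of absolute differences; smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))


# Synthetic 6x6 test image: dark left half, bright right half.
step_image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
flat_image = [[128] * 6 for _ in range(6)]
full_box = (0, 0, 6, 6)

step_sig = signature(step_image, full_box)
flat_sig = signature(flat_image, full_box)
```

In a system of this kind, the signature computed from the user's box would be compared against stored signatures of reference specimens, with the smallest distance suggesting the closest match. A practical implementation would also need to handle differences in scale, rotation, and lighting, which this sketch ignores.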