NSF Funding Abstract

Filtered Push: Continuous Quality Control for Distributed Collections and Other Species-Occurrence Data. NSF Award 0960535

Harvard University is awarded a grant to develop a networked solution that enables annotation of distributed biological collection data and the sharing of assertions about their quality or usability. Internet queries posed to multiple datasets may yield varying results depending on the suitability or quality of the targeted data. In some cases it might be possible to consult experts or software agents that can help determine fitness for use; in other cases such experts or agents might already have recorded an assessment of the data. However, that information is not typically available to the originator of the query. The proposed system will make these value-added assertions accessible to the end users of biodiversity datasets.

The Filtered Push network uses natural science collections as a reference implementation for a cyberinfrastructure with which any community can render expert opinions about the quality of data and the fitness for use of a data set or a subset of records. The emergent knowledge base of the Filtered Push network gives interested parties immediate or historical access to these annotations, filtered by criteria that express their interests. The network can also automatically execute scientific workflows triggered by expert commentary, by the introduction or discovery of new data, or by a change in scientific viewpoints. As with the annotations, the outputs of such workflows can be distributed to interested parties, whether software or human. Filtered Push networks therefore allow for continuous quality control by the scientific community, based on human expertise, statistical or logical machine reasoning, or advances in the domain science itself. The Filtered Push project maintains a wiki at http://www.etaxonomy.org/mw/FilteredPush.

This project is part of a 10-year effort to digitize and mobilize the scientific information associated with biological specimens held in U.S. research collections. The images and digitized data from this project will be integrated into the online national resource as outlined in the community strategic plan available at http://digbiocol.files.wordpress.com/2010/05/digistratplanfinaldraft.pdf.

