From Filtered Push Wiki

FP Functional Goals

  • Distributed annotation of distributed data.
    • Agnostic about domain vocabulary for the assertions of the annotation.
    • Annotations "pushed to" (matched to interests of) interested parties, especially original data publisher.
    • Easy for data publishers to act on assertions of annotations (e.g. error correction, addition of new data, store and act on relations such as botanical duplicate determination).
    • Preserve knowledge asserted in annotations.
  • Quality control of distributed data.
    • Kepler Workflow system can launch annotations (cf. Kepler Kuration package, as external client or embedded).
    • Annotations can suggest invocation of workflows (with triggers, this is "Continuous Quality Control").
    • Embedded (Kepler) Workflow system can run quality control processes on harvested data.
    • Embedded workflow system can run clustering processes on harvested data.
  • Find Duplicates (domain-specific use case)
    • Cluster potential Duplicate Botanical specimen records in harvested data.
    • Allow annotation of clusters and creation of consensus record view of a cluster.
    • Provide client UI mechanisms to rapidly retrieve and incorporate data from consensus while entering data.
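
The Find Duplicates goals above (cluster potential duplicates, then build a consensus view of a cluster) can be illustrated with a minimal sketch. This is not the Filtered Push implementation. It assumes, for illustration only, that records are plain dictionaries keyed by Darwin Core terms (recordedBy, recordNumber, eventDate), that two records are potential botanical duplicates when collector and collector number agree after crude normalization, and that the consensus value of a field is simply its most frequent non-empty value. The function names normalize, cluster_duplicates, and consensus are hypothetical.

```python
from collections import Counter, defaultdict


def normalize(value):
    """Lowercase and keep only letters and digits, so 'A. Gray' and 'A Gray' compare equal."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def cluster_duplicates(records):
    """Group records sharing a normalized (collector, collector number) key.

    Assumption: this exact-key rule stands in for whatever clustering
    process a real workflow would run. Only groups of two or more
    records are returned.
    """
    clusters = defaultdict(list)
    for rec in records:
        key = (normalize(rec.get("recordedBy", "")),
               normalize(rec.get("recordNumber", "")))
        if all(key):  # skip records missing either part of the key
            clusters[key].append(rec)
    return [group for group in clusters.values() if len(group) > 1]


def consensus(cluster, fields):
    """Build a consensus view: the most frequent non-empty value per field."""
    view = {}
    for name in fields:
        values = [rec[name] for rec in cluster if rec.get(name)]
        if values:
            view[name] = Counter(values).most_common(1)[0][0]
    return view
```

A data-entry client could call consensus on the cluster matching the specimen being entered and prefill the fields it returns, leaving the operator to confirm or correct them.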

Annotation Data Model

Major Components Involved in Annotation

(c) Paul J. Morris; CC-BY-SA; GNU FDL. Shows the path of an annotation from a source client to a consuming client. Does not include the OAI-PMH data harvesting path.
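
As a rough illustration of how an annotation might be "pushed to" (matched to the interests of) interested parties on its way from a source client to a consuming client, here is a minimal sketch. It is not the FP network code or the annotation data model. The Annotation and Interest classes, their fields, and the route function are hypothetical, and the matching rule (every criterion must equal the corresponding value describing the annotated record) is an assumption chosen for brevity. The annotation body is treated as opaque, in keeping with the goal of staying agnostic about the domain vocabulary of the assertions.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    annotator: str
    target: dict  # key/value description of the annotated record
    body: dict    # the assertions themselves; never inspected here


@dataclass
class Interest:
    party: str
    criteria: dict  # every key/value pair must match the annotation target

    def matches(self, annotation):
        return all(annotation.target.get(key) == value
                   for key, value in self.criteria.items())


def route(annotation, interests):
    """Return the parties to notify, in registration order, without repeats."""
    parties = []
    for interest in interests:
        if interest.matches(annotation) and interest.party not in parties:
            parties.append(interest.party)
    return parties
```

Under these assumptions, the original data publisher would register an interest keyed on its own institution code, so that any annotation targeting its records reaches it regardless of where the annotation was created.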

Current Efforts

  • ApplePie: FP network architecture (cf. AppleCore group; best practices for botanical uses of DwC).
  • FP Lite: Annotation generator.

Applicability to phenomix concerns

  • Ontologies - A wide range of functionality for creating and developing ontologies (e.g. anatomical) that are exportable to OBO formats
    • Little point; Community tools are mature
  • Phylogenetic matrices - A wide range of support for matrix development, including utilities for coding very large matrices (e.g. 1000x1000)
    • Connections to specimens; CQC annotations invoking workflows?
  • Specimen metadata - Museum level specimen curation
    • Good fit now. Currently deploying several ApplePie networks
  • DNA workbench - Audit trails from specimen to sequence including generation of PCR worksheets and FASTA import
    • Maybe FP client?
  • Taxonomic catalogs - Data managed, updated, then presentable and exportable to various formats (e.g. ITIS)
  • Taxon pages/treatments - Customize templates then add dynamic (e.g. matrix based descriptions) or text content and figures
    • Ditto ETC
  • Biological associations - For example, cataloging host-parasite records
    • Good fit, especially for QA/QC
  • Multiple-entry and bifurcating keys
    • Good fit, especially for QA/QC; ETC is also in this space