ApplePie

From Filtered Push Wiki



About Apple Pie

Apple Pie is a specific implementation of the Filtered Push architecture designed to support the management of botanical specimen data. Special emphasis is placed on

  • re-use of curation effort by sharing that effort across holders of botanical duplicate sheets,
  • continuous data quality control by notification to cooperating specimen data holders of putative errors in their data, along with semi-automatic correction of the data under supervision of collection managers,
  • notification to interested parties of new taxonomic determinations of specimens, along with the provenance of those determinations.

Apple Pie Rules

ApplePieRules

Non-FP Software components of Apple Pie

  • Specify 6
  • Morphbank
  • Symbiota
  • DataONE
  • FP Reference Implementation Web Client

High-level and UML overviews of interactions among FP and non-FP software and services

Specify-Symbiota-MorphBank-DataONE-FP.jpg

Draft UML representation

1. Specify sends specimen data to Symbiota

Symbiota harvests new and modified specimen records from Specify in CSV or Darwin Core Archive (DwC-A) format via existing support scripts and protocols:

  • SQL with direct database-to-database connection
  • CSV upload from Symbiota web form

http://symbiota.org/tiki/tiki-index.php?page=Data+Interoperability
http://symbiota.org/tiki/tiki-index.php?page=Specimen+Upload+Procedure

Symbiota mentions interoperability with Specify6 here: http://symbiota.org/tiki/tiki-index.php?page=Specimen+Management ; however, it gives no procedural details. Since I am unaware of DiGIR support in Specify6, I assume the most direct procedure is to use the Specify client's QueryBuilder to create a query, export its result set as a spreadsheet, and upload that spreadsheet to the Symbiota instance through its web interface.
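
As a rough illustration of the export-and-upload route assumed above, the sketch below renames the columns of a spreadsheet exported from a Specify query to Darwin Core term names before upload. The Specify column headings in COLUMN_MAP are hypothetical (they depend on how the query is built), and the fields Symbiota actually expects should be checked against its upload documentation.

```python
import csv
import io

# Hypothetical mapping from column headings in a Specify query export
# to Darwin Core term names. The left-hand headings are assumptions.
COLUMN_MAP = {
    "Catalog Number": "catalogNumber",
    "Full Name": "scientificName",
    "Collector": "recordedBy",
    "Start Date": "eventDate",
}


def to_darwin_core_csv(exported_csv: str) -> str:
    """Rename mapped columns, drop unmapped ones, and return CSV text."""
    reader = csv.DictReader(io.StringIO(exported_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(COLUMN_MAP.values()),
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Columns missing from the export are written as empty cells.
        writer.writerow({dwc: row.get(src, "") for src, dwc in COLUMN_MAP.items()})
    return out.getvalue()
```

Calling to_darwin_core_csv on the text of an exported file returns CSV text whose header row uses the Darwin Core names, ready to be saved and uploaded through the web form.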

2. Specify sends specimen images and data to MorphBank

The Specify client's MorphBank plugin uses the MorphBank RESTful HTTP API to upload new specimen images and data in MorphBank's XML schema.


3. Symbiota user authentication with DataONE

Modifications to Symbiota's web client would use the DataONE authentication API to obtain an authentication token that could be included in annotation messages sent to FP.


4. MorphBank user authentication with DataONE

Modifications to MorphBank's web client would use the DataONE authentication API to obtain an authentication token that could be included in annotation messages sent to FP.


5. Symbiota to FP


Annotation Messages

Here are two options:

  • Modifications to Symbiota's web client would add a new form that lets users create annotation messages, which are sent to FP via an API over HTTP in FP's XML annotation message schema.
  • Modifications to Symbiota's web client would generate HTML with DwC microformat cues so that an FP browser plugin can create annotation messages, which are sent to FP via an API over HTTP in FP's XML annotation message schema. See http://en.wikipedia.org/wiki/Microformat
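
Under either option, the client ends up serializing an annotation as XML and sending it to FP over HTTP. A minimal sketch of the serialization step follows. Every element name in it (annotationMessage, authToken, target, assertion, term, value) is a placeholder chosen for illustration, not FP's actual annotation message schema, and the token is assumed to have been obtained from DataONE beforehand.

```python
import xml.etree.ElementTree as ET


def build_annotation_message(token: str, catalog_number: str,
                             term: str, proposed_value: str) -> str:
    """Serialize one annotation as an XML string.

    All element names are hypothetical placeholders, not FP's schema.
    """
    root = ET.Element("annotationMessage")
    # Authentication token to be verified by the receiver.
    ET.SubElement(root, "authToken").text = token
    # The specimen record the annotation is about.
    target = ET.SubElement(root, "target")
    ET.SubElement(target, "catalogNumber").text = catalog_number
    # The proposed new value for one field of that record.
    assertion = ET.SubElement(root, "assertion")
    ET.SubElement(assertion, "term").text = term
    ET.SubElement(assertion, "value").text = proposed_value
    return ET.tostring(root, encoding="unicode")
```

The returned string would form the body of the HTTP request to FP. ElementTree escapes reserved characters such as & and < in the text values, so user input from a form can be passed in directly.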


Specimen Data

Here are four options for an FP search of "global knowledge":

  • FP performs a federated search with a direct connection to the Symbiota database.
  • FP performs a federated search over http via a new Symbiota search API.
  • FP harvests records from Symbiota via a modification of their ingest mechanism that would send new data to FP.
  • FP harvests records from Symbiota via a new Symbiota harvest API.

http://www.morphbank.net/schema/API1.html

6. MorphBank to FP

 

Annotation Messages

Here are two options:

  • Modifications to MorphBank's web client would add a new form that lets users create annotation messages, which are sent to FP via an API over HTTP in FP's XML annotation message schema.
  • Modifications to MorphBank's web client would generate HTML with DwC microformat cues so that an FP browser plugin can create annotation messages, which are sent to FP via an API over HTTP in FP's XML annotation message schema. See http://en.wikipedia.org/wiki/Microformat

Specimen Data

Here are four options for an FP search of "global knowledge":

  • FP performs a federated search with a direct connection to the MorphBank database.
  • FP performs a federated search over http via the existing MorphBank search API.
  • FP harvests records from MorphBank via a modification of their ingest mechanism that would send new data to FP.
  • FP harvests records from MorphBank via a new MorphBank harvest API.


7. FP authentication verification with DataONE

FP would use DataONE's authentication API to verify the authentication token in annotation messages received from Symbiota or MorphBank.

http://mule1.dataone.org/ArchitectureDocs-current/design/UseCases/12_uc.html
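
The verification step might look roughly like the sketch below. The call to DataONE is represented by an injected verify_token function because the actual DataONE API call is not specified here; the authToken element name is likewise a hypothetical placeholder rather than part of FP's real message schema.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def accept_annotation(message_xml: str,
                      verify_token: Callable[[str], bool]) -> bool:
    """Return True only if the message carries a token that verifies.

    `verify_token` stands in for a request to DataONE's authentication
    API (assumption); `authToken` is a hypothetical element name.
    """
    try:
        root = ET.fromstring(message_xml)
    except ET.ParseError:
        return False  # malformed XML is rejected outright
    token = root.findtext("authToken")
    if not token:
        return False  # missing or empty token
    return bool(verify_token(token))
```

Passing the verifier in as a parameter keeps the message handling testable without a live DataONE service; a stub that accepts a fixed token is enough for local tests.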


8. FP sends notifications to Specify

Here are two options for Specify to receive notification messages from FP:

  • The Specify client is extended with a new plugin that polls FP over HTTP, using an FP API, for new messages in an FP XML notification message schema.
  • The Specify client is extended with a new plugin that opens a direct TCP/IP connection to FP and listens for new messages in an FP XML notification message schema, over HTTP using an FP API.
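
The polling option can be sketched as a simple loop. This sketch is written in Python for brevity and is not tied to Specify's own plugin mechanism. The fetch_new function is a placeholder for an HTTP request to a hypothetical FP notification endpoint that returns only messages not yet delivered; no such endpoint is defined in this document.

```python
import time
from typing import Callable, Iterable


def poll_for_notifications(fetch_new: Callable[[], Iterable[str]],
                           handle: Callable[[str], None],
                           interval_seconds: float,
                           max_polls: int) -> int:
    """Poll up to `max_polls` times, passing each message to `handle`.

    `fetch_new` is a stand-in for a request to a (hypothetical) FP
    notification API. Returns the number of messages handled.
    """
    handled = 0
    for attempt in range(max_polls):
        for message in fetch_new():
            handle(message)
            handled += 1
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)  # wait before the next poll
    return handled
```

Polling needs only outbound HTTP from the Specify client, which tends to be easier to deploy behind institutional firewalls than a long-lived listening connection, at the cost of a delay of up to one polling interval.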


9. FP sends search results to Specify

 

Specify sends search request to FP

The search request message sent to FP is omitted from the diagram for lack of space for arrows. The Specify client is extended with a new plugin that sends a search request in an FP XML query message schema over HTTP using an FP API.

Specify receives search results

The Specify client is modified to add a feature that can parse, display, and ingest search results received from the FP search API in an FP XML search results message schema.
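
To make the request and response round trip concrete, here is a minimal sketch of building a query message and parsing a results message. The element and attribute names (searchRequest, criterion, term, searchResults, record) are invented for illustration and are not FP's actual query or results schemas; the field names inside each record are assumed to be Darwin Core terms.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_search_request(criteria: Dict[str, str]) -> str:
    """Serialize search criteria as XML (hypothetical schema)."""
    root = ET.Element("searchRequest")
    for term, value in criteria.items():
        ET.SubElement(root, "criterion", {"term": term}).text = value
    return ET.tostring(root, encoding="unicode")


def parse_search_results(results_xml: str) -> List[Dict[str, str]]:
    """Turn each <record> element into a dict of field name to value."""
    root = ET.fromstring(results_xml)
    records = []
    for record in root.findall("record"):
        # Empty elements are represented as empty strings.
        records.append({field.tag: (field.text or "") for field in record})
    return records
```

The list of dictionaries returned by parse_search_results is the kind of intermediate structure a client could display to the user and then map onto its own database fields for ingest.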

Sequence diagram of alternatives resulting from a DataCurator's processing of an Annotation


Lower level views of architecture


Components of ApplePie instance of FilteredPush Network
Cartoon of Components of the ApplePie instance of a FilteredPush Network for the NEVP TCN
Cartoon of Components involved in the ingestion of an annotation from the ApplePie instance of a FilteredPush Network for the NEVP TCN into a Specify6 database.


Messaging and Notifications

FP Messages

Notifications

Schedule

  • SPNHC demo week of 2012-Jun-10
  • Project complete 2013-Apr-01

Outline of next steps

Roadmap

The Ecosystem

  1. identify the main non-FP software components (see above)
  2. obtain specifics for the high-level flow of data among FP and the non-FP software components (see above)
  3. synthesize documents on the wiki; include the requirements list; include a review of the requirements list with some specific demonstration scenarios; relate demo requirements to specific use cases (Paul and Maureen by 2011-Nov-01)
  4. identify the data object types that will be the basis for searches and search results (DISCUSS AND ASSIGN ON 2011-NOV-01)
  5. for each of the data object types mentioned above, identify a format for representation (Paul)
  6. set up a test environment
    1. for each component that we can download and install locally, obtain the component and create documentation and scripts for installing and running
      1. Specify (Maureen will create the documentation and scripts for Specify over the next week)
      2. Symbiota
    2. for components we can't download and install locally, obtain connection information and credentials for testing
      1. MorphBank
      2. DataONE
    3. for each component that requires sample data in order to test it, obtain test data, and create documentation and scripts for loading sample data
      1. Specify
      2. Symbiota
    4. for each component, create scriptable tests that verify proper installation of each component of the test environment.
  7. analyze the non-FP components and identify changes or protocols needed
  8. stub out the interfaces between components; start with tests

The Mapper

  1. figure out the content and structure of the annotation
  2. pick some typical cases of annotations and databases to test whether the current design of the mapper works, especially for the key steps. If it doesn't work, we need to change the design; otherwise we can go on to the next step.
  3. write a charter for the mapper
  4. make detailed design from the conceptual design
  5. implementation and test
  6. integration with FP network

Instructions for Developers

Here are some notes on how to build, install, and run some of the non-FP open-source software involved in Apple Pie.

Deployment

Initial FP Network

Ontologies

Use: Namespace_Convention_in_AO

See: AOD_Extension_of_AO_for_Data

Specify

Installing Specify

Symbiota

Installing Symbiota

DataONE

DataONE Overview

Authentication API

Authentication Overview

MorphBank

Installing Morphbank

Redstore

Triple store suitable for testing, but not production deployment.

RedStore Installation on firuta

Annotation Generator

DeterminationToRDF Tool

Rdf Handler

Job Implementation

Implementing a HelloWorld service