2010Oct02

From Filtered Push Wiki


Meeting with Apiary team

User: Amanda Neill | User: Paul J. Morris | User: James Macklin | User: Jason Best | User: Bill Moen



Apiary term: BasisOfAnnotation: online cumulative evidence, specific specimen, unknown.

New determinations: most elements already exist in dwc identifications. Apiary has a parallel annotation class (identified by = annotated by, etc.). The suggestion to dwc is to broaden identification to cover all annotations. John W. thinks this is plausible. Discussion: this would be more extensible if treated as a higher-level generalization, e.g. able to have motivations out of MRTG and TCS with content out of DWC, or able to annotate in different domains (sensor data).

Apiary annotation metadata terms:

  • annotatedBy (verbatim list of agents)
  • primaryAnnotator (interpreted primary agent from annotatedBy, controlled by authority)
  • annotationRemarks (verbatim text of annotation, particularly when transcribing from label on sheet).
  • verbatimDateAnnotated (verbatim text from label)
  • dateAnnotated (interpreted counterpart)
  • annotationReferences
  • dwc:associatedMedia e.g. link to genbank record, images online e.g. field images
  • dwc:associatedReferences
  • dwc:associatedSequences
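
As a rough illustration of how the terms above might be grouped into a single record, here is a minimal Python sketch. The class name `ApiaryAnnotation`, the snake_case field names, and the sample values are all assumptions made for illustration; they are not taken from Apiary's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ApiaryAnnotation:
    """Hypothetical container for the annotation metadata terms listed above."""
    annotated_by: str                        # annotatedBy: verbatim list of agents
    primary_annotator: Optional[str] = None  # primaryAnnotator: interpreted, authority controlled
    annotation_remarks: str = ""             # annotationRemarks: verbatim text of the annotation
    verbatim_date_annotated: str = ""        # verbatimDateAnnotated: text as written on the label
    date_annotated: Optional[str] = None     # dateAnnotated: interpreted counterpart
    annotation_references: List[str] = field(default_factory=list)   # annotationReferences
    associated_media: List[str] = field(default_factory=list)        # dwc:associatedMedia
    associated_references: List[str] = field(default_factory=list)   # dwc:associatedReferences
    associated_sequences: List[str] = field(default_factory=list)    # dwc:associatedSequences


# Invented sample values, for illustration only.
example = ApiaryAnnotation(
    annotated_by="A. Collector & B. Botanist",
    primary_annotator="A. Collector",
    annotation_remarks="Quercus alba L.!",
    verbatim_date_annotated="3 Oct. 1987",
    date_annotated="1987-10-03",
)
```

Keeping each verbatim field next to its interpreted counterpart mirrors the pairing in the term list (annotatedBy with primaryAnnotator, verbatimDateAnnotated with dateAnnotated), so a transcription can be stored before anyone has interpreted it.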

A domain-specific value for hasPurpose might be newDetermination (perhaps better newIdentification, for consistency with dwc terms). It is not clear from the proposed term list which hasPurpose value covers a new determination. Proposal for Terms for hasPurpose:

   * Interpretation
   * Augmentation
   * Correction
   * Refinement
   * Replacement 

Typing (annotation types on herbarium sheets, from the point of view that there is a primary object with primary data, to which subsequent things then happen):

  • TaxonomicAnnotations
    • Identification
      • Identification
      • Confirmation (e.g. person's name and exclamation point, morphbank's thumbs up).
      • Denial (e.g. morphbank thumbs down, or person's name and "not")
    • Typification
      • Typification
      • Confirmation (of type status)
      • Denial (of type status, Not A type).
  • NonTaxonomicAnnotations
    • Unknown (e.g. "0.2" written next to the flower...).
    • Accessions (apiary:previousOwners)
    • AttachedImage (Map, drawing, etc stuck on sheet).
    • Association of one object to another object, associated media, etc (which itself can be annotated) - surrogates, image of sheet, derivatives.
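
One way to picture the typing above is as a small tree. The sketch below is only an illustration: the variable name `ANNOTATION_TYPES`, the helper `type_paths`, the path notation, and the shortened label `Association` for the last bullet are assumptions, not Apiary identifiers.

```python
# Hypothetical encoding of the annotation typing as a nested mapping.
ANNOTATION_TYPES = {
    "TaxonomicAnnotations": {
        "Identification": ["Identification", "Confirmation", "Denial"],
        "Typification": ["Typification", "Confirmation", "Denial"],
    },
    "NonTaxonomicAnnotations": {
        "Unknown": [],
        "Accessions": [],
        "AttachedImage": [],
        "Association": [],  # shortened label for the last bullet (assumption)
    },
}


def type_paths(tree):
    """Flatten the tree into slash-separated paths, one per most specific type."""
    paths = []
    for top, groups in tree.items():
        for group, leaves in groups.items():
            if leaves:
                paths.extend(f"{top}/{group}/{leaf}" for leaf in leaves)
            else:
                paths.append(f"{top}/{group}")
    return paths


for path in type_paths(ANNOTATION_TYPES):
    print(path)
```

Because Confirmation and Denial occur under both Identification and Typification, a bare leaf name is ambiguous; qualifying each leaf by its parent, as the paths do here, is one possible way to keep them distinct.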

Approach in Apiary: The initial question was whether we are modeling the sheet or the plant. Conclusion: there is a collection object that is the combination of the sheet and the plant. A large proportion of the information exists because of this combination of a plant attached to a sheet. The sheet may move, may be barcoded, may be deaccessioned and accessioned elsewhere. The starting point for Apiary is an image of the sheet, with transient states of history that are reflected on the sheet.

Apiary has a clear workflow need for a primary object (herbarium sheet, with surrogate image), with a set of primary data (collecting event, initial determination) and subsequent data.

ROIs (regions of interest) are interesting, and relate tightly to the cascade of annotations. It is not clear whether, internally to Apiary, a region of interest makes sense as an annotation outside the workflow. There might be a use case of someone later wanting to re-examine a particular label, e.g. to apply handwriting analysis software to that label's text, which suggests a requirement for retaining the ROI and treating it as an annotation. By tracking the coordinates of an ROI and the coordinates of the characters found by OCR, it becomes possible to redact information from the image on the fly, and possible to use human corrections of the OCR as training information for the OCR engine. It is also possible to code the likelihood of a label's type by its location on the sheet if the locations of ROIs are retained.
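
The redaction idea can be sketched as a simple rectangle test. Everything in the block below is hypothetical: the `Box` and `OcrChar` classes, the pixel-coordinate layout, and the assumption that the OCR engine returns characters in reading order with one bounding box each. It illustrates the geometry only and is not a description of how Apiary stores ROIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in image pixel coordinates (assumed layout)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Box") -> bool:
        """True if `other` lies entirely inside this box."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


@dataclass(frozen=True)
class OcrChar:
    """One character reported by OCR, with its bounding box."""
    char: str
    box: Box


def chars_in_roi(roi: Box, chars: List[OcrChar]) -> List[OcrChar]:
    """Keep only the OCR characters whose boxes fall inside the ROI."""
    return [c for c in chars if roi.contains(c.box)]


def redaction_boxes(roi: Box, chars: List[OcrChar], sensitive: str) -> List[Box]:
    """Boxes of the characters spelling the first match of `sensitive` in the ROI text.

    Assumes `chars` is in reading order; returns an empty list if there is no match.
    """
    inside = chars_in_roi(roi, chars)
    text = "".join(c.char for c in inside)
    start = text.find(sensitive)
    if start < 0:
        return []
    return [c.box for c in inside[start:start + len(sensitive)]]


# Invented example: a label ROI containing the text "Loc: X9", plus one stray
# character elsewhere on the sheet.
label_roi = Box(0, 0, 100, 20)
chars = [OcrChar(ch, Box(10 * i, 5, 10 * i + 8, 15)) for i, ch in enumerate("Loc: X9")]
chars.append(OcrChar("Z", Box(200, 5, 208, 15)))

to_mask = redaction_boxes(label_roi, chars, "X9")
```

The returned boxes are the regions an image server could paint over before delivering the sheet image. The same per-character boxes are what would let a human correction of the OCR text be paired with the exact pixels it corrects.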

TO DO

Get example Apiary annotations. Map these onto annotation ontology, find what has logical places and what doesn't.