2013May22

From Filtered Push Wiki


Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2013May22

Agenda

  • Room Conflict: Need to change meeting time for the Fall Semester (1-2:30 Eastern?)
  • Project/Package Refactoring
  • Progress on SPNHC demonstration
    • Driver
  • Annotations
    • Progress on rewriting dwcFP, OAD, and example annotations.
  • MCZbase Driver
  • Kepler
    • Canadensis library
    • Taxon name cleaning
    • Provenance and rendering
    • Duplicate Finding

Non-Tech

  • Annotations
    • Annotation MS
  • Collaborations
    • Specify/Symbiota
    • SCAN TCN
      • Niko is looking for a FP/SCAN update this week.
    • NEVP TCN

For Future meetings

Reports

  • Paul
    • With Bob worked on revisions on OAD paper.
    • Two SPNHC abstracts in: one for DemoCamp (already accepted), one for a presentation.
    • NEVP TCN interaction with iPlant is settling on providing a GUID with images to iPlant in upload, then retrieving iPlant's assigned GUID for the image to disseminate with the metadata documents.
      • Path looks like it will be: "iPlant constructs a GUID for the uploaded image, NEVP Symbiota portal looks up this GUID by querying iPlant for metadata that also uniquely identifies the image (and persists and uses this iPlant minted GUID to construct URIs for the images). The digitization apparatus can mint a UUID, provide it in the metadata for the images to iPlant, then Symbiota can query iPlant using this UUID and retrieve the iPlant minted GUID by which the image can be retrieved from all iPlant services."
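A minimal sketch of the lookup path quoted above, assuming an in-memory stand-in for iPlant. The class FakeImageStore, its methods, the metadata key source_uuid, and the "iplant-" GUID prefix are all hypothetical names chosen for illustration; none of them is iPlant's or Symbiota's actual API.

```python
import uuid


class FakeImageStore:
    """Hypothetical in-memory stand-in for the image service (not a real API)."""

    def __init__(self):
        self._records = {}

    def upload(self, image_bytes, metadata):
        # The store mints its own GUID; the uploader is not told what it is.
        guid = "iplant-" + uuid.uuid4().hex
        self._records[guid] = {"image": image_bytes, "metadata": dict(metadata)}

    def find_guid(self, key, value):
        # Query by a metadata field that uniquely identifies the image.
        matches = [g for g, rec in self._records.items()
                   if rec["metadata"].get(key) == value]
        if len(matches) != 1:
            raise LookupError(f"expected one match for {key}={value!r}, "
                              f"found {len(matches)}")
        return matches[0]

    def fetch(self, guid):
        return self._records[guid]["image"]


def register_image(store, image_bytes, extra_metadata=None):
    """Mint a local UUID, upload with it, then look up the store-minted GUID."""
    local_uuid = str(uuid.uuid4())  # minted by the digitization apparatus
    metadata = dict(extra_metadata or {}, source_uuid=local_uuid)
    store.upload(image_bytes, metadata)
    store_guid = store.find_guid("source_uuid", local_uuid)
    # The portal would persist store_guid and build image URIs from it.
    return local_uuid, store_guid
```

In this sketch the locally minted UUID serves only as a lookup key; the GUID minted by the store is the one that would be persisted and disseminated with the metadata documents.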

Notes

FilteredPush Team Meeting 2013 May 22

Present: Bob, David, Paul, Maureen, James, Tianhong, Bertram

  • Room Conflict: Need to change meeting time for the Fall Semester (1-2:30 Eastern?)

Probably starting at the end of August.

Bertram's Fall Schedule: (Wednesday is good day, propose 1-2:30 Eastern OK)

Hard conflict: Tuesdays and Thursdays 12-2pm Pacific Time

Soft conflict (Disc Sec) Monday 3-4pm, Tuesday 2-3pm

James: That time does conflict with a bi-weekly, optional meeting.

Bob: How about 9 AM Pacific, noon Eastern?

OK with James and Bertram.

  • Project/Package Refactoring

Maureen: Removing business logic from the GlassFish-deployed projects and putting it into a single non-GlassFish project.

Goal is to be able to develop without having a deployment running in GlassFish (to simplify development and debugging). Commits done for all the projects except FP-Node. Documentation changes pending.

  • Progress on SPNHC demonstration

Plan A: Run stack on VM in Florida (David: use fp3.acis), with annotation processor and Specify running on demonstration laptop.

Plan B: Entire stack running on demonstration laptop.

    • Driver

Maureen: Pending refactoring of projects.

  • Annotations
    • Progress on rewriting dwcFP, OAD, and example annotations.

Bob: Making sure that all of the examples are consistent in their construction. Have changed the OAD and dwcFP namespaces to be consistent with W3C practices.

Goal of the namespace change is to deliver both OWL and HTML documentation from the namespace address on the webserver using content negotiation.
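A minimal sketch of how content negotiation at a namespace address could pick between the two documents, assuming the OWL file is served as application/rdf+xml. The file names oad.owl and oad.html and the function names are hypothetical, and the parsing is simplified (it handles q-values and */* but not partial wildcards such as text/*).

```python
# Hypothetical mapping from media type to the file served for the namespace.
REPRESENTATIONS = {
    "application/rdf+xml": "oad.owl",
    "text/html": "oad.html",
}
DEFAULT_TYPE = "text/html"


def parse_accept(header):
    """Return acceptable media types from an Accept header, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            entries.append((-q, position, media_type))
    # Sort by descending q, then by order of appearance.
    return [media_type for _, _, media_type in sorted(entries)]


def choose_representation(accept_header):
    """Return (media_type, file_name), or None if nothing acceptable exists."""
    for media_type in parse_accept(accept_header or "*/*"):
        if media_type in REPRESENTATIONS:
            return media_type, REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return DEFAULT_TYPE, REPRESENTATIONS[DEFAULT_TYPE]
    return None
```

Under these assumptions, a browser sending "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8" would get the HTML documentation, while a semantic web client sending "application/rdf+xml" would get the OWL file from the same address. A None result would correspond to a 406 Not Acceptable response.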

Paul: We need to pass on the namespace change in the NEVP new occurrence example to Patrick.

  • MCZbase Driver

Maureen: Now have access to port 80 on the VM; still need the Oracle database.

  • Kepler
    • Canadensis library

Tianhong: More discussions with David, writing an actor to use the cleaning library.

    • Taxon name cleaning

http://wiki.filteredpush.org/wiki_Case_Scenarios#Scenarios_for_Quality_Control

Maureen: Got a response from Jim; James has added to this and put it into the wiki (link above).

Tianhong to take the following and produce a draft workflow/flowchart document. Use Maureen as conduit to collect more information. http://wiki.filteredpush.org/wiki/Use_Case_Scenarios#Scientific_Name_Validation

James: There is a logical order to this.

    • Provenance and rendering

David: View of first level spreadsheet, styling issues resolved, working on second level. Possible, with a custom data export component, to produce an Excel spreadsheet with crosslinked pages; depends on availability of libraries.

James: Build a multipage spreadsheet for DNA/Specimen workflow management, includes a Java bit for validation. ftp://ftp.agr.gc.ca/pub/outgoing/bio-grdiqis/

David: Feeding back to Tianhong about desired changes to JSON output.

Tianhong: Modifying structure as appropriate. Style separated out from text. Are we still using one spreadsheet per record on the second level?

David: In the JSF visualization, using expandable rows, with the second sheet as the expansion of a row.

    • Duplicate Finding

James: Very excited.

James to review what's on the wiki.

Paul: One (simple) starting point: http://wiki.filteredpush.org/wiki/Concepts#For_Finding_Duplicates

Paul: Well fleshed out use case at: http://wiki.filteredpush.org/wiki/Find_Duplicates

Paul: Suggest two areas for engineering: (1) What are duplicates, and how do we find them? (2) How do we engineer a system to do what we want with duplicates?

For (1) James to hammer on the page: http://wiki.filteredpush.org/wiki/Concepts#For_Finding_Duplicates

For (2) Bob to hammer on http://wiki.filteredpush.org/wiki/Find_Duplicates

Tianhong: Have an alternative implementation of the workflow that runs much faster than the Kepler workflow.

Bertram: Implemented by Sven in Java. Put on agenda for Friday.

Non-Tech

  • Annotations
    • Annotation MS

Bob: On schedule for submitting on Tuesday. Please return any comments before Sunday.

  • Collaborations
    • Specify/Symbiota
    • SCAN TCN
      • Niko is looking for a FP/SCAN update this week.
    • NEVP TCN

Discussions with iPlant are settling on a procedure for determining the GUIDs of images stored in iPlant.

BL: Are we considering alternative protocols? (e.g. this one mentioned by Paul: https://dukgo.com/blog/xmpp-services-at-duckduckgo )