2013Jul10

From Filtered Push Wiki


Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2013Jul10

Reminder: Change of meeting time effective Sept 4: (12-1 Eastern 9-10 Pacific).

Agenda

  • Annual report: "The Annual Report for award 0960535 has been approved by Peter H. McCartney on 07/09/2013."
  • CNH Workshop plans (July 19th, morning, Burlington, VT)
  • NEVP TCN Support
    • Status update (production deployment target date 2013 Aug 15).
    • Site for AnnotationProcessor/SpecifyDriver testing?
    • Duplicate Finding Find_Duplicates. State of old code.
    • In what ontology should we put the Funding URI Motivation and is its name appropriate as a Motivation?
  • SCAN TCN Support (Production node deployment target date 2013 July 31).
    • Status update on deployments.
  • Kepler
  • MCZbase Driver
  • dwcFP and DarwinCore RDF guide - need to provide immediate feedback.

Non-Tech

  • Third Project Programmer, Burndown.
  • Collaborations
    • Specify/Symbiota
    • Names

For Friday

  • SCAN TCN Support
    • Trees
    • Feedback from Nico

For Future meetings

Reports

Notes

FilteredPush Team Meeting 2013 July 10

Present: Tianhong, James, Jim, David, Paul, Maureen, Bob.

  • Annual report:

Jim: "The Annual Report for award 0960535 has been approved by Peter H. McCartney on 07/09/2013."

  • CNH Workshop plans (July 19th, morning, Burlington, VT)

James: Blurb has gone to Patrick. Patrick asked whether wireless is OK.

James: 4 hours scheduled, with a coffee break. We have wireless in a 110-person classroom with blackboard, whiteboard, data jack(s?), and two AV screens. Expect that many people will have computers. Estimate around 20 attendees.

Room description: There are two blackboards in the back of the room. The entire front wall is a whiteboard from floor to ceiling, and there are two additional whiteboards on the side of the room. Each desk has 12 120V electrical outlets. There are two retractable AV screens in the ceiling at the front of the room. They are controlled by switches on the side of the instructor tech desk. The projector displays on the left screen, while the overhead projector can presumably display on the right screen. Underneath the tech desk, there are data jacks.

Features: A/C; ADA Accessible; AV Remote; Blackboard; Carpet; Computer (PC)*; Data Jack (active); Data/Video Projector*; Doc Camera*; DVD Player*; DVD/VHS Player; Food Not Permitted; Furniture, Soft; Instructor Tech. Desk; Internet Cable*; Overhead Projector*; Podium (Tabletop); Projection Screen; Seating - Chairs; Seating - Moveable; Seating - Tables and Chairs*; Seating - Tiered; Sound System; Space for extra table; Trash Bin Small; VCR Player*; VGA Cable; Whiteboard; Windows - Shades; Wireless Internet

James: Need a plan/schedule. Need to know how we are going to take notes. Need a setup that lets multiple people log in and work at the same time.

Bob: Travel arrangements, etc?

Travel up on Thursday afternoon. Travel back Friday afternoon?

Maureen: Train from Boston to Burlington.

James: Host hotel is expensive.

James: First we give an overview. Second, give a demonstration. Then look at annotation processing before lunch and workflows after lunch. Then work through duplicates on paper.

Schedule:

  1. Overview of FP/ApplePie
  2. Demonstration (annotation processing and workflow)
  3. Breakout groups (4): Login, Register Interest, Annotation Processing.
  4. Break
  5. Breakout groups (4): Run analysis.
  6. Describe find duplicates
  7. Breakout groups (4): Work through duplicate detection/find duplicates use case UI mockups.
  8. Talk with UMass and others about annotation processor deployment.

Maureen: We should instrument the annotation processor to allow feedback via annotation (eat our own dogfood).

Bob: This is close to mandatory for some use cases. In particular, if a consuming app is not going to act according to the Expectation, it should normally launch a response annotation. Whether an ACK is required when it does meet the Expectation is a separate question. Yesterday we discussed a case where the Expectation is INSERT, but the consumer already has a record and should decline by saying "Please send me an UPDATE request".

Maureen: In general, we have no idea how records are stored in a local database. We should just give the local database the dwc record and let it decide whether importing that dwc record results in an update or an insert.

Bob: But if it's capable of refusing, is that the end of the story?

Maureen: What purpose does the refusal serve? Why not just process it, if it knows that what it wants is an UPDATE?

Bob: (I believe that) there are race conditions that are detectable by the receiver and maybe not by the producer. E.g. P: here is a new determination D; C: D is an obsolete determination, I have since accepted it and then updated it to D2. I am not going back to D.
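The INSERT/UPDATE exchange above can be sketched in code. This is only an illustrative sketch, not the annotation processor's actual behavior: the names `Annotation`, `Response`, `consume`, and the in-memory `store` and `superseded` dictionaries are all hypothetical. It shows a consumer that declines with a response when the Expectation does not match its local state, including Bob's case of a determination that has already been superseded locally.

```python
from dataclasses import dataclass
from enum import Enum


class Expectation(Enum):
    INSERT = "INSERT"
    UPDATE = "UPDATE"


@dataclass
class Annotation:
    record_id: str
    expectation: Expectation
    determination: str


@dataclass
class Response:
    record_id: str
    accepted: bool
    reason: str


def consume(store: dict, superseded: dict, ann: Annotation) -> Response:
    """Apply an annotation to a local store, or decline with a reason.

    store:      record_id -> current determination (hypothetical local database)
    superseded: record_id -> set of determinations already replaced locally
    """
    exists = ann.record_id in store
    if ann.expectation is Expectation.INSERT and exists:
        return Response(ann.record_id, False, "record exists; please send an UPDATE request")
    if ann.expectation is Expectation.UPDATE and not exists:
        return Response(ann.record_id, False, "no such record; please send an INSERT request")
    if exists and store[ann.record_id] == ann.determination:
        return Response(ann.record_id, True, "already current")
    # Race condition visible only to the consumer: D was accepted and later replaced.
    if ann.determination in superseded.get(ann.record_id, set()):
        return Response(ann.record_id, False, "determination is obsolete")
    if exists:
        superseded.setdefault(ann.record_id, set()).add(store[ann.record_id])
    store[ann.record_id] = ann.determination
    return Response(ann.record_id, True, "applied")
```

Maureen's alternative (hand the record to the local database and let it choose insert or update) would amount to dropping the first two checks; the obsolete-determination check is the part that only the receiver can make.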

Paul: Could have scenarios set up on wiki and capture feedback on wiki. Also capture on paper.

Maureen: What test installation do we want to use?

Paul: Firuta (needs updating). FP1, FP2, FP3 (SPNHC demo).

Discussion: VMs:

symbiota1          -> SCAN symbiota production
symbiota2          -> development
fp1                -> development
fp2                -> (receives clone of fp3) -> SCAN FP node
fp3                -> clone to fp2 -> use at CNH -> NEVP FP Node
nevp symbiota      -> NEVP symbiota production
nevp symbiota test -> use at CNH?

TODO:

  1. Ask Alex to clone FP3 onto FP2
  2. Ask Alex/Patrick for access to nevp symbiota test machine
  3. Load NEVP data into Specify instance on FP3 for CNH testing (annotation processing).
  4. Load NEVP data into MongoDB on FP3 for CNH testing (workflows).


  • NEVP TCN Support
    • Status update (production deployment target date 2013 Aug 15).

For next Friday (July 17): node infrastructure ready for testing, connected to a test Symbiota instance (like the SPNHC demo, but with NEVP data).

For Aug 15: add in the production annotation processor and Specify at 2 NEVP sites (Harvard and one other, possibly UMass). Add in ingest of new occurrence annotations from the primary digitization apparatus to Symbiota and Specify.

    • Site for AnnotationProcessor/SpecifyDriver testing?

Paul: Talk with people next week?

James: Can also ask who else in CNH would like to participate, and when.

    • Duplicate Finding Find_Duplicates. State of old code.

Bob: Not ready to report on this yet. The dependency injection from UI to index searches seems like a good bit of overkill. Probably can simplify to library calls.

Bob: Java code implements a number of standard deduplication algorithms. Would be better if we can replace this with calls to standard libraries.

James: Key element is speed.

Bob: The two architectures probably do not have implications for this.

Discussion: General opinion is that using a standard library, or refactoring Zhimin's code into a library, seems better for development than maintaining the current dependency injection.
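As a rough illustration of the "library call" option discussed above, duplicate finding could be exposed as a plain function rather than wired through dependency injection. The sketch below is in Python and uses only the standard library's `difflib.SequenceMatcher`; it is not the existing Java code, and the field names, the `find_duplicates` signature, and the 0.9 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical fields used to build a comparison key for each record.
KEY_FIELDS = ("collector", "number", "date", "locality")


def normalize(record: dict) -> str:
    """Join the key fields into one lowercase string for comparison."""
    return " ".join(str(record.get(f, "")).strip().lower() for f in KEY_FIELDS)


def find_duplicates(records: list, threshold: float = 0.9) -> list:
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    keyed = [(r["id"], normalize(r)) for r in records]
    pairs = []
    for (id_a, key_a), (id_b, key_b) in combinations(keyed, 2):
        score = SequenceMatcher(None, key_a, key_b).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 3)))
    return pairs
```

Comparing every pair is quadratic in the number of records, so James's point about speed would still need addressing, for example by comparing only records that share a blocking key such as collector name.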

    • In what ontology should we put the Funding URI Motivation and is its name appropriate as a Motivation?

Paul: Stake in the ground: dwcFP.

Bob: Motivation in OA can provide some weight to the action that consumers may take. Motivations that extend OA need to go somewhere. Which place makes the most sense for this Funding motivation?

Bob: Two candidates: OAD and dwcFP. Funding seems an equally unlikely fit for either.

Paul: Other alternative would be an additional ontology containing just this term.

Bob: Library community ontologies for e-publication etc. may also have a term. Need to be very careful about reusing other vocabularies (can bring in undesirable imports).

Paul: Still needs knowledge representation examination.

  • SCAN TCN Support (Production node deployment target date 2013 July 31).
    • Status update on deployments.

David: Symbiota working on Symbiota 2 with annotation processor on FP3. Current deployment uses an older configuration.

Paul: Cloning FP3 onto FP2 then gives a good path to using Symbiota1 + FP2 for SCAN.

David: Need to determine which VM to repurpose for testing analysis. Could ask for more RAM on FP1, or could use Firuta.

Tianhong: Question is how to deal with homonyms.

James: Haven't heard a response from David Remsen. Do have a taxonomist with some data that needs higher taxon alignment.

Paul: Can we get some test data to provide to Tianhong?

James: Should be able to.

Paul: We need more (and better) examples as test cases (perhaps ping Nico and Patrick for a few examples).

Paul: Wiki pages with relevant information: Embedding_Kepler#Steps, Use_Case_Scenarios#Scientific_Name_Validation

Todo: Tianhong to take a shot at flowchart for scientific name validation. Paul and James to provide more examples of homonyms as test cases.
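One possible first step for the homonym question is simply detecting which name strings in a test data set are used in more than one context. The sketch below is an assumption-laden illustration, not part of the Kepler workflow: the function name `find_homonyms` and the record keys `name`, `authorship`, and `family` are hypothetical, and real validation would need a names authority rather than local grouping.

```python
from collections import defaultdict


def find_homonyms(names: list) -> dict:
    """Map each name string to its distinct (authorship, family) contexts
    when it appears with more than one such context."""
    contexts = defaultdict(set)
    for n in names:
        # Collapse whitespace and case so trivially different spellings group together.
        key = " ".join(n["name"].split()).lower()
        contexts[key].add((n.get("authorship", ""), n.get("family", "")))
    return {k: sorted(v) for k, v in contexts.items() if len(v) > 1}
```

Names flagged this way are candidates for the homonym test cases; deciding which usage a given specimen record intends still requires the higher taxon or authorship from the record itself.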

  • MCZbase Driver

Maureen: Brendan has made a dump of the database; need to log in and check that all is ready to start work.

  • dwcFP and DarwinCore RDF guide - need to provide immediate feedback.

Paul: Have let relevant people know that feedback is coming. James/Joel/Paul/Bob had call to discuss some issues.

  • Third Project Programmer, Burndown.
  • Collaborations
    • Specify/Symbiota
    • Names