From Filtered Push Wiki

Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2013Jul31

Reminder: Change of meeting time effective Sept 4: (12-1 Eastern 9-10 Pacific).


  • Third project programmer
  • CNH Workshop
    • Issues memorialized at:
  • Kepler
    • Taxon name cleaning - Homonyms
  • NEVP TCN Support
    • UVM site for AnnotationProcessor deployment
      • Visit Date?
      • Information/coordination needed before visit?
    • Status update (production deployment target date 2013 Aug 15).
    • Funding URI - needs resolution
  • SCAN TCN Support (Production node deployment target date 2013 July 31).
    • Status update on deployments.
  • MCZbase Driver


  • Kurator
  • Collaborations
    • Specify/Symbiota
  • Burndown.

For Future meetings


  • Paul
  • Bob
    • Drafted "Evaluation" section for manuscript


FilteredPush Team Meeting 2013 July 31

Present: Maureen, Jim, David, Tianhong, Paul, Bertram, Bob, James

  • Third project programmer

Starting August 12.

Jim: Make sure all the developers are on the same page before his first day.

Paul: For Maureen, put this discussion on the agenda for (after) the tech call on Aug 9.

Work in progress to refine this page and develop requirements from there.

  • Kepler
    • Taxon name cleaning - Homonyms

Paul: Added examples to [Scientific Name Validator]

James: Answered some of Tianhong's questions, but Paul needs to answer more.

Bertram: Progress on Implementation?

Paul: TODO, this afternoon, before 4:30, answer Tianhong's questions.

Tianhong: Wrote code to invoke the global names resolver, able to obtain fuzzy matches with confidence score, have questions about how to use these results.

Bertram: How about scaling issues? Scheduling web service calls, related issue with resumption of aborted workflow and reuse of partial results?

Tianhong: Aware of these issues, doing some testing of invocation of the remote services, asking questions about the scheduling of service calls.

Bertram: Important technical challenge that we need to solve.
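Illustration of the resumption/partial-results issue raised above: a minimal sketch, not the Kepler workflow code. The function name `resolve_with_checkpoint`, the JSON-lines checkpoint file, and the `resolve` callable (a stand-in for whatever wraps the remote name-resolution service) are all assumptions for illustration. Each result is appended to disk as soon as it arrives, so a rerun after an abort only calls the service for names not yet resolved; `min_interval` is a crude pause between remote calls.

```python
import json
import os
import time
from typing import Callable, Dict, Iterable


def resolve_with_checkpoint(
    names: Iterable[str],
    resolve: Callable[[str], dict],  # hypothetical wrapper around the remote service
    checkpoint_path: str,            # hypothetical JSON-lines file of finished lookups
    min_interval: float = 0.0,       # seconds to wait between remote calls
) -> Dict[str, dict]:
    """Resolve each name at most once, reusing results saved by earlier runs."""
    results: Dict[str, dict] = {}
    resuming = os.path.exists(checkpoint_path)

    # Reload partial results left behind by an earlier (possibly aborted) run.
    if resuming:
        with open(checkpoint_path, encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # a line cut off by an abort; that name is resolved again
                results[record["name"]] = record["result"]

    with open(checkpoint_path, "a", encoding="utf-8") as fh:
        if resuming:
            fh.write("\n")  # terminate any half-written last line before appending
        for name in names:
            if name in results:
                continue  # already resolved; reuse the stored result
            result = resolve(name)
            results[name] = result
            fh.write(json.dumps({"name": name, "result": result}) + "\n")
            fh.flush()  # hand the record to the OS before the next remote call
            time.sleep(min_interval)
    return results
```

This only addresses reuse of partial results within a single sequential run; it does not schedule calls across parallel jobs, which is the harder part of the scaling question.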

Maureen: There are people who are integrating Kepler and Hadoop somehow; does Hadoop help with process-resumption kinds of problems? Maybe?

Bertram: Interesting thought, but I don't think so. We should find out. The Hadoop extension w/ UCSD extension, Sven has experimented with it. Put on the table for Tianhong, for him to explore setting up a Hadoop environment for testing. Expect that our scalability issues won't be immediately solved by this.

Tianhong: How scaling?

Bertram: Parallel jobs. Won't help for invocation of remote services, like driving with the handbrake on.

Tianhong: Pointer?

Bertram: See link on Kepler site ? [1]

Maureen: Perhaps the wrapping of jobs in jars for Hadoop is the only part we would find useful. I agree that Hadoop is mostly for parallelizing, which is not what we need.

  • NEVP TCN Support
    • UVM site for AnnotationProcessor deployment

Paul: UVM has agreed to be site for testing annotation processor deployment, second case along with HUH.

      • Visit Date?

Paul: Week of August 19 seems to fit best here.

      • Information/coordination needed before visit?

Maureen: Would be very nice if they can give us a snapshot of their database for testing before we go there.

  1. How many users of Specify do they have?
  2. How many workstations?
  3. How many Specify databases?
  4. What are the technical specifications of the workstation?
  5. Is Security enabled in Specify? (Can tell from a snapshot.)
  6. Can you provide us with a snapshot (backup) of your Specify database?

Maureen: We need to establish a mechanism to get them upgrades of the annotation processor.

James: New Specify version was released yesterday, makes it hard to avoid updates.

James: Would be happy to also be a test node. Good alternative case of multiple users.

Maureen: Do you use the Specify "Security" feature (limits which users can see which tables)?

James: Not sure on current installation.

James: Heather very up to date on how to configure Specify for Botany.

Paul: TODO: Will email this list of questions to UVM and set up Maureen as point of contact.

Paul: TODO: Create a page of questions for a potential annotation processor installation site: what we need to know to install the FP-AnnotationProcessor.

    • Status update (production deployment target date 2013 Aug 15).

David: Everything in place on FP3, except for Fedora (on FP1) and the harvest.

David: Timeline for ingest of new occurrence annotations into annotation processor?

Paul: Aiming for testing sometime in August, not necessarily by the 15th.

    • Funding URI - needs resolution

Paul: We had a resolution?

Bob: Need to look back at this, suspect there are still knowledge representation issues to be resolved. Funding object as motivation becomes a grant as an instance of a skos:Concept, seems odd.

Paul: Alternative proposed in another vocabulary to avoid motivation?

Bob: Yes, on table, vocabulary seemed to have some uptake.

Paul: Proposed a simplification of your proposal, is this OK?

Bob: Will look at this week for a thumbs up/thumbs down answer.

  • SCAN TCN Support (Production node deployment target date 2013 July 31).
    • Status update on deployments.

Paul: Contacted Alex about access to Symbiota1 for installation of harvester, client helper, keypair, etc., and harvest. Haven't heard back yet.

David: FP2 in same state as FP3. Fedora still on FP1.

Maureen: Still need to commit latest changes to Harvester code.

Maureen: Mechanism for updating annotation processor?

Paul: Only SCAN deployment at this point is on FP2, under our control.

Paul: Reasonable statement of status: We are about 1 week behind in this deployment.

Agenda for Friday: Take a look at the bug about hardening the user admin piece of the Annotation Processor; prioritize related tasks.

Maureen: Need backup plan.

Paul: Specifically need backup plan for Fedora (annotation store).

David: Also annotation processor database? And Interests from Mongo?

TODO: Maureen and David to describe backups for Fedora, annotation processor DB, and MongoDB, send to Alex and Paul to develop a backup and restore plan.

  • MCZbase Driver

Maureen still needs to contact Brendan about an application user access issue, but she has outside-the-app access to the database.

Paul: Firewall issues?

Maureen: Yes


  • Kurator
  • Collaborations
    • Specify/Symbiota

Paul: Specify has done their new (?final Specify Thick) release.

  • Burndown.

Bob: Request to ?? about managing configurations. Have description of issues from David and Maureen. Any issues with these being distributed in this form, or should this be distributed? Do we have any funding for this?

Maureen, David: OK to circulate email as is.

Jim: Need to revisit the list of burndown priorities, can

Paul: TODO: Ask Kristin to redo burndown projection now that we have harder numbers.