2014May07



Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2014May07

Agenda

Non-Tech

  • SPNHC (April 25/May 5), Both abstracts in, Acceptance heard on DemoCamp.
  • James: TDWG Symposium
    • Whom to invite
  • James: FaunaEuropaea
  • InvertEBase
  • Request for second NCE
  • iDigBio, next actions?

Tech

  • Report from Thursday call
  • SemanticMediaWiki as FP Client
  • Driver
    • Report Maureen: Status of driver - current annotation processor integration.
    • Chuck: State of getting set up to work on annotation processor.
  • SCAN
    • Report David/Tianhong: Akka integration in FP2
    • Test of Akka workflow with SCAN data.
    • Query for harvested data and analysis results.
  • Nodes
  • FP-DataEntry
    • Report Chuck: Duplicate detection integration into Yale data entry application
  • NEVP
    • Report David: Progress on updating deployment.
    • Akka Workflow for NEVP
  • Analysis
    • Tianhong: Progress on cleaning data with data.
    • Report Bob: Progress on Duplicate Finding data mining.
  • For Thursday:

Reports

  • Paul
    • SPNHC abstract on data flow in NEBC submitted.

Notes

FilteredPush Team Meeting 2014 May 07


Present: Bertram, Tianhong, James, Jim, Chuck, Paul, Maureen, Bob

Non-Tech

  • SPNHC (April 25/May 5), Both abstracts in, Acceptance heard on DemoCamp.

Paul: Abstracts in.

  • James: TDWG Symposium

James: Symposium was approved. Talked with Anton in Berlin, have started a discussion on whom to invite and how to structure time.

    • Whom to invite

James: For discussion here: who from this group should talk, one talk or two, and what scope?

Bertram: Could talk about current work with Akka and plans for next round.

James: Would be good to do two talks, but that would take up slots. Could push for a second 90 min block.

  • James: FaunaEuropaea

James: EU project (fauna and flora). Talked with one of the fauna folks in Berlin, who is interested in annotation and interaction with curators. They have funding and programmers; it would be nice promotion in Europe if we can help them do something like what we are doing for SCAN. Good to have a call with them?

Paul: Can any of them make it to SPNHC?

James: Can ask if they can.

  • James: Berlin Botanical Garden (German) Collaborations
    • Interoperability of Annosys and FP

James: Questions about the different uses of OA between Annosys and FP.

Bob: Need to define what the expectations for interoperability are.

David: Talked with Lutz about this a couple of months ago: Discussion was about a web service that would provide a transformation. In essence, providing a web service to client helper that understands their annotations.

    • Curation workflows

James: Biovel is ending in September, have continuing proposals in, but not heard yet. Came to same conclusion - Taverna isn't particularly useful for curation workflows in our community. Interested in what we are doing with Akka. Good opportunity to work together.

    • Promote Biodiversity Catalogue: www.biodiversitycatalogue.org

James: Useful place to find web services, good place for us to list services.

  • InvertEBase

Jim: Have budget together at new revised target. Waiting for word to upload to fastlane.

  • Request for second NCE

Jim: Program officer is OK with approving it, but formal approval was held up earlier because of an overdue final report, which has since been filed. Two additional reports--one final, one annual--are due as of May 1st for grants with Jim as PI. The final report has already been submitted. The annual report is for SCAN, which Neil Cobb is planning to submit in June. It's not clear if formal approval of our request for a second NCE requires that the SCAN report be submitted first.

  • iDigBio, next actions?

David: We need the information on how to do the checkin, need the place to deploy to go into production for Morphbank.

Bob: Specs for a new VM for an FP node on his infrastructure, supporting his crowdsourcing project, that keeps out of the way of things we are currently running.

Paul: Two tracks: one is getting into production with Morphbank; the other is getting a VM environment for next steps with Greg, and working out how we are going to gather requirements for supporting his crowdsourcing workflow.

Tech

  • Report from Thursday call

Maureen: Discussed a similar problem: how to test Akka. Test success/failure means eyeballing (Maureen, Paul, James) the results of the workflow on SCAN data. Use FP1 for now; Paul to get emails going to set up a test environment for us to use, mirroring SCAN/FP1 and NEVP/FP2.

  • SemanticMediaWiki as FP Client

Bob: Working with Joel on a lighter-weight abstraction of the client helper to generate annotations, and on integrating this into the JavaScript support for MediaWiki.

  • Driver
    • Report Maureen: Status of driver - current annotation processor integration.

Maureen: Continuing

    • Chuck: State of getting set up to work on annotation processor.

Chuck: Not ready yet.

  • SCAN
    • Report David/Tianhong: Akka integration in FP2

Tianhong: Using the dataset on FP2. Akka is running on a local workstation. Actors: MongoDB reader (FP2), MongoDB writer (FP2), Annotation Generator (FP1).
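
For readers unfamiliar with the actor layout, here is a rough sketch of how a reader, annotation generator, and writer could be chained by message passing. It is not the project's Akka code: it uses Python threads and queues as stand-ins for actors, an in-memory list in place of the MongoDB collections, and the record fields ("collector", "status") and the function names are hypothetical, chosen only for illustration.

```python
import queue
import threading

STOP = object()  # sentinel message telling an actor to shut down


def actor(handler, inbox, outbox=None):
    """Start a thread that applies handler to each inbox message.

    Non-None results are forwarded to outbox, if one is given.
    """
    def loop():
        while True:
            msg = inbox.get()
            if msg is STOP:
                if outbox is not None:
                    outbox.put(STOP)  # propagate shutdown downstream
                return
            result = handler(msg)
            if outbox is not None and result is not None:
                outbox.put(result)

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(records, annotate):
    """Reader -> annotation generator -> writer over in-memory stand-ins."""
    to_annotator, to_writer = queue.Queue(), queue.Queue()
    written = []  # stand-in for the output collection (assumption)

    workers = [
        actor(annotate, to_annotator, to_writer),  # annotation generator
        actor(written.append, to_writer),          # writer
    ]
    # Reader role: feed each source record into the pipeline, then stop.
    for record in records:
        to_annotator.put(record)
    to_annotator.put(STOP)
    for worker in workers:
        worker.join()
    return written


def flag_missing_collector(record):
    """Hypothetical check: set a status flag from a made-up field."""
    out = dict(record)
    out["status"] = "OK" if record.get("collector") else "MISSING_COLLECTOR"
    return out
```

Calling `run_pipeline(records, flag_missing_collector)` on a small list of dicts returns the annotated copies in input order, which makes it easy to eyeball whether each status flag matches what was expected for that record.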

    • Test of Akka workflow with SCAN data.

Three collections in FP2 as a result: ASU1905, ASU1950, ASU1978, for the corresponding queries.

Paul: Good, we'll look at those tomorrow.

Tianhong: Suspect there may be small issues in Akka and the result representation that need to be fixed; the status flag may not correspond to the comment.

    • Query for harvested data and analysis results.
  • Nodes
    • Report Maureen: Status of ingests (taxon/occurrence) on FP2 and FP3

Maureen: Process is there, working on monitoring. Need access to Symbiota4 to install the OAI provider. Data for SCAN loaded. Data for NEVP ready to load. Got a configuration for Icinga working to monitor the production ingest process.

Chuck: NEVP Configured for solr indexing.

Maureen: Need to run the scripts that ran on SCAN data on the NEVP data, and install the index on FP3. Currently only have solr running by manual startup. Need to get the firewalls in place.

    • Report David: Morphbank integration status.
    • Developer Documentation cleanup Category:Prototype Category:Obsolete Template:FP-Obsolete

Spent some time looking at the pages on the wiki, don't have enough context to clearly see how to categorize pages.

  • FP-DataEntry
    • Report Chuck: Duplicate detection integration into Yale data entry application

Chuck: Having the duplicate index for NEVP is blocking here. Service configured on Firuta right now.

Maureen: Should monitor solr as well.

  • NEVP
    • Report David: Progress on updating deployment.
    • Akka Workflow for NEVP
  • Analysis
    • Tianhong: Progress on cleaning data with data.
    • Report Bob: Progress on Duplicate Finding data mining.
  • For Thursday:
    • look at the Mongo results from Tianhong's latest run of analysis on data in SCAN prod; extract some to email to James
    • output of workflows does not match comments anymore?
    • list of everything we think should be running on fp2/3, accessibility, proxies, firewalls, icinga