2015Apr21

From FilteredPush


Etherpad for meeting notes: http://free.primarypad.com/p/BqyI5n6LjM

Agenda

Non-Tech

  • Progress from Meeting With AnnoSys
  • Publications
    • Paul/James: Collection Objects
    • Paul/Bob/David: QC Reports
      • Bob: Progress on draft.
    • Bob: Refactoring Dup finding cluster analysis
      • Bob: Access to larger scale infrastructure
    • Bob: List of additional topics
  • SCAN
    • Jim will submit annual report, using version provided yesterday by Neil Cobb.
    • Jim will submit request for no-cost extension. Nearly all other partners are doing the same.
  • Schedule: Next call in 2 weeks.

Tech

  • Maven repository
  • Annotation Processor
  • State of Deployments
    • FP2.acis
      • Status of InvertEBase setup
    • FP3.acis
      • State for harvest for NEVP
  • Morphbank integration
  • Habitat, Phenology Ontology work.

Reports

  • Paul
    • Patrick is updating the NEVP/CNH Symbiota instance. He had a problem applying the latest schema patches; Ed resolved that for him. He got some SVN conflicts on update of the PHP code; I resolved the blocking one and left some other differences, probably unimportant, for Patrick to review.
    • Created an example agent (Asa Gray) in the NEVP instance for Patrick to see. http://portal.neherbaria.org/portal/agents/agent.php?agentid=1
    • Cleaned up and loaded the list of Entomologists that Chuck had compiled into the SCAN Symbiota instance as agents, along with some zoologists that are likely to be relevant to InvertEBase. http://symbiota4.acis.ufl.edu/scan/portal/agents/ Also augmented and cross-linked the two Agassiz agent records.
    • Provided a mechanism to correct the Chamberlin (1902) issue seen in COL data in GBIF in FP-KurationTools.
    • Added partial support for ZooBank's webservice in FP-KurationTools.

Notes

Present: David, Bob, Paul, Tianhong, Bertram, Jim.

Non-Tech

David: Have modified the annotation digest to satisfy AnnoSys's needs (added annotation type, made consistent with OA). Can use it to exchange annotations with them. Haven't gotten to the "since" query parameter; need to write the SPARQL and test. Need to add the state to the annotations in production (need to add to the PHP client helper for Symbiota).

  • Publications
    • Paul/James: Collection Objects

Paul: Haven't had a chance to work on James's latest draft yet.

    • Paul/Bob/David: QC Reports

Bob: Working on this MS. Focusing on understanding the date validator (easiest for a general audience to understand). Bumping into things that need discussion:

(1) Date validator is too aggressive: it treats a date as invalid if the day is absent.

(2) Comparisons between event data and collector lifespan (strong data set for botanists, limited data set for zoologists) - may need to distinguish.
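
To make the two discussion points concrete, here is a minimal sketch in Python. It is not the FP-KurationTools date validator; the function names and the choice of ISO-style YYYY, YYYY-MM and YYYY-MM-DD strings are assumptions for illustration. It shows one way to accept a date with no day while still reporting its precision, and one way to report "unknown" rather than "invalid" when collector lifespan data is missing.

```python
import re
from datetime import date

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD (assumed input format, not from the notes).
_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def validate_partial_date(text):
    """Return (is_valid, precision); precision is 'year', 'month', 'day', or None.

    Hypothetical helper: a missing day or month lowers the precision
    instead of making the date invalid.
    """
    match = _PARTIAL_DATE.match(text.strip())
    if not match:
        return (False, None)
    year, month, day = match.groups()
    try:
        # Fill absent parts with 1 so the calendar check still runs on what is present.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return (False, None)
    precision = "day" if day else "month" if month else "year"
    return (True, precision)


def year_within_lifespan(event_year, birth_year, death_year):
    """Return True or False, or None when lifespan data is incomplete.

    Hypothetical helper: returning None keeps "no lifespan data" (more
    likely for zoologists, per the notes) separate from "outside lifespan".
    """
    if birth_year is None or death_year is None:
        return None
    return birth_year <= event_year <= death_year
```

With this shape, a report could count year-only and year/month dates separately from truly malformed ones, and could skip the lifespan comparison where no lifespan is on record.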

    • Bob: Refactoring Dup finding cluster analysis
      • Bob: Access to larger scale infrastructure

Bob: Haven't heard back from either of them.

Bertram: Talked with two people at NCSA recently in experimental supercomputer lab. Could set up a meeting.

Bob: Let's do that. Have shared overview with Bertram. Keyword: Mahout.

    • Bob: List of additional topics

Nothing further to touch on now.

  • SCAN

Jim: will submit annual report, using version provided yesterday by Neil Cobb.

Jim: will submit request for no-cost extension. Nearly all other partners are doing the same.

  • Schedule: Next call in 2 weeks.

Bertram put this interval into calendar, should be automatic.

Tech

  • Maven repository

Bertram: no update AFAIK (Rob Koop from NCSA

David: Believe Tim is still waiting on permissions. Key jars are available on the web mounted filesystem that Tim put them on.

  • Annotation Processor

David: Haven't been working on this, have been working on Kurator for the Webinar.

  • State of Deployments
    • FP2.acis

David: Haven't seen java heap space issues in the last two weeks. Has been stable.

Getting a full set of annotations from SCAN in current format.

Note: at the end of the call, saw issues with the access point; it appears to be getting hit with brute-force attempts. Need to tighten firewall rules.

      • Status of InvertEBase setup

David: Live, connected to FP2 node, no annotations created yet. Have to run current harvest to get the data for the current InvertEBase collection.

Bob: Interested in running FP-Akka on the InvertEBase data.

David: Haven't run analysis on full SCAN data set, have been working through collection by collection.

    • FP3.acis
      • State for harvest for NEVP

David: Having issues running the harvest. Ran into some service interruptions from the update of the DB last weekend. Seeing XML parsing exceptions on harvest; there seems to be extra whitespace in unexpected places. Need to evaluate. May not have the view aligned to the current schema. Also running into problems resuming from a previous failed harvest: OAI-PMH isn't providing an order, chunking uses tokens that expire, and there appears to be support for harvest resumption that isn't enabled.
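
For reference, the token-based chunking in OAI-PMH can be sketched as below. This is a minimal illustration in Python, not the project's harvester: `harvest` and the caller-supplied `fetch(params)` function (which should return the response XML as a string) are hypothetical names. Because resumption tokens can expire, the sketch also accepts a `from` date, so a failed harvest can be restarted by date rather than by token. Stripping surrounding whitespace before parsing is only an assumption about one way stray whitespace can break parsing; it would not help with whitespace inside the document.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest(fetch, metadata_prefix="oai_dc", from_date=None):
    """Yield <record> elements from a ListRecords harvest.

    fetch is a hypothetical caller-supplied function: it takes a dict of
    request parameters and returns the response XML as a string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting by date, usable to restart after a failure.
        params["from"] = from_date
    while True:
        # strip() guards against whitespace around the document (assumption).
        root = ET.fromstring(fetch(params).strip())
        for record in root.iterfind(".//oai:record", OAI_NS):
            yield record
        token = root.find(".//oai:resumptionToken", OAI_NS)
        if token is None or not (token.text or "").strip():
            break  # an absent or empty token marks the last chunk
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

Passing the HTTP call in as `fetch` keeps the loop testable without a live server, and leaves retry and timeout policy to the caller.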

  • Morphbank integration

David: Deploying current Morphbank using documentation from Greg and Robert. Need some data, will ask Deb for a current dump. Next step is to test the current integration code, which is checked into a branch of Morphbank and needs to be updated to trunk. Not much code was added, so it might be easier to make a new branch from trunk, copy the small set of changes in, test, and then merge back into trunk.

  • Habitat, Phenology Ontology work.

Paul: No progress yet. Need to talk with Patrick again.

Bob: This also pertains to the year/month, no-day issue.

  • Bertram: Got a reminder about the SPNHC Demo Camp factsheet. Need this info:
   Sample:
   - project website Arctos
   - purpose: Multidisciplinary collection management system for natural history collections. Combines collection management with public web presence to explicitly demonstrate collection usage and integration with other web services. 
   - intended users: Collection managers, curators, collection users, scientists, and anyone interested in collection records
   - classification: Database management system and web application
   - technology & integration: An integrated suite of applications written in ColdFusion and running over Oracle. Collection records in Arctos can be linked to any URI; current links include media hosted at the Texas Advanced Computing Center, GenBank records, and DigiMorph images. Arctos also integrates other web services such as BerkeleyMapper, and its data are automatically available via DiGIR to distributed database networks (HerpNET, MaNIS, ORNIS).
   - licensing model: Code is freely available (http://code.google.com/p/arctos/), but participating collections are strongly encouraged to participate in the shared instance.
   - platform requirements: Client: any modern standards-compliant web browser. Database: Oracle
       
  • SPNHC Kurator talk (this info not needed)
  • SPNHC poster (James et al)
  • Needed for both

   - FP-Akka demo camp ==> need to locate abstract and prepare the required info (Title: A scientific workflowtool for targeted data quality improvement of natural science collection data.)
   - YesWorkflow demo camp: YesWorkFlow: How to Render a Data Curation Script as a Workflow in Under 10 Minutes