2015Jul14

From FilteredPush


Etherpad for meeting notes: http://free.primarypad.com/p/8w8WlGVZ4z

Agenda

Non-Tech

  • Publications
    • Paul/James: Collection Objects
    • Paul/Bob/David: QC Reports
      • Bob: Progress on draft.
    • Bob: Refactoring Dup finding cluster analysis
      • Bob: Access to larger scale infrastructure
    • Bob: List of additional topics
  • Final Report
  • Schedule: Next call in 2 weeks.

Tech

  • Annotation Processor
  • State of Deployments
    • FP2.acis (SCAN, InvertEBase)
    • FP3.acis (NEVP)
      • State for harvest for NEVP
  • Morphbank integration
  • Habitat, Phenology Ontology work.

Notes

Present: Tim, Bertram, David, Bob, Paul, Jim.

Non-Tech

  • Publications
    • Paul/James: Collection Objects

James: Timely, with the Dina workshop coming up in the fall. MS hasn't moved since we last talked about it. Shouldn't be too far off; Paul and Gen need to take another pass.

    • Paul/Bob/David: QC Reports
      • Bob: Progress on draft.

Bob: In progress, have about 1/5th of the text done. Would like to have it in shape by next week. Focus is on presentation of QC results to data curators.

    • Bob: Refactoring Dup finding cluster analysis
      • Bob: Access to larger scale infrastructure

Bob: Largely on the back burner right now. Will be able to make assertions about discoveries here (on why the naive approach hasn't succeeded) for the final report. Further work needs to go into Kurator. No response yet from the high performance infrastructure folks. Paul: Fits well with the second major approach to data cleaning - looking at data sets in bulk, rather than record by record. Need to get some tasks into Jira - something larger for duplicate detection and something smaller for follow-up on access to NCSA infrastructure.

    • Bob: List of additional topics

Paul: Need to get out one more publication summarizing what we've done in FP. Bob: Haven't done a good job of describing FilteredPush variants - light, medium. David: Haven't done a recent deployment of FP-lite. Various Camel configurations at this point. Bob: One paper about success/failure of different modes of deployment. There's the generated Java code that gets hand-edited, not quite by configuration for the domain vocabulary, then there's the wiring by configuration. Paul: What we've built is wiring by configuration (through Camel), and domain vocabulary by configuration at build time (construction of annotation proxy objects in build). Something interesting to say here about open world and producers and consumers being able to communicate. Bob: Perhaps two papers here, one more technical, another to communicate to biologists.

  • Final Report

Due 90 days after the end of the grant. Jim: May or may not be able to build on annual reports, likely to be very picky in details. Bertram: Probably more work than the annual report. Jim: And likely to be read more widely than annual reports.

  • Schedule: Next call in 2 weeks.

Let's have this call again next week.

Tech

  • Annotation Processor

David: Have a deployment on a local machine; it is working with the SPNHC demo state driver. Annotation processor updated to use the Camel framework (e.g. changes to messenger bean); can register interest, get annotations back. Seeing some Hibernate configuration issues in the Driver. Bob: Which version of Specify? David: Reasonably recent Specify 6 from Kansas.

  • State of Deployments
    • FP2.acis (SCAN, InvertEBase)
    • FP3.acis (NEVP)
      • State for harvest for NEVP
  • Morphbank integration

David: Haven't heard anything back from them in a while - we've got a test deployment working locally. They've described their deployment environment, we need to finish testing.

  • Habitat, Phenology Ontology work.

James: Joel and I have been experimenting with this, focus for future work, see a role here for annotations making assertions about the ontologies.