2013Dec18

From Filtered Push Wiki


Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2013Dec18

Agenda

  • Summary of Friday Call
  • Analysis
    • Report: Current state of Kepler/Akka work.
      • CSV load/save actors.
      • Date Validation flowchart
      • Vertnet github issue filing (pending response from Vertnet).
    • Discussion: Duplicate Finding
  • Driver
    • Report: Development status
    • Report: Progress on Validation of ingest of all current annotation types (new determination, updated determination, new georeference, updated georeference, new occurrence, updated locality).
  • SCAN TCN Support
    • Revisit sanity check.
    • Planning for visit in week of Jan 6.
  • NEVP TCN Support
  • Annotations in OCR/Crowdsourcing pathways
    • Report: Progress on implementation of OAI/PMH harvesting through firewalls.
  • iDigBio integration, possible schedule in February.
    • Discussion: occurrenceID as well as DarwinCoreTriplet, structure of selector(s)?
  • FP Infrastructure
    • Report: Status of FP Node Refactoring

Non-Tech

  • Burndown rate increase items.
    • Davis burndown rate.
    • Documentation and UI work.

Next Week

  • Duplicate finding
  • Driver
    • Plan for validation of future annotation types (updated habitat, phenological state descriptions).
    • Symbiota Driver, which appears to be a need from SCAN.
  • Analysis
    • Discussion: Proposal for ranking annotations based on queries, not repeated analysis of same records.

Reports

Notes

FilteredPush Team Meeting 2013 Dec 18. Present: Bob, Maureen, Paul, David, Tianhong, Bertram, Jim, James.

Agenda

  • Summary of Friday Call

Maureen: Went over Chuck's demonstration for the Hackathon (demonstrating storing a transcription as an annotation, querying for it, and likewise for a consensus assertion represented as an annotation) running on FP1.acis. Went over CSV input/output and date validation with Tianhong.
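
The store-then-query pattern in Chuck's demonstration can be sketched as follows. This is a minimal illustration of the idea, not the FP1 API; the names (Annotation, AnnotationStore) and fields are hypothetical stand-ins.

```python
# Hypothetical sketch: store a transcription as an annotation, then query
# for it; a consensus assertion is stored the same way with a different
# motivation. Names and fields here are illustrative, not the FP1 API.

class Annotation:
    def __init__(self, target, body, motivation):
        self.target = target          # e.g. a specimen record identifier
        self.body = body              # e.g. the transcribed text
        self.motivation = motivation  # e.g. "transcribing" or "consensus"

class AnnotationStore:
    def __init__(self):
        self._annotations = []

    def store(self, annotation):
        self._annotations.append(annotation)

    def query(self, target=None, motivation=None):
        return [a for a in self._annotations
                if (target is None or a.target == target)
                and (motivation is None or a.motivation == motivation)]

store = AnnotationStore()
store.store(Annotation("specimen:123", "Quercus alba, 1897", "transcribing"))
store.store(Annotation("specimen:123", "Quercus alba L., 1897", "consensus"))

hits = store.query(target="specimen:123", motivation="consensus")
```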

  • Analysis
    • Report: Current state of Kepler/Akka work.
      • CSV load/save actors.

Tianhong: Have changed the code in Akka to use the CSV library.
Bertram: For output of DarwinCore, what provenance terms are available in DarwinCore? If there are none, how do we add something inside the records to track the provenance? => Discuss Friday?
Maureen: OK. :)
Paul: A record in this case is a spreadsheet row - or, as we have had in the output delivered to the annotation processor, additional terms on one sheet and then more detail on linked sheets.
Bertram: Yes, that's precisely the discussion we should have :-)
Bob: Provenance terms from W3C are available, and can be particularized to the domain.
Bob: Sounds much like the information Kepler adds as provenance.
Bertram: Would like to be free from Kepler; step back and look at the provenance that works for us. Looking for terms to put inside the record rather than external to it.
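
One option from the discussion - carrying provenance inside each record by appending extra columns to the DarwinCore row - can be sketched as below. The column names borrow W3C PROV terms as Bob suggested, but they are an assumption for illustration, not an agreed convention.

```python
import csv
import io
from datetime import datetime, timezone

# Sketch: append provenance columns (borrowed from W3C PROV; illustrative,
# not an agreed convention) to each DarwinCore row on CSV output.

DWC_COLUMNS = ["occurrenceID", "scientificName", "eventDate"]
PROV_COLUMNS = ["prov:wasGeneratedBy", "prov:generatedAtTime"]

def write_with_provenance(rows, actor_name, out):
    """Write DarwinCore rows, stamping each with in-record provenance."""
    writer = csv.DictWriter(out, fieldnames=DWC_COLUMNS + PROV_COLUMNS)
    writer.writeheader()
    stamp = datetime.now(timezone.utc).isoformat()
    for row in rows:
        row = dict(row)  # copy; do not mutate the caller's data
        row["prov:wasGeneratedBy"] = actor_name
        row["prov:generatedAtTime"] = stamp
        writer.writerow(row)

buf = io.StringIO()
write_with_provenance(
    [{"occurrenceID": "urn:uuid:1", "scientificName": "Quercus alba",
      "eventDate": "1897-05-04"}],
    "DateValidator", buf)
```

The provenance travels with the record through any CSV-aware step, at the cost of widening the schema; the alternative (external provenance, as Kepler records it) keeps records clean but ties them to the workflow engine.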

      • Date Validation flowchart

http://wiki.filteredpush.org/wiki/Embedding_Kepler#Approach http://wiki.filteredpush.org/wiki/File:Date1.png Discussion of issues/questions. Several of these look like data mining issues. Tianhong and Bob to coordinate on exploring this.
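
The actual decision rules live in the flowchart linked above; as a hedged illustration only, one branch of a date-validation actor might look like the sketch below (the function name, statuses, and cutoff are hypothetical).

```python
from datetime import date

# Hypothetical sketch of one branch of a date-validation check: does the
# eventDate parse as ISO 8601, and is it plausible (not in the future)?
# The real rules are in the Date1 flowchart; this is illustrative only.

def validate_event_date(event_date, latest=date(2013, 12, 18)):
    """Return a (status, comment) pair for a DarwinCore eventDate string."""
    try:
        parsed = date.fromisoformat(event_date)
    except ValueError:
        return ("unable to validate", "eventDate is not ISO 8601")
    if parsed > latest:
        return ("invalid", "eventDate is in the future")
    return ("valid", "eventDate parses and is plausible")
```

A free-text date like "May 4, 1897" falls into the "unable to validate" branch - which is where the data-mining questions raised in the discussion come in.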

      • Vertnet github issue filing (pending response from Vertnet).

Still pending response.

    • Discussion: Duplicate Finding
  • Driver
    • Report: Development status

Maureen: Still working on untangling UI from part of workbench that writes to database.

    • Report: Progress on Validation of ingest of all current annotation types (new determination, updated determination, new georeference, updated georeference, new occurrence, updated locality).

Paul: How are we going to set up this testing framework?
Maureen: Expect that this testing will be at the level of integration testing of the annotation processor with the driver. More detailed tests are needed within the annotation processor itself. The driver takes a map of key-value terms; much of the annotation processor transforms annotations into these maps.
Paul: Set up a test set of maps, with matching examples, to test the driver's ability to write as expected into the database. Then a stub to run those into the driver. Then embed tests of correct construction of the maps into the annotation processor.
Bob: Specifying tests involves laying out what client requirements are trying to be met; the cases are RDF-consuming clients and map-consuming clients (the driver). For the first, are all the rules met?
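
The fixture Paul described - known key/value maps fed into a stub in place of the database-writing driver - could be sketched like this. The names (StubDriver, the dwc-prefixed keys) are assumptions for illustration.

```python
# Sketch of the proposed test fixture: a known key/value map of the form
# the driver consumes, fed to a stub that records writes instead of
# touching the database. StubDriver and the map keys are hypothetical.

new_determination = {
    "dwc:occurrenceID": "urn:uuid:1",
    "dwc:scientificName": "Quercus alba L.",
    "dwc:identifiedBy": "P. Morris",
}

class StubDriver:
    """Records the term maps it is asked to apply, for later assertions."""
    def __init__(self):
        self.written = []

    def apply(self, term_map):
        if "dwc:occurrenceID" not in term_map:
            raise ValueError("map lacks a record identifier")
        self.written.append(dict(term_map))

driver = StubDriver()
driver.apply(new_determination)
```

The same maps can then be reused in the annotation processor's own tests, checking that each annotation type (new determination, updated georeference, and so on) is transformed into the expected map.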

  • SCAN TCN Support
    • Revisit sanity check.
    • Planning for visit in week of Jan 6.
  • NEVP TCN Support
    • Annotations in OCR/Crowdsourcing pathways
    • Report: Progress on implementation of OAI/PMH harvesting through firewalls.
  • iDigBio integration, possible schedule in February.
    • Discussion: occurrenceID as well as DarwinCoreTriplet, structure of selector(s)?
  • FP Infrastructure
    • Report: Status of FP Node Refactoring

David: Have a Maven assembly that builds a subset of the libraries into a jar. Remaining issue when not running in the IDE: EJBs aren't being found by JNDI name outside the IDE. It looks easier to deploy as a war than a jar; still need to do this assembly.

Non-Tech

  • Burndown rate increase items.
    • Davis burndown rate.

Bertram: Sent the query on to the financial folks; haven't heard back yet.

    • Documentation and UI work.

Jim: Now is the time to prioritize the remaining tasks.
Paul: Will meet with Damari shortly to find out what is feasible in contract/lite-hire work. On schedule to rework the roadmap and focus on priorities in the second week of January.