2014Feb19

From Filtered Push Wiki


Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2014Feb19

Agenda

Non-Tech

  • Davis: NCE
  • James: TDWG session

Tech

  • Report from Friday call
  • Analysis
    • Report Tianhong: Progress on updated Kepler Kuration release
    • Report Tianhong: Akka Actors.
    • Report Bob: Progress on Duplicate Finding data mining
    • Report Chuck: Duplicate detection UI.
    • Discussion: Firuta for duplicate detection rollout?
  • Nodes
    • Report Maureen: Ingest progress.
  • NEVP
    • Report Paul: CNH/NEVP Symbiota portal updates going live.
  • Driver
    • Discussion: Driver

Reports

  • Paul
    • A little more work on Symbiota/Specify-HUH ingest for NEVP, another bug fix from testing by Patrick.
  • Chuck
    • Did another demo for Michaela, and integrated her feedback.
    • Went through the Lichen Portal dump and provided explicit mappings for a few more titles.
    • Misc clean-up
    • Next: I want to move more of the configuration to command-line arguments. I've spent enough time tweaking our particular subset of DwC to be convinced that the enumerations, as they now stand, would need to be tweaked for every installation, and I want to make that easier.
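
As an illustration of moving per-installation settings to the command line, here is a minimal sketch in Python. It is not the duplicate detection code itself: the option names `--field` and `--vocabulary`, the `TERM=PATH` form, and the function names are all hypothetical, chosen only to show the pattern of passing the DwC subset and the vocabulary files as arguments rather than hard coding them.

```python
import argparse


def build_parser():
    """Build a parser for hypothetical per-installation settings."""
    parser = argparse.ArgumentParser(
        description="Sketch: installation-specific configuration as arguments."
    )
    # Repeatable option: one Darwin Core term per use (hypothetical name).
    parser.add_argument("--field", action="append", default=[], metavar="TERM",
                        help="Darwin Core term to include; may be repeated")
    # Repeatable option: TERM=PATH pairs naming a vocabulary file per term
    # (hypothetical name and format).
    parser.add_argument("--vocabulary", action="append", default=[],
                        metavar="TERM=PATH",
                        help="controlled vocabulary file for a term")
    return parser


def parse_config(argv):
    """Turn an argument list into a plain configuration dictionary."""
    args = build_parser().parse_args(argv)
    vocabularies = {}
    for item in args.vocabulary:
        term, sep, path = item.partition("=")
        if not sep or not term or not path:
            raise ValueError(f"expected TERM=PATH, got {item!r}")
        vocabularies[term] = path
    return {"fields": args.field, "vocabularies": vocabularies}


if __name__ == "__main__":
    import sys
    print(parse_config(sys.argv[1:]))
```

An installation would then differ only in how the program is invoked, for example with `--field recordedBy --field eventDate --vocabulary country=countries.txt`.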

FilteredPush Team Meeting 2014 Feb 19

Present: Tianhong, Maureen, Chuck, David, Paul.

Agenda

  • Report from Friday call

Maureen: Discussed integrating data mining approaches with data mining workflows. Assigned task of exploring locality outlier detection using Lei's approach to Tianhong.

  • Analysis
    • Report Tianhong: Progress on updated Kepler Kuration release

Tianhong: Mechanics of the release are all worked out. Discussed: Darwin Core archive loading issues. Discussed separating the annotation generation actor so that it does not write to the FP network; not a blocking issue.

    • Report Tianhong: Data mining approach to date outliers.

Tianhong: Put up a wiki page with some discussion. http://wiki.filteredpush.org/wiki/Embedding_Kepler#Approach see Problem 3.

Maureen: Could we do a visualization of the data and allow humans to do the data mining?

Tianhong: Lei had a visualization: a geographic visualization.

Maureen: How do you make outliers visible when you show more than one collector at a time?

Tianhong: Humans could do the detection, but it is likely to be inefficient. In a pipeline workflow, the workflow would need to pause for the human input.

Maureen: The workflow could create a visualization, and users explore that, separate from the workflow.

Chuck: Rather than a text description saying "this seems weird", show a visualization.

Maureen: The question is how to put it into the problem space of visualization, rather than the problem space of "what's an outlier".

Maureen: How about derivatives: not position in space, but rate of travel in space...
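
To make the "rate of travel" idea concrete, here is a minimal sketch in Python. It is not Lei's approach or anything in the Kepler Kuration workflows; it only illustrates the derivative idea under stated assumptions: each collecting event has a collector name, a date, and decimal coordinates, and the 500 km/day threshold is an arbitrary placeholder. The names `Event`, `haversine_km`, and `flag_fast_travel` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    collector: str
    day: date
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


def flag_fast_travel(events, max_km_per_day=500.0):
    """Return (previous, current, km_per_day) for implausibly fast moves.

    Events are grouped per collector and ordered by date. Events on the
    same day are treated as one day apart, to avoid dividing by zero.
    """
    by_collector = {}
    for event in events:
        by_collector.setdefault(event.collector, []).append(event)

    flagged = []
    for collector_events in by_collector.values():
        collector_events.sort(key=lambda e: e.day)
        for prev, cur in zip(collector_events, collector_events[1:]):
            days = max((cur.day - prev.day).days, 1)
            speed = haversine_km(prev.lat, prev.lon, cur.lat, cur.lon) / days
            if speed > max_km_per_day:
                flagged.append((prev, cur, speed))
    return flagged
```

Because each collector is handled separately, this sidesteps the question of showing several collectors at once. A single misplaced record typically produces two flagged pairs: the jump to it and the jump back.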

    • Report Chuck: Duplicate detection UI.

Chuck: Did another demo for Michaela last week. Changes: more fields to map, and only populate from the controlled vocabulary if there is an exact match. Good to have people actually working with it now.

Chuck: Next step: change some of the hard-coded domain configuration to configuration files.

Maureen: Advancing into the Specify web application. Several things are making it difficult; access to multiple forms at once is an issue.

    • Discussion: Firuta for duplicate detection rollout?

Requires: Rapid data entry code from trunk (PHP). The duplicate finding server application is Java and has its own Jetty servlet. Deploy on Paul's workstation for folks to test. Use the production rapid data entry application.

  • Nodes
    • Report Maureen: Ingest progress.

Maureen: Thinking of using more PHP, rather than views, to collect data for the OAI provider.

  • NEVP
    • Report Paul: CNH/NEVP Symbiota portal updates going live.

Patrick has CNH/NEVP production server up to date with two test ingest batches live.

  • SCAN
    • Hackathon

Ed has merged hackathon branch back into trunk.

  • Driver
    • Discussion: Driver

Paul: Proposal: Go back to the driver code as it was working with Specify last fall, and update the annotation processor to use the current infrastructure.

Maureen: First point to tackle: How to select your driver? The simplest case is if the annotation processor is deployed on the same server as the database, configured by deployment, not by the user.

Non-Tech

  • Davis: NCE
  • James: TDWG session