2014Mar26

From Filtered Push Wiki


Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2014Mar26

Agenda

Non-Tech

  • Davis: Check for any billing held up by NCE, next week when Kristin returns.
  • James: TDWG session
  • James: Progress on Finishing the SemanticMediaWiki as FP client deliverable.
  • SPNHC (April 25)
  • InvertEBase

Tech

  • Report from Friday call
  • SCAN
    • Report David/Tianhong: Akka integration
    • Report David: Progress on updating deployment.
    • Query for harvested data and analysis results.
    • UI for annotating annotations.
    • Display of annotation on interests.
  • NEVP
    • Report David: Progress on updating deployment
  • Analysis
    • Tianhong: Progress on cleaning data with data
    • Report Bob: Progress on Duplicate Finding data mining.
    • Report Chuck: Duplicate detection integration into Specify workbench/Dina-Specify
  • Nodes
    • Report Maureen: Ingest progress.
    • Report David: Morphbank status.
  • Driver
    • Report David: Status of integration of last working driver version.
    • Report Maureen: Status of new driver approach.
  • For Friday

Reports

  • Jim: No further word from NSF in re our request for a second NCE.
  • Maureen: Ironed out some issues in the keystore access code in FP-Core; added logging and Icinga notification support to the harvester.

Notes

FilteredPush Team Meeting 2014 Mar 26

Present: Bob, Chuck, David, Paul, Maureen, Jim, Tianhong, Bertram

Non-Tech

  • Davis: Check for any billing held up by NCE, next week when Kristin returns.
  • TDWG session - still awaiting response from TDWG
  • Bob: Progress on Finishing the SemanticMediaWiki as FP client deliverable.

Bob: Settled on the week of May 6th. Have started a wiki page for user stories/scenarios.

  • SPNHC (April 25) Discuss targets next week?

Paul: Agenda item for next week.

  • InvertEBase

Jim: Looks like funding will come through at about 50% of the proposed level. Tentative start date July 1.

Tech

  • Report from Friday call

Maureen: Reviewed the NEVP diagram and discussed how to integrate and test harvesting in NEVP. Discussed launching analyses with Akka, driven by ingest.

  • SCAN
    • Report David/Tianhong: Akka integration

David: Query on Mongo to link the analysis result to the occurrence record, compatible with the output of Kepler.

Tianhong: Most of the pieces are in place; still need to resolve some issues with the interfaces.

David: Also still need to settle on how harvest will launch analyses: cron jobs or Camel.

Maureen: Integrated support for Icinga's passive service checks into the harvest process. Harvest will write to a log file and to Icinga's external command file. This is set up and still needs to be tested.
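For reference, a minimal sketch of what submitting a passive service check result to Icinga's external command file can look like. This is not the harvester's actual code. The host name, service description, and command file path are hypothetical placeholders. The line format is the standard PROCESS_SERVICE_CHECK_RESULT external command.

```python
import time


def format_passive_check(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    # A newline ends the command, so keep the plugin output on one line.
    clean_output = output.replace("\n", " ")
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}\n".format(
        timestamp, host, service, return_code, clean_output
    )


def submit_passive_check(command_file, line):
    """Append one command line to the external command file.

    On an Icinga server the command file is normally a named pipe;
    its location depends on the installation (assumption: the caller
    knows the path and has write permission).
    """
    with open(command_file, "a") as handle:
        handle.write(line)


if __name__ == "__main__":
    # Hypothetical host and service names, printed rather than submitted.
    print(format_passive_check("fp2", "harvest", 0, "harvest finished"), end="")
```

The host and service names have to match a service defined in Icinga with passive checks enabled; otherwise the result is discarded.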

    • Report David: Progress on updating deployment.

David: Updated FP2 with the latest version of the node and access point, and updated the client helper on Symbiota 2 for testing; it now includes the response tab. Embedded stuff on FP2 is working; moving now to use Mulgara for the taxon-interest matching.

    • Query for harvested data and analysis results.

David: No work yet.

Tianhong: Akka output is parallel with Kepler's output, so if that works, there is no more work to be done on the Akka side.

David: Next step is to test whether queries can extract the desired results from these data in Mongo.

    • UI for annotating annotations.

David: Response annotation UI up for test on Symbiota 2.

    • Display of annotation on interests.

David: In progress.

  • NEVP
    • Report David: Progress on updating deployment

David: Haven't updated the deployment yet. Most of the supporting software is in place, but not all the supporting pieces, and the updated node software is not yet deployed. Can bring Symbiota 3 up to date as well.

  • Analysis
    • Tianhong: Progress on cleaning data with data

Tianhong: Working on the outlier detection actor. May be able to use MongoDB as a warehouse, or may need some separate fast indexing/mining data store.

Bertram: Can we examine the spatial/temporal/collector space of the current set of harvested data, grouped by collector, location, etc.? Can MongoDB do group-by queries to get an overview of the data like this? Example (try something like this in MongoDB):

    select collectorId, [tripId,] min(coll_t), max(coll_t) from ... group by collectorId [, tripId]

David: Will check that Tianhong has access to FP2 to try these on Mongo there. The data is a harvest from SCAN Symbiota as of about a month ago.

Tianhong: Yes, have access to FP. The occurrence collection has 56680 records.

Paul: Sounds low. Let's check against the Symbiota4 SCAN db.
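MongoDB can answer this kind of group-by through its aggregation pipeline. The sketch below builds a pipeline roughly equivalent to the pseudo-SQL above. It assumes the documents actually carry fields named collectorId, tripId, and coll_t, which are taken from the example query and not from the real harvest schema. The record count per group is an addition not in the original query.

```python
def collector_overview_pipeline(group_by_trip=False):
    """Return an aggregation pipeline giving, per collector (and optionally
    per trip), the earliest and latest collecting time.

    Field names collectorId, tripId and coll_t are assumptions copied from
    the pseudo-SQL; adjust them to the real document structure.
    """
    group_key = {"collectorId": "$collectorId"}
    if group_by_trip:
        group_key["tripId"] = "$tripId"
    return [
        {
            "$group": {
                "_id": group_key,
                "min_coll_t": {"$min": "$coll_t"},
                "max_coll_t": {"$max": "$coll_t"},
                # Extra: number of records in each group.
                "records": {"$sum": 1},
            }
        },
        # Largest groups first.
        {"$sort": {"records": -1}},
    ]


if __name__ == "__main__":
    # With pymongo and access to the database, usage would look like this
    # (connection details and database name are hypothetical):
    #
    #   from pymongo import MongoClient
    #   db = MongoClient("localhost", 27017)["fp"]
    #   results = db.occurrence.aggregate(collector_overview_pipeline())
    #
    print(collector_overview_pipeline(group_by_trip=True))
```

Grouping on a compound _id keeps the optional tripId in the same stage, mirroring the bracketed parts of the SQL.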

    • Report Bob: Progress on Duplicate Finding data mining.

Bob: Recoding vectorization.

    • Report Chuck: Duplicate detection integration into Specify workbench/Dina-Specify

Chuck: Working on getting Specify and Specify-Web up in a development environment. Don't necessarily need to have the latter working in the dev environment; can look at the bookmarklet approach on any running Specify-Web. Got some very helpful feedback from David on getting the Tomcat version of the data entry tool working.

  • Nodes
    • Report Maureen: Ingest progress.

Maureen: Not this week. Sent out summary of batch process.

    • Report David: Morphbank status.

David: Pending dump. Nico has images from scan in there - we need to help Nico on making them public.

  • Driver
    • Report David: Status of integration of last working driver version.

David: Working on annotation tabs/responses in Symbiota.

    • Report Maureen: Status of new driver approach.

Maureen: Worked on keystore access code in FP-Core, got that ironed out.

  • For Friday
    • Turn diagrams into roadmap. See: Media:SCAN.png and Media:NEVP.png
    • Review list of Akka actors for SCAN.
    • Tianhong's data mining approach to cleaning ingested data.
    • Review batch ingest process (Maureen's email).