2014Mar26
Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2014Mar26
Agenda
Non-Tech
- Davis: Check for any billing held up by NCE, next week when Kristin returns.
- James: TDWG session
- James: Progress on Finishing the SemanticMediaWiki as FP client deliverable.
- SPNHC (April 25)
- InvertEBase
Tech
- Report from Friday call
- SCAN
- Report David/Tianhong: Akka integration
- Report David: Progress on updating deployment.
- Query for harvested data and analysis results.
- UI for annotating annotations.
- Display of annotation on interests.
- NEVP
- Report David: Progress on updating deployment
- Analysis
- Tianhong: Progress on cleaning data with data
- Report Bob: Progress on Duplicate Finding data mining.
- Report Chuck: Duplicate detection integration into Specify workbench/Dina-Specify
- Nodes
- Report Maureen: Ingest progress.
- Report David: Morphbank status.
- Driver
- Report David: Status of integration of last working driver version.
- Report Maureen: Status of new driver approach.
- For Friday
- Turn diagrams into roadmap. See: Media:SCAN.png and Media:NEVP.png
Reports
- Jim: No further word from NSF in re our request for a second NCE.
- Maureen: Ironed out some issues in the keystore access code in FP-Core; added logging and Icinga notification support to the harvester.
Notes
FilteredPush Team Meeting 2014 Mar 26
Present: Bob, Chuck, David, Paul, Maureen, Jim, Tianhong, Bertram
Non-Tech
- Davis: Check for any billing held up by NCE, next week when Kristin returns.
- TDWG session - still awaiting response from TDWG
- Bob: Progress on Finishing the SemanticMediaWiki as FP client deliverable.
Bob: Settled on the week of May 6th. Have started a Wiki page for user stories/scenarios.
- SPNHC (April 25) Discuss targets next week?
Paul: Agenda item for next week.
- InvertEBase
Jim: Looks like funding will come through at about 50% of proposed level. Tentative start date July 1.
Tech
- Report from Friday call
Maureen: Reviewed NEVP diagram and discussed how to integrate and test harvesting in NEVP. Discussed launch of analyses with Akka driven by ingest.
- SCAN
- Report David/Tianhong: Akka integration
David: Query on Mongo to link the analysis result to the occurrence record, compatible with the output of Kepler.
Tianhong: Most of the pieces are in place, still need to resolve some issues with the interfaces.
David: Also still need to settle on how harvest will launch analyses - cron jobs or Camel.
Maureen: Integrated support for Icinga's passive service checks into the harvest process - harvest will write to a log file and Icinga's external command file. Have this set up and need to test.
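The passive service check Maureen describes can be sketched as follows. This is a minimal illustration, not the harvester's actual code: it formats one PROCESS_SERVICE_CHECK_RESULT line in the form Icinga reads from its external command file. The host name "fp2", the service name "harvest" and both function names are hypothetical, and the path of the command file is left as a parameter because it depends on the local Icinga configuration.

```python
import time


def format_passive_check(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        raise ValueError("return_code must be 0, 1, 2 or 3")
    timestamp = int(time.time() if now is None else now)
    # Each command is one line, so newlines in the plugin output are flattened.
    clean_output = output.replace("\n", " ")
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}\n".format(
        timestamp, host, service, return_code, clean_output)


def submit_passive_check(command_file, line):
    """Write the line to the external command file (normally a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical host and service names, fixed timestamp for illustration.
example = format_passive_check("fp2", "harvest", 0, "harvest finished", now=1395840000)
# example == "[1395840000] PROCESS_SERVICE_CHECK_RESULT;fp2;harvest;0;harvest finished\n"
```

The service named in the line has to be defined in Icinga with passive checks enabled for the result to be accepted.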
- Report David: Progress on updating deployment.
David: Updated FP2 with the latest version of the Node and access point, and updated the client helper on Symbiota 2 for testing (includes response tab). Embedded stuff on FP2 is working; will now move to using Mulgara for the taxon-interest matching.
- Query for harvested data and analysis results.
David: No work yet.
Tianhong: Akka output is parallel with Kepler's output, so if that works, then no more work to be done on the Akka side there.
David: Next step is to test if queries can extract the desired results from these data in Mongo.
- UI for annotating annotations.
David: Response annotation UI up for test on Symbiota 2.
- Display of annotation on interests.
David: In progress.
- NEVP
- Report David: Progress on updating deployment
David: Haven't updated the deployment yet. Most of the supporting software is in place, but not all of the supporting pieces, and the updated node software hasn't been deployed yet. Can bring Symbiota 3 up to date as well.
- Analysis
- Tianhong: Progress on cleaning data with data
Tianhong: Working on outlier detection actor.
Tianhong: May be able to use MongoDB as a warehouse or may need some separate fast indexing/mining data store.
Bertram: Can we examine the spatial/temporal/collector space of the current set of harvested data - group by collector/location, etc.? Can MongoDB do group by queries to get an overview of the data like this? Example (try something like this in MongoDB):
select collectorId, [tripId,] min(coll_t), max(coll_t) from ... group by collectorId [, tripId]
David: Will check that Tianhong has access to FP2 to try these on Mongo there; data is a harvest from SCAN Symbiota as of about a month ago.
Tianhong: Yes, have access to FP, collection occurrence has 56680 records.
Paul: Sounds low - let's check against Symbiota4 SCAN db.
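Bertram's SQL-style example could be expressed as a MongoDB aggregation pipeline. The following is a minimal sketch, assuming the occurrence documents carry fields named collectorId, tripId and coll_t; these are the placeholder names from the example and have not been checked against the harvested schema. The function names and the output field names first_coll_t and last_coll_t are made up for illustration. A plain-Python equivalent is included so the intent can be checked on a few sample rows without a database.

```python
def build_group_pipeline(by_trip=False):
    """Return an aggregation pipeline that groups occurrences by collector
    (and optionally trip) with the earliest and latest collecting time."""
    group_key = {"collectorId": "$collectorId"}
    if by_trip:
        group_key["tripId"] = "$tripId"
    return [
        {"$group": {
            "_id": group_key,
            "first_coll_t": {"$min": "$coll_t"},
            "last_coll_t": {"$max": "$coll_t"},
        }},
        {"$sort": {"_id.collectorId": 1}},
    ]


def group_locally(records, by_trip=False):
    """Same grouping in plain Python, for checking intent on sample rows."""
    groups = {}
    for rec in records:
        key = (rec["collectorId"],)
        if by_trip:
            key += (rec.get("tripId"),)
        t = rec["coll_t"]
        first, last = groups.get(key, (t, t))
        groups[key] = (min(first, t), max(last, t))
    return groups


# With pymongo and a live database this would be run roughly as:
#   results = db.occurrence.aggregate(build_group_pipeline(by_trip=True))
```

Here occurrence is the collection name Tianhong mentions; the pipeline itself is plain data, so it can also be pasted into the mongo shell.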
- Report Bob: Progress on Duplicate Finding data mining.
Bob: Recoding vectorization.
- Report Chuck: Duplicate detection integration into Specify workbench/Dina-Specify
Chuck: Working on getting Specify and Specify-Web up in a development environment - don't necessarily need to have the latter working in the dev environment, can look at the bookmarklet approach on any running Specify-Web.
Chuck: Got some very helpful feedback from David on getting the Tomcat version of the data entry tool working.
- Nodes
- Report Maureen: Ingest progress.
Maureen: Not this week. Sent out summary of batch process.
- Report David: Morphbank status.
David: Pending dump. Nico has images from scan in there - we need to help Nico make them public.
- Driver
- Report David: Status of integration of last working driver version.
David: Working on annotation tabs/responses in Symbiota.
- Report Maureen: Status of new driver approach.
Maureen: Worked on keystore access code in FP-Core, got that ironed out.
- For Friday
- Turn diagrams into roadmap. See: Media:SCAN.png and Media:NEVP.png
- Review list of Akka actors for SCAN
- Tianhong's data mining approach to cleaning ingested data.
- Review batch ingest process (Maureen's email)