2009Jan6

From Filtered Push Wiki

Maureen has been installing Hadoop on three HUH machines. The recent stable version 1.7.x works fine on a single machine, but there are problems connecting between the three.

Technology problems connecting to Kansas... Finally got through.


Discussion of [Secenarios_for_CBoL]. Duplicates: no need to barcode multiple copies of the same duplicate specimen where there is high confidence that they are duplicates. But it is reasonable to barcode cases where there is some, but low, confidence that specimens are duplicates. FP message: are there any barcodes for this duplicate set? If the answer is yes with high confidence, don't barcode; conversely, if the answer is no or low confidence, then barcode.
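The barcoding rule discussed above can be sketched as a small decision function. This is only an illustration of the rule as stated in the notes; the names `Confidence` and `should_barcode` are hypothetical and not part of any FP or CBoL interface.

```python
from enum import Enum


class Confidence(Enum):
    """Hypothetical confidence levels that specimens are duplicates."""
    NONE = 0
    LOW = 1
    HIGH = 2


def should_barcode(set_has_barcode: bool, duplicate_confidence: Confidence) -> bool:
    """Return True if a specimen should be barcoded.

    Skip barcoding only when the duplicate set already has a barcode
    and we are highly confident the specimen belongs to that set.
    """
    if set_has_barcode and duplicate_confidence is Confidence.HIGH:
        return False
    # Answer was "no", or the duplicate match is low confidence: barcode.
    return True
```

In this sketch a low-confidence match is barcoded even when the set already has a barcode, since the new barcode is itself evidence about whether the specimens really are duplicates.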

The CBoL db should have a web service that allows checking whether a particular collection object has been barcoded. Alternate case here is pushing new determinations to that db? This may be a way out for CBoL from having to track all applications of taxon concepts of species. It lets voucher specimens get new determinations, and these determinations be tied to sets of barcodes? This may provide ongoing QA by taxonomists for CBoL data. A barcode can be just another attribute that can be shared amongst FP clients.

Key bit: an FP client can serve as a QA agent to check for contradictions between barcodes and specimen data (symmetrically). This applies generally to any FP application: it can identify contradictions. FP is a convenient way to launch lots of queries against data holders to identify data inconsistencies, e.g. for all barcodes in this genus, are there data inconsistencies in related specimen records? Quality assurance applications are probably the most interesting, more generally, for community networks: getting all the people who are impacted by and/or able to impact the results together. Generically: publication/subscription.
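A minimal sketch of the kind of contradiction check described above, assuming barcode records and specimen records are plain dictionaries that share a `specimen_id` key and each carry a `taxon` name. These field names, the function name, and the sample data are invented for illustration; the notes do not specify a record format.

```python
def find_contradictions(barcode_records, specimen_records):
    """Return (specimen_id, barcode_taxon, specimen_taxon) for each mismatch.

    Record shape is an assumption: dicts with 'specimen_id' and 'taxon'.
    """
    specimens_by_id = {s["specimen_id"]: s for s in specimen_records}
    contradictions = []
    for barcode in barcode_records:
        specimen = specimens_by_id.get(barcode["specimen_id"])
        if specimen is None:
            continue  # no specimen record to compare against
        # Compare names case-insensitively, ignoring surrounding whitespace.
        if barcode["taxon"].strip().lower() != specimen["taxon"].strip().lower():
            contradictions.append(
                (barcode["specimen_id"], barcode["taxon"], specimen["taxon"])
            )
    return contradictions


# Made-up sample data, for illustration only.
barcodes = [
    {"specimen_id": "A1", "taxon": "Quercus alba"},
    {"specimen_id": "A2", "taxon": "Quercus rubra"},
]
specimens = [
    {"specimen_id": "A1", "taxon": "quercus alba"},
    {"specimen_id": "A2", "taxon": "Quercus velutina"},
]
print(find_contradictions(barcodes, specimens))
```

The check is symmetric in the sense of the notes: a mismatch flags both the barcode record and the specimen record for review, without assuming which one is wrong.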


Discussion of simple client applications for socially tagged images in Menalto Gallery or Flickr, for QA of tags and auto-generation of machine tags for taxon names from tags.
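As a rough sketch of generating machine tags from free-text tags: the function below matches tags against a list of known taxon names and emits tags in the `namespace:predicate=value` form. The `taxonomy:binomial` namespace and predicate, the function name, and the name list are assumptions for illustration; the notes do not fix a tag vocabulary or a name source.

```python
def machine_tags_for(tags, known_binomials):
    """Return machine tags for any free-text tag matching a known binomial.

    Matching is case-insensitive; output uses the canonical spelling
    from known_binomials. The 'taxonomy:binomial' prefix is an assumption.
    """
    canonical_by_lower = {name.lower(): name for name in known_binomials}
    machine_tags = []
    for tag in tags:
        canonical = canonical_by_lower.get(tag.strip().lower())
        if canonical is None:
            continue  # not a recognised taxon name
        machine_tag = f"taxonomy:binomial={canonical}"
        if machine_tag not in machine_tags:  # avoid duplicates
            machine_tags.append(machine_tag)
    return machine_tags
```

For example, `machine_tags_for(["oak", "quercus alba", "Boston"], ["Quercus alba"])` yields a single machine tag for *Quercus alba*; tags that match no known name are left for human QA.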

FP can be cast as a community mechanism for quality control of distributed data.

James is re-querying BBG for a dataset of duplicates.

Next meeting: next week (hopefully with a demonstration of three Hadoop installs interacting with FP messages).