Test Data Sets

From FilteredPush

Paul's Three Data Sets

Webprojects has three Specify databases, one for each of Paul's three test data sets. The mapping of database names to data sets is:

  • testfrog: dataset a
  • testfish: dataset b
  • testfern: dataset c

The database names do not have any particular significance (i.e. there are no frog records in testfrog).

Umbfp has one Specify database:

  • testfish: dataset b

The duplicate records (those sharing a collector number and collector name), grouped by the pair of data sets involved:

 | CatalogNumber | Barcode   | CollectorNumber | Collectors    | Taxon                   |

 a  & c 

 | 431           | 000000267 | 4887            | B. A. Krukoff | Tovomita krukovii       | 
 | 432           | 000000268 | 4887            | B. A. Krukoff | null null               | 

 b  & c

 | 448           | 000000284 | 1866            | G. Klug       | Graffenrieda colombiana | 
 | 453           | 000000289 | 1866            | G. Klug       | Graffenrieda colombiana | 

 a & b
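Duplicates of this kind (same collector number and collector name) can be found by grouping records on those two fields. The following is a minimal sketch, assuming the records are available as dictionaries keyed by the column names in the table above; the function name `find_duplicates` and the last sample row are hypothetical and not part of the test data.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group catalog numbers by (CollectorNumber, Collectors) and keep
    only the groups that contain more than one record."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["CollectorNumber"].strip(), rec["Collectors"].strip())
        groups[key].append(rec["CatalogNumber"])
    return {key: cats for key, cats in groups.items() if len(cats) > 1}


# The four rows listed above, plus one hypothetical non-duplicate row
# added only for contrast.
records = [
    {"CatalogNumber": "431", "CollectorNumber": "4887", "Collectors": "B. A. Krukoff"},
    {"CatalogNumber": "432", "CollectorNumber": "4887", "Collectors": "B. A. Krukoff"},
    {"CatalogNumber": "448", "CollectorNumber": "1866", "Collectors": "G. Klug"},
    {"CatalogNumber": "453", "CollectorNumber": "1866", "Collectors": "G. Klug"},
    {"CatalogNumber": "999", "CollectorNumber": "1", "Collectors": "A. Nonymous"},  # hypothetical
]

duplicates = find_duplicates(records)
for (number, collector), catalog_numbers in duplicates.items():
    print(collector, number, catalog_numbers)
```

Matching here is exact after trimming whitespace; variant spellings of a collector name would need additional normalization, which this sketch does not attempt.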


Simple Data set of 20 records

Two tables with a one-to-one relationship, slightly more complex than DarwinCore, but only trivially so:

Collection object with a barcode/catalog number, a determination, a type status, and a collector's number. Fields: BARCODE HERB_ACRONYM FORMAT REPRO TAXON TYPE_STATUS COLLECTOR_NO SITE_ID, where site_id is the internal HUH database foreign key value for the site of the collecting event at which this specimen was collected, and collector_no is the field number (collector's number) that the botanist who collected the specimen assigned to it at the time of collection.

Tab delimited file: Image:Test1_collection_object.csv

Collecting event with a collector, locality description, and geopolitical placement. Fields: SITE_ID COLLECTOR START_YEAR START_MONTH START_DAY LOCALITY H_COUNTRY_NAME H_PRIMARY_NAME, where start_year, start_month, and start_day give the date collected, and h_primary_name is the name of the state/province-level geopolitical entity, within the country, where the collecting event occurred.

Tab delimited file: Image:Test1_collecting_event.csv

The two tables can be joined with:

select BARCODE, HERB_ACRONYM, FORMAT, REPRO, TAXON, TYPE_STATUS, COLLECTOR_NO,
       COLLECTOR, START_YEAR, START_MONTH, START_DAY, LOCALITY, H_COUNTRY_NAME, H_PRIMARY_NAME
from collection_object
left join collecting_event on collection_object.SITE_ID = collecting_event.SITE_ID
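The same left join can be reproduced outside a database by reading the two tab delimited files directly. The following is a minimal sketch, assuming each SITE_ID appears at most once in the collecting event file (the one-to-one relationship described above). The helper names `read_tab_delimited` and `left_join` are hypothetical, and the inline sample rows are made-up placeholders with only a subset of the fields, not values from the test files.

```python
import csv
import io


def read_tab_delimited(handle):
    """Read a tab delimited file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def left_join(collection_objects, collecting_events, key="SITE_ID"):
    """Left join collection objects to collecting events on `key`.
    Collection objects with no matching event get None for event fields."""
    events_by_site = {event[key]: event for event in collecting_events}
    first_event = collecting_events[0] if collecting_events else {}
    event_fields = [field for field in first_event if field != key]
    joined = []
    for obj in collection_objects:
        event = events_by_site.get(obj[key])
        row = dict(obj)
        for field in event_fields:
            row[field] = event[field] if event is not None else None
        joined.append(row)
    return joined


# Placeholder data standing in for the two files; with the real files, use
# open("Test1_collection_object.csv", newline="") and
# open("Test1_collecting_event.csv", newline="") instead of io.StringIO.
objects_tsv = (
    "BARCODE\tTAXON\tCOLLECTOR_NO\tSITE_ID\n"
    "00000001\tExample taxon\t12\t100\n"
    "00000002\tAnother taxon\t13\t999\n"
)
events_tsv = (
    "SITE_ID\tCOLLECTOR\tLOCALITY\n"
    "100\tExample Collector\tExample locality\n"
)

collection_objects = read_tab_delimited(io.StringIO(objects_tsv))
collecting_events = read_tab_delimited(io.StringIO(events_tsv))
joined = left_join(collection_objects, collecting_events)
for row in joined:
    print(row)
```

Note that the files carry a .csv extension but are tab delimited, so the delimiter has to be set explicitly when reading them.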

The data set was generated from ASA with the query:

select barcode, herb_acronym, format, repro, taxon, type_status, collector_no,
       collector, start_year, start_month, start_day, locality, h_country_name, h_primary_name
from view_specimen_leftjoin_item, view_type_specimen_join_taxon, view_site_and_geo
where (collector_id = 138403 or collector_id = 109283 or collector_id = 104041)
and view_specimen_leftjoin_item.id_specimen = view_type_specimen_join_taxon.specimen_id
and view_specimen_leftjoin_item.site_id = view_site_and_geo.id_site
order by taxon_id