From Filtered Push Wiki

Next meeting 2011Oct25 due to TDWG meeting.

Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2011Oct11



  • Bob:
    • draft xslt-based generator of dwcFPModel from GBIF DwC Occurrence Extension committed to svn
    • email with Lei and call with Lei and Paul about dwcFPModel and impact on mapper (not much, with care).
  • Paul:
    • Discussions with Lei about rules to apply to annotation documents in ApplePie


Relative to my ApplePie proposal on the agenda, I have some email from Markus that clarifies some things about some standard DwC schemas. The original question from me was poorly framed and irrelevant --Bob Morris 01:14, 11 October 2011 (EDT)

Are you asking for explicit relationship types? Having dwc:Taxon properties directly on a dwc:Occurrence record is very legal and intended in Simple DarwinCore. The notion of classes is not used in this occurrence definition, so identification and taxon properties are direct properties of the occurrence. I found over time that the granularity of model classes is rarely the same between application schemas, so I'm actually happy that Darwin Core does not force me to use its class definitions and relations. But maybe I missed your point entirely...
PS: http://rs.gbif.org/core/dwc_occurrence.xml is only used for text based dwc archives, but the Simple DarwinCore XML schema is exactly the same in this regard.
PPS: if the question is about identifiers, there is of course a taxonID and identificationID that can be used to "normalize" the flattened structure if it was given (which it rarely is)

Meeting Notes

FilteredPush Team Meeting 2011 Oct 11

Present: Maureen, Lei, Bob, Bertram, Tim, Paul



TCN collaborations:

Things are moving along with the northeast herbarium group. We haven't heard from the southwest arthropods group lately.

Basis for AppleCore:

What do we start with for the dwcfp model? We need to start with something that's meaningful for the mapper. The model used by the GBIF tools is a good first choice. The question is what gets used in annotations and in the standard representations (XML, spreadsheets, ...); the semantic web work is not about back ends or mapping. The boundary between the semantic web tools and the mapper lies upstream of the mapper.

GBIF Occurrence Extension Schema: created in support of GBIF's tools for harvesting and publishing with IPT. There is a note from Markus that indicates it is only for text-based archives (see above).


One of our key annotations is "new determination." Another likely one we can foresee is "new georeference." Then there is the arbitrary correction, e.g. the "Mangalia" case. From the Consortium of New England Herbaria there is the case of the new occurrence record, flat but relatively rich, with identification, locality, and collecting event data. Can we support all four of these cases with the terms in the schemas?

Yes, possibly at the cost of some rules that say "no more than one scientific name in any given annotation." Those rules should be fairly easy to compose. The alternative is that we add relations and remove the flat structure, which makes it not "simple" any more. Why not present the mapper with something flat, rather than require it to decompose something highly structured?
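A minimal sketch of how such a rule might be composed, not a design decision. It assumes (hypothetically) that an annotation body can be treated as a list of (term, value) pairs, so that a repeated term is detectable. The helper names `at_most_one` and `check` and the sample values are invented for illustration.

```python
from collections import Counter

# Assumption: an annotation body is a flat list of (term, value) pairs.
DWC = "http://rs.tdwg.org/dwc/terms/"


def at_most_one(term):
    """Build a rule that reports a violation if `term` occurs more than once."""
    def rule(pairs):
        count = Counter(t for t, _ in pairs)[term]
        if count > 1:
            return f"{term} occurs {count} times; at most one is allowed"
        return None
    return rule


def check(pairs, rules):
    """Apply every rule and collect the violation messages (empty list = OK)."""
    return [msg for msg in (rule(pairs) for rule in rules) if msg]


# Rules are plain functions, so a rule set is just a list of them.
RULES = [at_most_one(DWC + "scientificName")]

# Hypothetical "new determination" annotation content.
new_determination = [
    (DWC + "catalogNumber", "12345"),
    (DWC + "scientificName", "Quercus alba L."),
    (DWC + "identifiedBy", "A. Botanist"),
]
# Same content with a second scientific name, which the rule should reject.
ambiguous = new_determination + [(DWC + "scientificName", "Quercus rubra L.")]

print(check(new_determination, RULES))
print(check(ambiguous, RULES))
```

Because each rule is an independent function, further constraints of the same kind could be added to the list without touching the flat record structure.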

Note that nothing to do with ABCD appears in the schema. We should not forget about ABCD, and should come back to revisit it later.

Maureen found this: http://code.google.com/p/darwincore/

Informal mapping of Darwin Core to ABCD: http://rs.tdwg.org/dwc/terms/history/dwctoabcd/index.htm

"Simple DWC" is not a schema but a set of principles to which a user should conform, for example, only using a term once per record.

http://rs.gbif.org/core/dwc_occurrence.xml is an XML file (all of GBIF's extension files conform to this format) that names the properties present in a text file of Darwin Core records. This is a constraint file. You'd use JAXB to create Java classes for extension.xsd. You then read this file in, using GBIF-authored tools, as an instance, and use it to guide creation of text files of Darwin Core records (?)
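The idea of reading the extension file as an instance and using it to constrain a text file can be sketched roughly as follows. This uses Python's standard library in place of the JAXB/Java route mentioned above, purely for brevity. The embedded XML is a hand-written, abbreviated stand-in, not the real dwc_occurrence.xml; its namespace and attribute names are assumptions about the extension format and should be checked against the actual file. `allowed_terms` and `write_records` are hypothetical helper names.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumption: hand-written stand-in for an extension file; the real
# dwc_occurrence.xml may use different attributes and many more properties.
SAMPLE = """<extension xmlns="http://rs.gbif.org/extension/">
  <property name="catalogNumber"
            qualName="http://rs.tdwg.org/dwc/terms/catalogNumber"/>
  <property name="scientificName"
            qualName="http://rs.tdwg.org/dwc/terms/scientificName"/>
  <property name="locality"
            qualName="http://rs.tdwg.org/dwc/terms/locality"/>
</extension>
"""

NS = {"ext": "http://rs.gbif.org/extension/"}


def allowed_terms(xml_text):
    """Return the property names declared in the extension file, in order."""
    root = ET.fromstring(xml_text)
    return [p.get("name") for p in root.findall("ext:property", NS)]


def write_records(records, terms):
    """Write tab-delimited text whose columns are exactly the declared terms.

    csv.DictWriter raises ValueError for a record that uses an undeclared term.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=terms, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return out.getvalue()


terms = allowed_terms(SAMPLE)
text = write_records(
    [{"catalogNumber": "12345", "scientificName": "Quercus alba L."}], terms
)
print(text)
```

Using the declared terms as the only permitted columns means each term appears once per record and undeclared terms are rejected at write time.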

ApplePie may have harvesters that harvest text files that conform to the rules in dwc_occurrence.xml.

In the gbif-providertoolkit repository, there is evidence of GBIF tools that use something like dwc_occurrence.xml: https://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/java/org/gbif/ipt/model/Extension.java

IDCC paper: Bertram asked Paolo whether he could present our poster. He said he could do it (provided it's on the "right day").

Bertram's preferences:
  1. Have FP present the poster (= Bob)
  2. Consider Paolo Missier as backup presenter
  3. Withdraw

ApplePie rules:

What kinds of curation requests will be expressed in an annotation? Anyone can feel free to add comments on the discussion page.