
Etherpad for meeting notes: http://firuta.huh.harvard.edu:9000/FP-2011Sep06


  • Progress on TCN
  • TDWG Demo
  • Progress on AOD
  • Charter for the Mapper
  • Architecture Implementation


  • Paul
    • TDWG meeting system reports that Annotations Interest Group working session is scheduled for sometime on the afternoon of Monday Oct 17.
    • Met with Bob and Maureen, reviewed initial document from southwest soil arthropod TCN, framed additional questions for them.
  • Lei
    • Studied ontology technology and tools, and looked through the AO and the examples
  • James
    • Successfully submitted TDWG abstract for Annotation Symposium
    • Framing a Filtered Push component for a proposal to Genomic Research and Development Initiative (GRDI). Will tentatively include an implementation of the network and development of an API to BOLD. There is also an interesting possibility of annotating metagenomic data at the level of the relationship to specimens and taxonomy. We are also considering using Kepler workflows.


FilteredPush team meeting 2011 Sept 06

Present: Bertram, Tim, Lei, Bob, Maureen, Paul, Jim, James




  • Progress on TCN

Reviewed document from project, framed additional set of questions.

Bob: We've identified a list of FP work and project platform work. It would probably involve a full-time person to support their proposal, paid for by them, working 3/4 time on the platform and 1/4 time on workflow tools.

We need to clarify which responsibilities are whose.

Will there be travel imposed on Kepler people by this?

Where would the person be? Probably Harvard, but travel would be needed to the southwest.

They could also budget to have someone come here, but it might be more useful for someone here to go out there. They should come to our annual meetings.

When is the deadline for submission? October 31, but they need a week lead time. Also, TDWG's meeting happens then.

Does the "other side" know that FP is requesting an FTE? Not yet. Maybe they already have allocated resources?

The work involved: the person would set up a data harvesting/portal system, then create an instance of an FP network integrated with that system, where they could display specimens that need identifications, collect identifications, and send them back to the source databases.

TDWG symposium session: our group will give two talks, an introduction and FP as an example; a closing talk. Do we have a day yet? The Annotations Working Group meets Monday afternoon, so the symposium should follow.

  • TDWG Demo

Maureen: a process is set up for nodes to boot up and join the network; it should be running this week. Nodes can subscribe to queries, perform queries, advertise data sources, and find out who can service those needs.

  • Progress on AOD

Bob: http://etaxonomy.org/mw/Extensions_to_AO The easier diagrams and pointers should be up later tonight. That link has two sets of three approaches each to certain problems that are interesting because of underspecifications in AO and AOD. They are more complicated than what Tim needs to extract data from, so later tonight there will be simpler ones in diagrammatic form.

This also relates to work Paul and Bob have been doing on a manuscript.

Cleanup on the wiki: there are a number of pages on AOD that should be condensed. Bob is working on organizing links to those pages. The current entry point is: http://etaxonomy.org/mw/AOD_Extension_of_AO_for_Data

Lei has some questions about the examples which she will write up and send to Paul.

When should Paul, Lei, and Bob get together to talk about the examples? On the Friday agenda. Emails are welcome before then too.

  • Charter for the Mapper

We discussed this on Friday at the tech meeting. Who will work on defining the DWC layer? In other words, at which level does the mapper start? rs.tdwg.org/voc is what is used in the examples; most of DWC is represented there. There is also a representation of DWC in RDF which is considered official but is not in wide use. Its terms also don't go cleanly into OWL, but they don't try to be ontology definitions, and we'd only use them as predicates.

Framing annotations such as "new determination": "new determination" isn't DWC, but we can express it as DWC. This is a layer that falls between annotation and DWC: domain-specific standards. That's probably something for us to start defining, and perhaps for the TDWG Annotations Interest Group as well.

This parallels AppleCore. It would be possible to include in AppleCore documentation use of DWC elements for annotations. This would be logical to do after TDWG. What are the kinds of assertions that people are calling annotations?

The mapper could be split into an upper layer and a lower layer. What form does the information flow between the two take? If it is the ontology language, that would mean there is no upper layer. Maybe we just need a convention internally, rather than a standard.

Bob says there are two published approaches. One is Roger Hyam's "The TDWG Ontologies," which have a nice structure as an ontology and are easy to use with Protege. The other is "The RDF Form of DWC," which is part of the DWC standards. Few people are actually using it, maybe some of the people using linked open data. We may have enough stability in our examples to test whether we could use this second option in Protege. Bob and Paul can do that and send mail on the results. The Hyam ones are more robust, and the "official" ones are more "official."

In either of those approaches, can you say "here is a complete DWC record" and "here are some alternative values," such that you can say "and here are the deltas"? Nothing in the documentation says anything about what constitutes a "complete" DWC record. Communities are encouraged to submit subsets of the vocabulary that are important to them, and AppleCore should consider doing so.

Another point is that there is a specification called "Simple DWC," which is meant to be something that is guaranteed to be flat and to use any/all of the DWC terms. If you are compliant with Simple DWC, there are few restrictions. The mapper would be required to map all the terms if it is Simple DWC compliant.

To rephrase the question: we want to represent a DWC record, terms and values, maybe not flat. We also want to represent that we have alternative values or proposed changes. Our annotation ontology doesn't describe the record so much as changes to a record. Is there a way of representing that?

The annotation ontology itself does not care; it only specifies where to find the answers to that question in an annotation. Is there something that would be called "an implementation of DWC" (such as Simple DWC: http://rs.tdwg.org/dwc/terms/simple/index.htm) that would allow a set of things that's good enough for us? The answer to that is "no."

What Tim needs to proceed is a patch format.
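As a way of picturing what a patch format might carry, here is a minimal Python sketch. It is not a format the group agreed on. It assumes a flat record keyed by DWC term names, and the names TermChange and apply_patch, along with the specimen values, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TermChange:
    """One proposed change to a single term of a record (hypothetical shape)."""
    term: str                  # DWC term name, e.g. "scientificName"
    old_value: Optional[str]   # value the annotator saw; None if the term was absent
    new_value: Optional[str]   # proposed value; None proposes removing the term


def apply_patch(record: Dict[str, str], changes: List[TermChange]) -> Dict[str, str]:
    """Return a patched copy of record; the input record is left untouched."""
    patched = dict(record)
    for change in changes:
        current = patched.get(change.term)
        # Refuse to apply a change written against a different value.
        if current != change.old_value:
            raise ValueError(
                f"{change.term}: expected {change.old_value!r}, found {current!r}"
            )
        if change.new_value is None:
            patched.pop(change.term, None)
        else:
            patched[change.term] = change.new_value
    return patched


# Illustrative data only: a flat record and a "new determination" as deltas.
record = {
    "catalogNumber": "12345",
    "scientificName": "Quercus alba",
    "identifiedBy": "A. Collector",
}
new_determination = [
    TermChange("scientificName", "Quercus alba", "Quercus rubra"),
    TermChange("identifiedBy", "A. Collector", "B. Annotator"),
]
patched = apply_patch(record, new_determination)
```

Carrying the old value alongside the new one lets the receiving database detect that the record changed after the annotation was written, which is one way to surface conflicting data instead of silently overwriting it.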

Bob proposed using AppleCore [link?], if it is far enough along.

Maureen: http://www.talis.com/tdn/platform/reference/schemas/changesets ?
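For reference, a rough Python sketch of how one proposed change could be laid out in the style of that changeset vocabulary: a ChangeSet resource pointing at reified statements to remove and to add. The sketch assumes the vocabulary's namespace is http://purl.org/vocab/changeset/schema# and uses plain string tuples rather than an RDF library. The helper names, identifiers, and example values are hypothetical, and the distinction between IRIs and literals is ignored.

```python
from typing import List, Tuple

# Namespaces; CS is assumed to be the published changeset schema namespace.
CS = "http://purl.org/vocab/changeset/schema#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DWC = "http://rs.tdwg.org/dwc/terms/"

Triple = Tuple[str, str, str]


def reified(statement_id: str, s: str, p: str, o: str) -> List[Triple]:
    """Describe the triple (s, p, o) as an rdf:Statement resource."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


def changeset(changeset_id: str, subject: str, predicate: str,
              old: str, new: str, reason: str) -> List[Triple]:
    """Build the triples for replacing one value of one predicate."""
    removal_id = changeset_id + "/removal"
    addition_id = changeset_id + "/addition"
    triples: List[Triple] = [
        (changeset_id, RDF + "type", CS + "ChangeSet"),
        (changeset_id, CS + "subjectOfChange", subject),
        (changeset_id, CS + "changeReason", reason),
        (changeset_id, CS + "removal", removal_id),
        (changeset_id, CS + "addition", addition_id),
    ]
    triples += reified(removal_id, subject, predicate, old)
    triples += reified(addition_id, subject, predicate, new)
    return triples


# Illustrative identifiers and values only.
triples = changeset(
    "http://example.org/changeset/1",
    "http://example.org/specimen/12345",
    DWC + "scientificName",
    "Quercus alba",
    "Quercus rubra",
    "new determination",
)
```

A changeset of this shape describes a change to a record and not the record itself, which is close to how the annotation ontology was characterized in the discussion.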

Tim and Lei think the question might be moot if the upper layer interacts with the user and lets the user describe the mapping.

At what point, if any, do we support mapping when there are redundant ways in the underlying vocabulary to specify things? How do we resolve ambiguous or conflicting data?

We can put another item on the Friday agenda for sketching out the flow of annotations through the mapper from the network to the local database.

  • Architecture Implementation