Informal description of atomic messages: One message = one purpose!!!
Implementation description of atomic messages: One message type = one concrete job.
What is the "message originator" (person? client?) Perhaps both; thus a Header element <xsd:element name="Originator"/> as a complex type consisting of a GUID for the node that originates the message and a string that represents an (untrusted) assertion by the client software of the name of the person who is logged in and using the client to generate the message.
--David Lowery (talk) 18:35, 27 February 2013 (CET) The message originator in the current implementation is an authorized client. Currently we have private/public key pairs that correspond to each authorized client, and the only metadata associated with them is an alias such as "symbiota-scan". The message originator has a stub implementation in org.filteredpush.identity.ClientIdentity.java, which is currently not being used and probably needs to associate more metadata with the client's public key used for authentication. At present the network only knows that someone who made a request was authorized to do something; information about who that someone was is not being retained.
Multiple clients at same node? Probably easier than not
--David Lowery (talk) 18:35, 27 February 2013 (CET) Multiple clients can currently authenticate with the same network node (via the access point). The key from the XML dsig is compared against all certificates of authenticated clients in a PKCS12 keystore. PKCS12 was chosen in favor of the Java keystore format to maintain cross-compatibility with other implementations, e.g. SparqlPuSH, which is written in PHP and can use the SSL libraries to obtain keys from the PKCS12 store.
Atomicity of Message filtering? All or none for now.
To fit Requirement14, FP Messages can be encrypted XML messages.
Requirement15 is met through client software signing FP Messages with XML dsigs, and the FP AccessPoint maintaining a list of authorized client public keys.
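A minimal Java sketch of the access point check described above: a signature is compared against every authorized client key, and the alias of the matching key is returned. This is a simplification under stated assumptions. It signs raw bytes with SHA256withRSA rather than producing an XML dsig, it keeps keys in a map rather than a PKCS12 keystore, and the class and method names (AuthorizedClients, identify, sign, newKeyPair) are hypothetical, not taken from the FilteredPush code.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the access point's list of authorized client keys. */
final class AuthorizedClients {
    // alias (for example "symbiota-scan") -> public key of that client
    private final Map<String, PublicKey> keysByAlias = new LinkedHashMap<>();

    void authorize(String alias, PublicKey key) {
        keysByAlias.put(alias, key);
    }

    /** Returns the alias of the first authorized key that validates the signature, if any. */
    Optional<String> identify(byte[] message, byte[] signature) {
        try {
            for (Map.Entry<String, PublicKey> entry : keysByAlias.entrySet()) {
                Signature verifier = Signature.getInstance("SHA256withRSA");
                verifier.initVerify(entry.getValue());
                verifier.update(message);
                if (verifier.verify(signature)) {
                    return Optional.of(entry.getKey());
                }
            }
            return Optional.empty();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Signature check failed unexpectedly", e);
        }
    }

    /** Client side: sign the raw message bytes with the client's private key. */
    static byte[] sign(byte[] message, KeyPair clientKeys) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(clientKeys.getPrivate());
            signer.update(message);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Signing failed unexpectedly", e);
        }
    }

    /** Generates a throwaway 2048-bit RSA key pair for the sketch. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }
}
```

Returning the alias rather than a boolean is one way the access point could retain who made a request, which is the gap noted in the discussion of the message originator.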
- Network object ID (or list of same[?])
Tradeoff - send message multiple times, or send large complex messages. Implementation level decision?
--David Lowery (talk) 19:05, 27 February 2013 (CET) In the current implementation each message has a unique "messageId" field that contains a UUID. This messageId is also passed back in the response to the client. The client can use the messageId as a handle to obtain the results related to the request initiated by the message (e.g. query results, analysis results, annotations meeting an interest). Internally, the messaging system can attach other response messages to the original message identified by this handle: any message created to store results for retrieval has its inResponseTo field set to the UUID of the original message. A call by the client to check messages with the original handle will return all messages "inResponseTo", or otherwise associated with, the original message.
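The handle mechanism can be sketched in Java roughly as follows. Only the field names messageId and inResponseTo come from the description above; the classes SketchMessage and SketchMessageStore, their methods, and the in-memory list are hypothetical stand-ins for the real messaging system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical, simplified message carrying only the fields discussed above. */
final class SketchMessage {
    final String messageId = UUID.randomUUID().toString();
    final String inResponseTo; // null for an original client request
    final String content;

    SketchMessage(String inResponseTo, String content) {
        this.inResponseTo = inResponseTo;
        this.content = content;
    }
}

/** Hypothetical in-memory stand-in for the messaging system. */
final class SketchMessageStore {
    private final List<SketchMessage> messages = new ArrayList<>();

    /** Stores a client request and returns its messageId as the handle. */
    String submit(String content) {
        SketchMessage request = new SketchMessage(null, content);
        messages.add(request);
        return request.messageId;
    }

    /** Attaches a result to the original request by setting inResponseTo. */
    void attachResponse(String handle, String content) {
        messages.add(new SketchMessage(handle, content));
    }

    /** Returns the content of every message stored in response to the handle. */
    List<String> checkMessages(String handle) {
        List<String> results = new ArrayList<>();
        for (SketchMessage m : messages) {
            if (handle.equals(m.inResponseTo)) {
                results.add(m.content);
            }
        }
        return results;
    }
}
```

A client would call submit once, keep the returned handle, and poll checkMessages with it; responses attached to other requests are not returned.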
- Operation ID
What kind of message is this (FP_TAXON_COMMENT). Extensible and schema based? Portal to community is another edge where messages apply - privileged on one side, not privileged on the other (relevance of xml access controls).
--David Lowery (talk) 19:05, 27 February 2013 (CET) In the current implementation this is the type field of an FPMessage. The representation of a message type is currently extensible (via class hierarchies) but is not schema based. One idea proposed is to use xsd enumerations to represent valid types and to extend these types using xsd union (for example the union of the base fp message type enumerations and the apple pie message type enumerations describes the set of all valid message types in use by the system configured by that schema).
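To make the xsd union idea concrete, here is a small Java sketch that embeds a schema with two enumerations and their union, then validates a type value against it using the standard javax.xml.validation API. The schema is an assumption made for illustration: the simple type names (BaseMessageType, ApplePieMessageType, MessageType), the element name type, and the value APPLE_PIE_EXAMPLE are hypothetical and do not come from fpmessage.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class MessageTypeSchemaSketch {
    // Hypothetical schema: base enumeration + domain enumeration, joined by xs:union.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:simpleType name='BaseMessageType'>"
        + "<xs:restriction base='xs:string'>"
        + "<xs:enumeration value='PING'/>"
        + "<xs:enumeration value='ANNOTATION'/>"
        + "</xs:restriction></xs:simpleType>"
        + "<xs:simpleType name='ApplePieMessageType'>"
        + "<xs:restriction base='xs:string'>"
        + "<xs:enumeration value='APPLE_PIE_EXAMPLE'/>"
        + "</xs:restriction></xs:simpleType>"
        + "<xs:simpleType name='MessageType'>"
        + "<xs:union memberTypes='BaseMessageType ApplePieMessageType'/>"
        + "</xs:simpleType>"
        + "<xs:element name='type' type='MessageType'/>"
        + "</xs:schema>";

    private static final Schema SCHEMA = compile();

    private static Schema compile() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Sketch schema does not compile", e);
        }
    }

    /** True if the value is allowed by the union of both enumerations. */
    static boolean isValidType(String typeName) {
        String doc = "<type>" + typeName + "</type>";
        try {
            SCHEMA.newValidator().validate(new StreamSource(new StringReader(doc)));
            return true;
        } catch (SAXException e) {
            return false; // the validator rejected the value
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout a deployment would extend the set of valid types by adding another enumeration to memberTypes, without touching the base enumeration.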
- Originator of Message
- person Requirement12
- Message Signature
- Signature by the client code that generated the message. The public key is available in the network to validate the signature and to confirm the source of the message as a known client. Requirement15
- Destination of message
FilteredPush messages do not have explicit destinations other than the FilteredPush access point to which they are submitted.
--David Lowery (talk) 19:05, 27 February 2013 (CET) The destination of a message is determined by the jobplanner/jobrunner, which maps the message type to the appropriate job implementation classes. A message is then associated with the job, which contains the code for interpreting the message content according to the scheme property of the message. After extracting the message content, the job invokes the messaging system and stores the message metadata and content in the current MySQL implementation of a messaging database. The schema for the MySQL messagestore is defined by key-value pairs associated with messageIds. Current plans are to implement the messaging using MongoDB: content that can be structured as KVP (such as message type, scheme, and other message metadata) will be stored using the key-value store capabilities of MongoDB, and other content (such as the message content XML) will be stored as binary data using MongoDB's filesystem store (GridFS).
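The mapping from message type to job can be pictured with a short Java sketch. It assumes a registry keyed by the type string; the names SketchJob, SketchJobPlanner, register and dispatch are hypothetical and are not the actual jobplanner/jobrunner classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical job contract: interpret content according to the message scheme. */
interface SketchJob {
    String run(String scheme, String content);
}

/** Hypothetical planner that maps a message type to a job implementation. */
final class SketchJobPlanner {
    private final Map<String, Supplier<SketchJob>> jobsByType = new HashMap<>();

    /** Registers the factory that creates the job for one message type. */
    void register(String messageType, Supplier<SketchJob> factory) {
        jobsByType.put(messageType, factory);
    }

    /** Looks up the job for the message type and runs it on the message. */
    String dispatch(String messageType, String scheme, String content) {
        Supplier<SketchJob> factory = jobsByType.get(messageType);
        if (factory == null) {
            throw new IllegalArgumentException("No job for message type: " + messageType);
        }
        return factory.get().run(scheme, content);
    }
}
```

For example, a query job could be registered with planner.register("QUERY", () -> (scheme, content) -> ...), where the lambda branches on the scheme (SPARQL, KVP and so on) to decide how to read the content.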
It is the responsibility of the FilteredPush network to match messages to interests/destinations.
Are there messages where the destination is other than broadcast on network: No.
Message destination other than access point may be irrelevant as this level of message handling is delegated to the messaging system.
--David Lowery (talk) 19:05, 27 February 2013 (CET) The services provided by the network must be invoked via the access point which authenticates the message. As a result all messages must go through the access point as the single point of entry.
- Message content - still needs elucidation.
--David Lowery (talk) 19:05, 27 February 2013 (CET) Current implementation is Java based and content is a String field of FPMessage (maps to xsd string when deserialized as xml). The proposed schema based implementation with content element typed as xs:anyType as opposed to CDATA would allow for more structured message content as xml in addition to simple strings of character data such as kvp.
- See message types below
Is a Set immutable? One app is stuff that is seemingly a dup, but isn't, or isn't but something determines later that it is.
Are sets non-intersecting?
What are relations between Sets.
How is a Set defined? Hopefully some mix of automatic and people-originated. Replace "Set" with "Cataloged (Virtual)Container" ?
Other General Comments
Sites as collections
Primitive is cataloged collection object?
Composite objects: what investigates passing messages down to composite pieces?
"Cataloged" = "GUIDable"?
Value to an FP network of "I want to know if these investigators ever begin working on these network objects (or objects defined by these properties)". Also, "What do other people say about whether the above is met? Subscribe me to those."
Need authentication of agents (people or software), whether or not through a node, perhaps only through a portal.
- 1 General Concepts
- 2 Transport
- 3 Message arguments
- 4 Sets
- 5 Other General Comments
- 6 Potential Client to Network Messages Listed in Use Cases
- 7 Potential Network to Client Messages listed in Use Cases
- 8 Messages
- 9 Schemes
- 10 Potential Messages
- 10.1 General
- 10.2 Queries
- 10.3 Set Operators
- 10.4 Community Messages
- 11 Next to do
- 12 XML Schema
Potential Client to Network Messages Listed in Use Cases
Use Case Find Duplicates
- makeAssertion ANNOTATION
Use Case AnnotateSpecimen
- makeAnnotation ANNOTATION
- makeAssertion ANNOTATION
- expressInterest ANNOTATION
Use Case Quality Control New Record
- makeAnnotation ANNOTATION
- (also) annotateWorkflow ANNOTATION
- makeAnnotation ANNOTATION
- addToSet ANNOTATION
- makeQuery See: Potential Query Scenarios
- makeAssertion ANNOTATION
- reportAccessPointStatus PING
Use Case Researcher_Seeks_DwC_Metadata
- queryForData (DarwinCore Metadata)
- subscribe REGISTER_INTEREST
Potential Network to Client Messages listed in Use Cases
The Network_Monitoring_Use_Cases could also call for event notification of administrative clients.
The following message types are currently defined and have implementations of jobs unless stated otherwise.
FP Client wishes to know if a FP Access point is listening. Access point returns a MessageID and takes no further action.
FP Client wishes to know status of resources in FP Network Instance. Message instantiates a job that checks and reports on knowledge/messaging/analysis capabilities.
FP Client wishes to execute a query against knowledge. Instantiates the Query job, which will launch the appropriate query depending on the message scheme (e.g. SPARQL, KVP ...).
Used internally to signal an error. Currently no implementation for this type but would probably be attached to the original message that the client has a handle for and would be returned to the client after the call to check for messages.
Wraps an OA/OAD annotation.
The semantics of what is being annotated is delegated to the annotation.
Annotation typing is delegated to rules: See ApplePieRules for typing of annotations.
Annotation messages were typed in the Prototype; this entangles the concerns of the domain with the concerns of the annotation and with the concerns of the transport layer. We do not advocate the structure used in PrototypeTypedAnnotations.
Semantics: originator registers interest in <something>
Originator would like to remove a previously registered interest from the system and stop receiving notifications on it. This is currently not implemented.
Deals with the find duplicates use case.
Run an analysis using the available analysis engine on the data and the named workflow provided as parameters in the message.
--David Lowery (talk) 19:21, 27 February 2013 (CET) We are currently working on an implementation of analysis using Kepler. As of right now, triage will plan and run the analysis job, which invokes Kepler wrapped in an EJB; this EJB prints the message "Start kepler!" to the logs.
Currently not implemented. Enables a client to add to the set of available workflows.
Currently no schemes defined in this class.
The RDF/XML scheme describes the message content of an annotation message type.
The SPARQL scheme describes the message content of a query message.
The KVP scheme describes the message content of a query message or an interest message.
The PROCEDURE scheme describes the message content of a query message that refers to a stored procedure.
--David Lowery (talk) 22:09, 27 February 2013 (CET) The annotation processor currently uses this stored procedure message type to run canned queries stored in the network (or queries that require multiple steps). One example is the getAnnotations(annotationid) query or getResponses(originalAnnotationId). Stored procedures have message content of the form <procedureName>(<procedureArgs>).
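A minimal Java sketch of reading PROCEDURE scheme content of the form <procedureName>(<procedureArgs>). The source does not say how multiple arguments are delimited, so splitting on commas is an assumption here, and the class name ProcedureCall is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parsed form of a stored procedure message content. */
final class ProcedureCall {
    // name, then everything between the outermost parentheses
    private static final Pattern CALL =
            Pattern.compile("\\s*(\\w+)\\s*\\((.*)\\)\\s*", Pattern.DOTALL);

    final String name;
    final List<String> args;

    private ProcedureCall(String name, List<String> args) {
        this.name = name;
        this.args = args;
    }

    /** Parses content such as "getAnnotations(someId)". */
    static ProcedureCall parse(String content) {
        Matcher m = CALL.matcher(content);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not of the form name(args): " + content);
        }
        List<String> args = new ArrayList<>();
        String rawArgs = m.group(2).trim();
        if (!rawArgs.isEmpty()) {
            // Assumption: arguments are comma-separated and contain no commas themselves.
            for (String arg : rawArgs.split(",")) {
                args.add(arg.trim());
            }
        }
        return new ProcedureCall(m.group(1), args);
    }
}
```

Parsing "getResponses(originalAnnotationId)" under these assumptions gives the name getResponses and a single argument; content without parentheses is rejected.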
Note: These retain substantial baggage from the prototype and need to be reworked.
- An asynchronous message has a reply waiting for you (network notifies client)
- one of your subscriptions has a new publication
- I am a data provider with new data available to the network (client notifies network)
- Network has a new subscription that people might be interested in (network broadcasts?)(what about authorization?)
Semantics: A data provider is indicating that data they have available for query or harvest has changed.
Generalization is FP_Messages#FP_ANNOTATION
args: (true, false, accept, not-accept). Semantics: originator is asserting that something is true (and thus accepting it), false (and thus rejecting it), or accepting (or rejecting) it without agreeing or disagreeing with its validity. The fourth case of not-accept emerged in discussions at TDWG 2008, including with Mark Mayfield, who indicated a desire to not accept some subset of new determinations that might be true but which reflected a new combination that their institution might not want to store in their database or record as an annotation on the specimen. The value not-accept is essentially a formal mechanism for ignoring the message (possibly distinguishing institutions that review incoming annotations from those that ignore them). Examples: James accepts all annotations made by Tony. James says that this determination made by Anne is correct. James says that this determination made by Henry is incorrect.
General, or specific subtypes (inventory, find sets, get data)?
Semantics: How many sets do you know about with property X?
Semantics: Which sets do you know about with property X?
Given a set, retrieve all associated data.
These look like they can be generalized, and may be two sorts of operation with some (add/remove) being expressed as annotations, and others (build sets/add generation rule) be analysis instructions.
Message: FP_ADD_SHEET args: SetID; SpecimenID Semantics: message originator is asserting a specimen belongs in the given Set (Is this an annotation, or should there be a way to enforce it?)
Semantics: reverse of FP_ADD_SHEET
Semantics: message originator is describing a novel set of rules for creating sets and determining set membership (e.g. a new rule for building sets of collection objects that have determinations within the same taxonomic concept).
Semantics: given a rule and optionally limiting criteria, build sets with that rule (find all sets of duplicate specimens of Rubus, find all sets of duplicate specimens known to the network).
Need further discussion and elucidation.
Semantics: Notification that work is in progress on a network-identifiable object. Up to client to interpret what to do with the object, e.g. if it is decomposable at the client side. Does this entail producing new, transient(?) identifiable objects?
- Mark as Work In Progress
- Release Work In Progress
- Query for Work In Progress
- Inventory Work In Progress
Next to do
Spell out specifics of these messages. Link to use cases.
For current authoritative representation see: https://sourceforge.net/p/filteredpush/svn/HEAD/tree/trunk/FP-JavaSOA/FP-Modules/FP-Core/src/main/resources/fpmessage.xsd?format=raw