Scientific Name Validator

From Filtered Push Wiki

Requirements

Given a list of scientific names (as name strings, in TDWG DarwinCore terms, with or without authorship, with or without higher taxa and nomenclatural code):

  • Return the list of names with validation flags, added GUIDs, added authorship strings, and provenance metadata.
  • If the name string lacks authorship, and authorship can be unambiguously determined, add an authorship string.
  • If the name string has an incorrect or misspelled authorship, and authorship can be unambiguously determined, add a corrected authorship string.
  • If the name string lacks a reference to a GUID in a nomenclatural authority, and a GUID can be unambiguously determined, add the GUID.
  • If the name string could refer to more than one nomenclatural act, attempt to determine which nomenclatural act is involved based upon the name string and any provided higher taxonomic information. If this can be determined, provide the GUID and authorship string from the nomenclatural authority, and provenance metadata indicating the homonym resolution.
  • If the name string could refer to more than one nomenclatural act, and a unique nomenclatural act cannot be determined from the information given (e.g. the name is a homonym without sufficient information to disambiguate the homonymy), flag the name as nomenclaturally ambiguous (SOLVE_WITH_MORE_DATA), and provide potential homonym matches in the provenance metadata.
  • If the name string is, with high confidence, a misspelling of a name string for which there is a GUID in a nomenclatural authority, propose the correction and provide the measure of confidence as provenance metadata for the proposed correction.
  • Users may request taxonomic resolution of the name strings.
  • If taxonomic resolution has been requested, provide a list of zero, one, or more than one current names (as strings and with GUIDs) which may, according to some taxonomic authority, be the current name to use for the provided name according to that taxonomic authority, along with provenance information about the authority and the names.
  • If population of higher classification has been requested, and a unique nomenclatural act can be resolved, fill in each DarwinCore term for Linnean ranks above Genus up to Kingdom for which a value has not been provided in the initial data set (not producing annotations), with the choice of source for higher classification being by configuration (GBIF nub being a reasonable default).
  • If normalization of higher classification has been requested, if a unique nomenclatural act can be resolved, provide each DarwinCore term for Linnean ranks above Genus up to Kingdom (correcting values provided in the input data, not producing annotations), with the choice of source for higher classification being by configuration (GBIF nub being a reasonable default).

Two core use case scenarios (for more context see Scientific_Name_Validator#Detailed_Example_Scenarios).

  1. A data curator is validating the names that occur in their dataset for spelling and nomenclature. They wish to flag any name strings that can't be mapped to scientific names, add GUIDs that let them associate the name strings with entries in nomenclators (e.g. IPNI LSIDs), and fill in reasonable values for missing higher taxonomic ranks.
  2. A researcher wishes to normalize and quality control a data set (either provided by them, or provided by a query on harvested data in a FilteredPush network instance). They wish to obtain a name in current use for each name found in the data set, and flag names for which a unique name in current use can be obtained.

Implementation Plan

Steps

0. Internal Consistency

  • Compare the dwc:scientificName and dwc:scientificNameAuthorship with atomic terms: dwc:genus, dwc:subgenus, dwc:specificEpithet, dwc:infraspecificEpithet, dwc:taxonRank, dwc:verbatimTaxonRank, dwc:scientificNameAuthorship to test for inconsistency.
    • There are expected inconsistent cases: quadranomials and hybrids can't be represented with the available atomic terms.
  • Cases:
    • Example inconsistency with contradictory data: dwc:scientificName=Quercus alba L. dwc:genus=Solanum dwc:specificEpithet=aculeatissimum dwc:scientificNameAuthorship=Jacq.
    • Example mapping error: dwc:scientificName=Quercus alba L. dwc:genus=Quercus alba dwc:specificEpithet=L. dwc:scientificNameAuthorship=
    • Example subtle error in consistency: Nassarius persica Martens, 1874 - could be either Nassa persica Martens, 1874 or Nassarius persicus (Martens, 1874). The error could be Nassarius instead of Nassa (meaning the original combination), or persica instead of persicus together with missing parentheses around the authorship (meaning the changed combination).
    • Hypericum canadense L. × H. majus (A. Gray) Britton [hybrid]
      • scientificName: Hypericum canadense L. × H. majus (A. Gray) Britton
      • genus: Hypericum
      • species and scientificnameauthorship don't map cleanly, as ×, L., (A.Gray) Britton, majus, and canadense could be mapped in any of a multitude of ways into these two fields, e.g. species=majus authorship=(A. Gray) Britton.
    • Hypericum ×dissimulatum E.P.Bicknell [named hybrid] likely to be provided as genus=Hypericum; species=×dissimulatum; authorship=E.P.Bicknell
    • Salix candida Fluggé ex Willd. × S. petiolaris Sm. [hybrid]
      • scientificName: Salix candida Fluggé ex Willd. × S. petiolaris Sm. [or Salix candida Fluggé ex Willd. x S. petiolaris Sm.]
      • genus: Salix
      • species and authorship don't map cleanly.
      • Result: Inconsistent, but OK, further validation work from scientificName (atomic fields will be misleading).
    • Princeps aegeus aegeus forma beatrix (Donovan, 1805) [quadranomial]
      • scientificName: Princeps aegeus aegeus forma beatrix (Donovan, 1805)
      • taxonRank: forma
      • genus: Princeps
      • specificEpithet: aegeus
      • infraspecificEpithet: beatrix
      • scientificNameAuthorship: (Donovan, 1805)
      • No place to carry the subspecific epithet aegeus.
      • Result: Inconsistent, but OK, further validation work from scientificName (atomic fields will be misleading).
    • Euphilotes enoptes langstoni trans. dammersi (Boisduval, 1852) [quadranomial]

1. Misspelling

  • check against a name service which does name reconciliation (Global Names Index: http://gni.globalnames.org/); this would be the first action before going on to comparing against a taxon-based vocabulary.
  • check against a controlled taxon-based vocabulary
  • check against a controlled vocabulary which includes oddly constructed names (e.g. M. maclurea), or odd abbreviations for infraspecific ranks (e.g., nothovars., f./form/forma/formas)
  • If no matches are found, fail over to the fuzzy match service at Global Names: http://resolver.globalnames.org/
if correct, continue; if curated, generate an annotation and continue; if can't curate, generate an annotation and exit
  • Cases:
    • Nassarius persica (Martens, 1874). Change of generic placement without change of suffix for gender agreement. Nassa persica Martens, 1874 now in Nassarius as Nassarius persicus (Martens, 1874)
    • Pachycondyla sinensis Emery misspelling of Pachycondyla chinensis Emery
      • Pachycondyla sinensis Emery: Error: Correction: Pachycondyla chinensis Emery, 1909
    • Philautus cruri Dutta, 1985, unavailable name, incorrect subsequent spelling of Philautus crnri Dutta, 1985. Name in current use: Indirana longicrus (Rao, 1937).

2. Author

  • Fill in missing author name, correct misspelled author name (if unique).
    • if author names are included in taxon name string they will need to be parsed and handled appropriately
use a scientific name string parser (http://gni.globalnames.org/parsers/new); authors can then be validated against controlled vocabularies (e.g., Harvard Index of Botanists: http://kiki.huh.harvard.edu/databases/botanist_index.html ; http://www.ipni.org/ipni/authorsearchpage.do) and author abbreviations (e.g., http://www.ipni.org/ipni/authorsearchpage.do or publication: R. K. Brummitt & C. E. Powell, ed. (1992). Authors of Plant Names: a List of Authors of Scientific Names of Plants, with Recommended Standard Forms of their Names, Including Abbreviations. Royal Botanic Gardens, Kew. ISBN 0-947643-44-3.)
    • Correct author name if incorrectly formed and if unambiguous. For botanical names, use standard form for author name, for zoological names, add year of publication.
  • Cases
    • Nassarius albus (Say, 1826)
      • Nassarius albus Say: Error: Author corrected to: Nassarius albus (Say, 1826)
      • Nassarius albus Say 1826: Error: Author corrected to: Nassarius albus (Say, 1826)
      • Nassarius albus Say, 1826: Error: Author corrected to: Nassarius albus (Say, 1826)
      • Nassarius albus T. Say, 1826: Error: Author corrected to: Nassarius albus (Say, 1826)
      • Nassarius albus (Say): Incomplete information: Authorship corrected to: Nassarius albus (Say, 1826)
    • Nassarius albus Nowell-Usticke 1959 subspecific epithet missing, error for Nassarius albus nanus Nowell-Usticke 1959
      • Nassarius albus Nowell-Usticke 1959: Error: Did you mean Nassarius albus nanus Nowell-Usticke 1959 or Nassarius albus (Say, 1826)? (Internally inconsistent data, SOLVE_WITH_MORE_DATA).
if correct, continue; if curated, generate an annotation and continue
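
The zoological cases above could be checked with a small comparison routine. The following is a minimal sketch under stated assumptions, not the validator's implementation: the function names and the status labels "ADDED", "CORRECTED" and "OK" are hypothetical, and the authoritative authorship string is assumed to have been looked up already.

```python
import re

def authorship_parts(authorship):
    """Split a zoological authorship string into (surname, year)."""
    core = authorship.strip().strip("()").strip()
    year_match = re.search(r"\b(\d{4})\b", core)
    year = year_match.group(1) if year_match else None
    # Drop the year, then leading initials such as "T.", then punctuation.
    name = re.sub(r"\b\d{4}\b", "", core)
    name = re.sub(r"\b[A-Z]\.\s*", "", name)
    return name.strip(" ,"), year

def check_zoological_authorship(given, authoritative):
    """Compare a supplied authorship with the authoritative form.

    Returns (status, authorship); the status labels other than
    SOLVE_WITH_MORE_DATA are hypothetical.
    """
    if not given.strip():
        return "ADDED", authoritative
    if given.strip() == authoritative.strip():
        return "OK", authoritative
    given_name, given_year = authorship_parts(given)
    auth_name, auth_year = authorship_parts(authoritative)
    same_author = given_name.lower() == auth_name.lower()
    compatible_year = given_year is None or given_year == auth_year
    if same_author and compatible_year:
        return "CORRECTED", authoritative
    # A different author or year may point at another name entirely.
    return "SOLVE_WITH_MORE_DATA", None
```

With "(Say, 1826)" as the authoritative form, the inputs "Say", "Say 1826", "Say, 1826", "T. Say, 1826" and "(Say)" are all reported as corrected, while "Nowell-Usticke 1959" is left for resolution with more data.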

3. Homonyms (a genus name governed under one code of nomenclature is identical to another: in the same code, where one name is thus an error, or in two different codes, where the issue is simply one of disambiguation).

if correct, continue; if curated, generate an annotation and continue
  • Cases:
    • In the same genus (See also List from Catalog of Fishes on Taxacom)
      • Senior: Littorina patula Thorpe, 1844 Junior: Littorina patula Gould, 1849 [Primary Homonym]
        • Littorina patula: Solve with more data. Homonym exists. Can't tell which name this is.
        • Littorina patula Thorpe: OK.
        • Littorina patula Gould: Error: Junior Primary Homonym, replacement name is Littorina keenae Rosewater, 1978
      • Senior: Murex exiguus Broderip, 1833 Junior: Murex exiguus Kiener, 1842 [Primary Homonym]
        • Murex exiguus: Solve with more data. Homonym exists. Can't tell which name this is.
        • Murex exiguus Broderip: OK.
        • Murex exiguus Kiener: Error: Junior Primary Homonym, replacement name is Murex kieneri Reeve, 1845 (original combination), accepted name is Murexsul kieneri (Reeve, 1845).
      • Senior: Nassarius persicus (Martens, 1874) Junior: Nassarius persicus Cox, 1936 [Secondary Homonym]
        • Nassarius persicus: Solve with more data. Homonym exists. Can't tell which name this is.
        • Nassarius persicus (Martens, 1874): OK
        • Nassarius persicus Cox, 1936: Error: Junior Secondary Homonym, accepted name is Nassarius delicatus (A. Adams, 1852)
      • Senior: Glycymeris conradi (Whitfield, 1885) Junior: Glycymeris conradi Dall, 1909 [Secondary Homonym]
      • (Conserved Name): Buprestis arcuata Laporte & Gory, 1838 (Suppressed Name): Buprestis arcuata Say, 1825 See: http://www.biodiversitylibrary.org/part/43080#/summary
      • Earlier: Astragalus rhizanthus Royle (1835) Later: Astragalus rhizanthus Boiss. (1843)
    • Philautus montanus Rao, 1937 Junior objective synonym by neotypification of Philautus flaviventris (Boulenger, 1882) and Junior primary homonym of Philautus montanus Taylor, 1920. See Bossuyt and Dubois, 2001, Zeylanica 1:1-112 p.51. Given its nature as both an objective synonym of P. flaviventris and a recombination into Icalus montanus, it would be quite possible that specimens with this name would require examination to determine the correct current identification, except that Bossuyt and Dubois deliberately proposed the neotype as the same specimen as is the lectotype of Philautus flaviventris (Boulenger, 1882); thus the name in current use is Philautus flaviventris (Boulenger, 1882).
    • Icalus montanus (Rao, 1937) Invalidly proposed junior secondary homonym of Icalus montanus Gunther, 1876, invalidly proposed as basionym Philautus montanus Rao 1937 is a junior primary homonym. See Bossuyt and Dubois, 2001, Zeylanica 1:1-112 p.51
    • Microdon rugosus var. fuscus Bezzi, 1921 junior primary homonym of Microdon fuscus Meijere, 1908.  ?No replacement name, homonymy recognized in synonymy? Name in current use is Metadon rugosus (Bezzi, 1915).
    • Metadon fuscus Bezzi, 1921 ?invalid combination introduced in synonymy in Reemer and Stahls, 2013? See: doi: 10.3897/zookeys.288.4095 p.134
    • Crotaphytus fasciatus Mocquard, 1899 junior primary homonym of Crotaphytus fasciatus Hallowell, 1853. Replacement name Crotaphytus fasciolatus Mocquard, 1903, name in current use is Crotaphytus vestigium Smith and Tanner, 1972.

    • In the same genus, by the same author
      • Pachycondyla chinensis (Emery, 1909), Ponera solitaria F. Smith, 1874 non Ponera solitaria F. Smith, 1860. See: http://eol.org/pages/485763/overview and page 247 in McKay and McKay 2010
        • Ponera solitaria Solve with more data. Homonym exists. Can't tell which name this is.
        • Ponera solitaria Smith Solve with more data. Homonym exists. Can't tell which name this is.
        • Ponera solitaria F. Smith Solve with more data. Homonym exists. Can't tell which name this is.
        • Ponera solitaria F. Smith, 1874: Error: Junior Primary Homonym. Replacement name is Brachyponera chinensis Brown, 1958. Accepted name is Pachycondyla chinensis (Emery, 1909).
      • Mantispilla formosana var. major Stitz, 1913 and Mantispilla punctata var. major Stitz, 1913 see: Taxacom
      • Pedicularis inconspicua P.C. Tsoong, 1955 (January) in Acta Phytotax. Sin. (Senior primary Homonym) and Pedicularis inconspicua P.C. Tsoong, 1955 (November) in Bull. Brit.Mus.(Nat.Hist.) (Junior primary Homonym) and Pedicularis inconspicua Vved., 1955 (June) (Junior primary Homonym). See: Taxacom
      • Puncturella hendersoni Dall, 1927 published in Proc. US National Museum 70(2667): 111 and Puncturella hendersoni Dall, 1927: Proc. US National Museum 70(2668): 9. See: Taxacom
      • Leucophlebia lineata formosana Clark, 1936, Proc. New Engl. Zool. Club 15:76 and Leucophlebia lineata formosana Clark, 1936, Proc. New Engl. Zool. Club 15:86
    • In the same genus, by the same author, in the same publication:
      • Tanypus nebulosus Meigen, 1804: 21 and Tanypus nebulosus Meigen, 1804: 23 (Meigen, 1804) See: Taxacom See: p. 14 in Spies and Saether, 2004
        • Tanypus nebulosus Solve with more data. Homonym exists. Can't tell which name this is.
        • Tanypus nebulosus Meigen Solve with more data. Homonym exists. Can't tell which name this is.
        • Tanypus nebulosus Meigen, 1804 Solve with more data. Homonym exists. Can't tell which name this is.
        • Tanypus nebulosus Meigen, 1804 p.23 Error: Junior primary homonym. Name in current use is Natarsia punctata (Fabricius).
        • Tanypus nebulosus Meigen, 1804 p.21 OK. (Senior primary homonym). Name in current use is Macropelopia nebulosa (Meigen).

      • Zonites verticillus var. graeca Kobelt, 1876 p.48 and Zonites albanicus var. graeca Kobelt, 1876 p.48 (same author, same year, same publication, same page).
      • Clausilia oertzeni Boettger 1889 p.42 and Clausilia schuchi var. oertzeni Boettger 1889 p.52
      • Noctua marginata Fabricius 1775 p.597 and Noctua marginata Fabricius 1775 p.610
      • Phaenopria angulifera Ashmead, 1895 p.810 and Phaenopria angulifera Ashmead, 1895 p.810 (same author, same year, same publication, same page).


    • In different genera under the same code:
      • Earlier: Scenedesmus armatus f. brevicaudatus L.S. Peterfi [1963] Later: Scenedesmus armatus var. brevicaudatus (Hortob.) Pankow [1981] (Note diff. ranks see ICNAFP Art. 53.4)
      • Paramixogaster vespiformis (de Meijere 1908) junior secondary homonym of Paramixogaster vespiformis (Brunetti 1913), replacement name Paramixogaster brunetti Reemer & Stahls, 2013. See: doi: 10.3897/zookeys.288.4095 p.59.


    • Generic Homonym affecting a species name
      • Toxeuma belone Chun, 1906 [Cephalopod] is invalid. Genus Toxeuma Chun, 1906 [Cephalopod] is a junior homonym of Toxeuma Walker, 1833 [Insect].
    • Hemihomonyms (Same name, but in different codes. Valid, but potential source of confusion). See list of generic hemihomonyms in Shipunov, 2011. See: Wikispecies List of species level hemihomonyms.
      • Agathis montana Shestakov, 1932 and Agathis montana de Laubenfels
      • Asterina gibbosa (Pennant, 1777) and Asterina gibbosa Gaillard
      • Baileya australis (Grote, 1881) and Baileya australis Rydb.
      • Centropogon australis (White, 1790) and Centropogon australis Gleason
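
The reporting pattern used in the Littorina patula case (no author: solve with more data; senior author: OK; junior author: error with replacement name) can be sketched as follows. This is only an illustration: `HOMONYMS` is a hypothetical in-memory table standing in for a lookup against a nomenclator, and the status labels other than SOLVE_WITH_MORE_DATA are invented for the example.

```python
# Hypothetical in-memory table; a real validator would query a nomenclator.
HOMONYMS = {
    "Littorina patula": [
        {"author": "Thorpe", "year": "1844", "status": "senior",
         "replacement": None},
        {"author": "Gould", "year": "1849", "status": "junior",
         "replacement": "Littorina keenae Rosewater, 1978"},
    ],
}

def check_homonym(canonical_name, authorship=""):
    """Return (status, detail) for a name that may be a homonym."""
    candidates = HOMONYMS.get(canonical_name, [])
    if not candidates:
        return "NO_HOMONYM_KNOWN", None
    text = authorship.lower()
    matches = [c for c in candidates if c["author"].lower() in text]
    # When several homonyms share an author, a year may still separate them.
    if len(matches) > 1:
        matches = [c for c in matches if c["year"] in text]
    if len(matches) != 1:
        # Ambiguous: report the potential homonyms as provenance.
        options = [c["author"] + ", " + c["year"] for c in candidates]
        return "SOLVE_WITH_MORE_DATA", options
    match = matches[0]
    if match["status"] == "junior":
        return "JUNIOR_HOMONYM", match["replacement"]
    return "OK", None
```

Names that share author and year (as in the Tanypus nebulosus case) remain ambiguous under this sketch unless the table and the input also carry page-level detail.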

4. Synonymy (the genus-species combination given is not the currently accepted name [JAM: This is not necessarily about QC but is a valuable service in providing additional information to the user about updates to the taxonomy of the records they provided; could be seen as an automated new determination from a given source])

  • search against controlled vocabularies
  • Cases:
    • Philautus crnri Dutta, 1985 unavailable name, invalidly proposed replacement name. Name in current use: Indirana longicrus (Rao, 1937).
if correct, continue; if curated, generate an annotation and continue


5. Genus/species match

  • Probably means gender match, thus a component of (1) above - check for alternative gender endings of specific epithets.
  • If so, same case as (1) above:
    • Nassarius persica (Martens, 1874). Change of generic placement without change of suffix for gender agreement. Nassa persica Martens, 1874 now in Nassarius as Nassarius persicus (Martens, 1874)
  • use controlled vocabularies and services above to verify a known match for genus and species (note that you may need or want to cross reference against several of these to ensure better coverage of combinations)
if correct, continue; if curated, generate an annotation and continue; if can't curate, generate an annotation and exit
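
Checking alternative gender endings could start from a routine like the one below. It is a rough sketch that assumes only the two commonest Latin adjectival ending groups (-us/-a/-um and -is/-e); the function names are hypothetical, and each generated variant would still need to be confirmed against a name service before being proposed as a correction.

```python
# Assumption: only two common adjectival ending groups are handled.
ENDING_GROUPS = [("us", "a", "um"), ("is", "e")]

def gender_variants(epithet):
    """Return alternative gender forms of a specific epithet, sorted."""
    variants = set()
    for group in ENDING_GROUPS:
        for ending in group:
            if epithet.endswith(ending):
                stem = epithet[: -len(ending)]
                variants.update(stem + other for other in group)
    variants.discard(epithet)
    return sorted(variants)

def candidate_names(genus, epithet):
    """Build genus + epithet combinations to look up in a name service."""
    return [genus + " " + variant for variant in gender_variants(epithet)]
```

For the input Nassarius persica this produces the candidates Nassarius persicum and Nassarius persicus; only the latter would be found in an authority. Genitive epithets such as conradi yield no variants.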


GUID from nomenclatural authority [position in flow unclear]

  • Lookup the GUID provided by a nomenclatural authority for the nomenclatural act that formed the name (e.g., International Plant Name Index (IPNI), ZooBank)
  • available term for the GUID for the nomenclatural authority: dwc:scientificNameID
  • available term for the GUID for a provided name in current use if added in (4) dwc:acceptedNameUsageID

Presentation Jul 26

July26presentation.pdf

Flowchart

Flow chart design of Advanced SciNameValidator


Description of each step

Consistency check

The idea is that a scientific name can be constructed from dwc fields in the record other than dwc:scientificName. If a record has this redundancy, we need to check whether the two copies of the name are consistent.

First, we parse the dwc:scientificName with the GBIF name parser; the possible components are Genus, SpecificEpithet, Subgenus, TaxonRank, VerbatimTaxonRank, InfraspecificEpithet, etc. Then we compare those components to the corresponding dwc fields in the record and generate a consensus value for each with the following rules:

  • If one of the copies is null, use the other copy as the consensus value (if both are empty, the consensus value is empty).
  • If neither copy is null, use a simple string comparison:
    • if they are the same, return that value as the consensus;
    • if they are different, return null and generate the annotation "UNABLE_TO_CURATE".
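
The consensus rule can be written down directly. This is a minimal sketch: the function name is hypothetical, and treating empty or whitespace-only strings as null is an assumption.

```python
def consensus(parsed_value, atomic_value):
    """Return (consensus value, annotation) for two copies of one component.

    Assumption: empty or whitespace-only strings count as null.
    """
    a = (parsed_value or "").strip() or None
    b = (atomic_value or "").strip() or None
    if a is None:
        return b, None          # also covers the case where both are null
    if b is None:
        return a, None
    if a == b:
        return a, None
    return None, "UNABLE_TO_CURATE"
```

For the contradictory example given earlier, `consensus("Quercus", "Solanum")` yields no value and the annotation UNABLE_TO_CURATE.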

After we have a list of consensus fields, construct the scientific name with the following rule:

trim (
 trim (
  genus |
  if (subgenus is not null) then ' (' | subgenus | ') ' else ' ' endif |
  specificEpithet | ' ' |
  if (verbatimTaxonRank is not null) then verbatimTaxonRank | ' '
     elseif (taxonRank is not null) then taxonRank | ' '
     endif |
  infraspecificEpithet
 ) | ' ' | scientificNameAuthorship
)
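
The same rule, expressed as runnable code, might look like the sketch below. The function name and keyword parameters are hypothetical; joining only the non-empty parts with single spaces stands in for the nested trim calls.

```python
def construct_scientific_name(genus, specific_epithet, subgenus=None,
                              taxon_rank=None, verbatim_taxon_rank=None,
                              infraspecific_epithet=None, authorship=None):
    """Assemble a scientific name from consensus values of atomic fields."""
    parts = [genus]
    if subgenus:
        parts.append("(" + subgenus + ")")
    parts.append(specific_epithet)
    # verbatimTaxonRank takes precedence over taxonRank, as in the rule.
    rank = verbatim_taxon_rank or taxon_rank
    if rank:
        parts.append(rank)
    parts.append(infraspecific_epithet)
    parts.append(authorship)
    # Skipping empty parts and joining with one space replaces the trims.
    return " ".join(p.strip() for p in parts if p and p.strip())
```

Like the rule it mirrors, this sketch emits the rank marker whenever one is present, even if there is no infraspecific epithet, so a record with taxonRank=species would need extra handling.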

It's possible that the inconsistency is caused by a misspelling in one of the copies, in which case a misspelling check before the comparison would be useful; see "Documentation and notes" for more details (not currently implemented).

Misspelling check

We adopted the Global Names Resolver API (http://resolver.globalnames.org/) to handle misspellings. We use three of the returned result fields: name string, confidence score, and match type.

Field         Meaning
Name string   the resolved name string
Confidence    a score from 0 to 1 indicating how confident the correction is
Match type    one of two major types: exactly matched or fuzzy matched

We query the Global Names Resolver API with the consensus scientific name. If the first result returned has the match type "exactly matched", the name is not misspelled. If the type is "fuzzy matched", the name has a spelling issue: if the score of the match is high (>0.5), we return the corrected name with the comment "misspelling corrected"; if the score is low, we return null and generate the annotation "UNABLE_TO_CURATE".
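
The decision applied to the first resolver result can be sketched as below. The dictionary keys `name_string`, `score` and `match_type`, and the values "exact" and "fuzzy", are hypothetical stand-ins for the three fields described above, not the resolver's actual JSON field names; the handling of a missing result is also an assumption.

```python
SCORE_THRESHOLD = 0.5  # "high" score cut-off stated in the text

def interpret_resolver_result(result):
    """Return (name, comment) for the first result from the resolver.

    `result` is a dict with the hypothetical keys 'name_string',
    'score' and 'match_type' ('exact' or 'fuzzy'), or None.
    """
    if result is None:
        # Assumption: no result at all is treated like a low-score match.
        return None, "UNABLE_TO_CURATE"
    if result["match_type"] == "exact":
        return result["name_string"], None
    if result["match_type"] == "fuzzy" and result["score"] > SCORE_THRESHOLD:
        return result["name_string"], "misspelling corrected"
    return None, "UNABLE_TO_CURATE"
```

A fuzzy match of "Pachycondyla sinensis" to "Pachycondyla chinensis" with a score above the threshold would therefore be returned as a corrected name, while the same match with a score of 0.3 would be annotated UNABLE_TO_CURATE.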

GBIF checklist bank query

Currently, the checklist bank name usage API (http://dev.gbif.org/wiki/display/POR/Webservice+API#WebserviceAPI-ChecklistBankServices:Nameusage) only supports queries by canonical name (without authorship); we compare the authorship with the returned results later.

The returned result is a list of name usages. A name usage is a JSON object containing comprehensive information about a scientific name from one particular data set (checklist). If no result is returned, we query the scientific name against GNI. If the returned result has exactly one name usage for each checklist (i.e. each name usage has a different data set key), it is a good result and we continue. If there are two or more name usages with the same data set key, it is a homonym case, which we deal with separately.
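
Grouping the returned name usages by data set key is enough to separate the three outcomes. The sketch below works on already-parsed JSON and assumes each name usage carries its data set key under a `datasetKey` entry; that key name, the function name and the outcome labels are assumptions made for illustration.

```python
from collections import defaultdict

def classify_usages(name_usages):
    """Return (outcome, homonym_groups) for a list of name usage dicts.

    Assumption: each usage holds its data set key under 'datasetKey'.
    """
    if not name_usages:
        return "QUERY_GNI", {}
    by_dataset = defaultdict(list)
    for usage in name_usages:
        by_dataset[usage["datasetKey"]].append(usage)
    # Two or more usages of the same name in one checklist: homonyms.
    homonyms = {key: group for key, group in by_dataset.items()
                if len(group) > 1}
    if homonyms:
        return "HOMONYM", homonyms
    return "OK", {}
```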

Homonyms check

There are a couple of ways to distinguish homonyms; one is to use the higher taxon. If there are two name usages in the results with the same name (in the same checklist), we can use the higher taxon in the original record to see whether it distinguishes the two name usages. For example, if one name usage has a kingdom field with the value "Animalia" and the other has the value "Plantae", and the original record has a kingdom value, we can use that value to distinguish one from the other.
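
A kingdom-based filter could look like this. It is a minimal sketch: the function name and the `kingdom` key on each name usage are assumptions, and the comparison is case-insensitive so that "plantae" and "Plantae" are treated alike.

```python
def disambiguate_by_kingdom(usages, record_kingdom):
    """Return the single usage whose kingdom matches the record, else None."""
    if not record_kingdom:
        return None  # nothing in the record to disambiguate with
    wanted = record_kingdom.strip().lower()
    matches = [u for u in usages
               if (u.get("kingdom") or "").strip().lower() == wanted]
    # Only an unambiguous match resolves the homonymy.
    return matches[0] if len(matches) == 1 else None
```

When this returns None, the name would be flagged SOLVE_WITH_MORE_DATA and the candidate usages kept as provenance.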

other ways to be continued

Query against GNI service

Use the existing GNI service to see whether we can find a match in GNI.

Resolve synonyms

In a name usage returned by GBIF checklist bank, there is a boolean field called "Synonyms". If the value is "no", the name is not a synonym: it is accepted, and we use the value with the label "Name" as the resulting scientific name. If the value is "yes", the input scientific name is a synonym of another name: it is no longer in use, there is an accepted name that replaces it, and we use the name with the label "AcceptedName" instead.
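
As a sketch, with the three labels used as dictionary keys and the yes/no value already converted to a Python boolean (both of which are assumptions about how the response has been parsed):

```python
def resolve_synonym(name_usage):
    """Return the name to use for a parsed name usage.

    Assumption: the usage is a dict keyed by the labels 'Synonyms'
    (bool), 'Name' and 'AcceptedName'.
    """
    if name_usage.get("Synonyms"):
        return name_usage["AcceptedName"]
    return name_usage["Name"]
```

Whether this substitution is appropriate depends on the purpose of the workflow, as discussed under Detailed Example Scenarios.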

Pick one result from the results

GBIF checklist bank returns a list of name usages; some are from authoritative nomenclators (e.g. IPNI, ZooBank) and some are not. We pick the one from the more authoritative nomenclator; to support this, we are constructing a list of the checklists with an authoritativeness score for each.
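
Once such a list exists, the selection itself is a one-line maximum. In the sketch below the scores, the `checklist` key and the function name are all hypothetical placeholders, since the actual list is still being constructed.

```python
# Hypothetical scores; the real list of checklists is still being built.
AUTHORITY_SCORES = {"IPNI": 1.0, "ZooBank": 1.0}

def pick_most_authoritative(usages, scores=AUTHORITY_SCORES, default=0.0):
    """Return the usage from the highest-scoring checklist, or None.

    Assumption: each usage names its checklist under the key 'checklist'.
    Ties are resolved in favour of the usage that appears first.
    """
    if not usages:
        return None
    return max(usages, key=lambda u: scores.get(u["checklist"], default))
```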

Documentation and notes

Inconsistency and misspelling

Inconsistency within a record means that the dwc:scientificName differs from the name constructed from the atomic parts (e.g. genus, author, etc.). We handle the different cases as follows:

If the two names are consistent, continue with the misspelling check.
If they are not consistent, check the spelling of both names, and
    if one name is an exact match and the other is fuzzy matched with a high score, and
        if the corrected name string is the same as the exactly matched name, take that name and continue.
        otherwise, return UNABLE_TO_CURATE and stop
    if both names are fuzzy matched with a high score, and
        if both corrected name strings are the same, take the corrected name and continue.
        otherwise, return UNABLE_TO_CURATE and stop
    if both names are exact matches, return UNABLE_TO_CURATE and stop
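
These cases can be sketched as a single function over the two resolver results. The dictionary keys (`input`, `match_type`, `name_string`, `score`) are hypothetical, the 0.5 threshold is the one used in the misspelling check, and combinations not listed above (for example a low-score fuzzy match) are assumed to end in UNABLE_TO_CURATE.

```python
HIGH_SCORE = 0.5  # same threshold as the misspelling check

def reconcile(first, second):
    """Return (name, annotation) for two resolver results.

    Each argument is a dict with the hypothetical keys 'input',
    'match_type' ('exact' or 'fuzzy'), 'name_string' and 'score'.
    """
    def is_exact(r):
        return r["match_type"] == "exact"

    def is_good_fuzzy(r):
        return r["match_type"] == "fuzzy" and r["score"] > HIGH_SCORE

    if first["input"] == second["input"]:
        return first["input"], None  # consistent: go on to misspelling check
    if is_exact(first) and is_exact(second):
        return None, "UNABLE_TO_CURATE"
    for exact, fuzzy in ((first, second), (second, first)):
        if is_exact(exact) and is_good_fuzzy(fuzzy):
            if fuzzy["name_string"] == exact["input"]:
                return exact["input"], None
            return None, "UNABLE_TO_CURATE"
    if is_good_fuzzy(first) and is_good_fuzzy(second):
        if first["name_string"] == second["name_string"]:
            return first["name_string"], None
        return None, "UNABLE_TO_CURATE"
    # Assumption: any combination not listed in the rules cannot be curated.
    return None, "UNABLE_TO_CURATE"
```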

checklist bank and homonyms

There are different datasets in GBIF checklist bank; we can specify the dataset according to the information in the record and get a better result. If we can't decide which dataset to use, we will query all the data sets. We then need to deal with the problem that the same result is returned from many data sets, which makes it difficult to detect homonyms.

Detailed Example Scenarios

(1) For example, consider an occurrence of the text string "Pilophorus clavatus" in a biodiversity dataset. A call on a comprehensive taxon name service or a series of taxon name services will return both the insect Pilophorus clavatus (Linnaeus, 1767) urn:lsid:faunaeur.org:taxname:450933 and the lichen Pilophorus clavatus Th. Fr. urn:lsid:indexfungorum.org:names:476221. A workflow process that evaluates taxon name strings found in biodiversity data must be able to appropriately handle this and more subtle cases of homonyms. If other data elements indicate that the record in question is a lichen, then a wrapper for name validation services can return the lichen name and authorship; if not, then the wrapper must return an indication that the result it obtained was ambiguous, and may need human intervention to resolve.

In this particular case, the human intervention is probably nothing more than the question "did you mean the insect or the lichen?" This is a question that should be targeted at the person cleaning the data (who is aware of the context), rather than at an outside taxonomic expert. The data presented for cleaning may have included no authorship string (as in the example above), or an orthographic variant, or a slightly incorrect authorship string, each of which would suggest different handling: no authorship and no taxonomic context raises the question of which organism is meant, whereas an orthographic variant or a slightly incorrect (or non-standard in botany) authorship string could for most purposes simply be corrected (with provenance tracking of how the correction was made).

Workflow code that is invoking a taxon name service must be able to handle (and appropriately annotate the data elements in the workflow pipeline) service responses describing cases of no matching results, a single match, a set of matches all describing the same nomenclatural act (e.g., more than one data source referencing Pilophorus clavatus (Linnaeus, 1767) with different GUIDs for their data records), and a set of matches describing more than one nomenclatural act, such as a response of a match on Pilophorus clavatus (Linnaeus, 1767) and on Pilophorus clavatus Th. Fr.
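
The four kinds of response could be told apart as in the sketch below. It assumes, as a simplification, that matches with an identical name-plus-authorship string describe the same nomenclatural act (which does not hold for homonyms published by the same author in the same year); the dictionary keys, the function name and the category labels are hypothetical.

```python
def classify_matches(matches):
    """Classify a service response given as a list of match dicts.

    Each match is assumed to have a 'name' (with authorship) and a 'guid'.
    Simplification: identical 'name' strings are taken to be one act.
    """
    if not matches:
        return "NO_MATCH"
    if len(matches) == 1:
        return "SINGLE_MATCH"
    acts = {m["name"] for m in matches}
    if len(acts) == 1:
        return "SAME_NOMENCLATURAL_ACT"
    return "MULTIPLE_NOMENCLATURAL_ACTS"
```

Only the last category needs the homonym handling described above; the others can be annotated and passed along the pipeline.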

(2) Now, consider a slightly more subtle case, a natural science collection record asserting an identification of a collection object as Trivia affinis, as in the case of MV P 9565 (Museum Victoria, Collection Object: P 9565 Trivia affinis Littorinimorpha : Triviidae, France). Invocation of various taxonomic web services with the name string "Trivia affinis" will return Trivia affinis Marrat, 1867 urn:lsid:marinespecies.org:taxname:558373 currently known as Purpurcapsula corinneae (Shaw, 1909) urn:lsid:marinespecies.org:taxname:555483. This is because, among other reasons, Trivia affinis Marrat, 1867 is a secondary homonym of Trivia affinis Dujardin, 1847. Purpurcapsula corinneae is an extant Indo-Pacific Trivia, while Trivia affinis Dujardin is a Tertiary fossil European Trivia. Within the taxonomic context alone there is ambiguity; the question "Did you mean the Trivia, or the Trivia?" won't help. Somewhere else in the pipeline, the context (fossil from Europe, or extant Indo-Pacific) needs to be brought in to resolve the possibilities retrieved by the name service. Going one step further, consider a workflow that is examining transcribed label data from a collection of extant Indo-Pacific mollusks. The workflow, on encountering "Trivia affinis Msttsy", is quite justified in asserting a reasonable match to urn:lsid:marinespecies.org:taxname:558373 and replacing the text string with Trivia affinis Marrat, 1867, but if the workflow is simply checking the transcription, it is not justified in correcting this identification to the current name Purpurcapsula corinneae, as this would be making a false assertion about what the person who made the identification put on the label.

Conversely, if the purpose of the workflow is to prepare the data for scientific analysis of organism occurrences, the workflow would be entirely justified in replacing the name with the current identification Purpurcapsula corinneae (Shaw, 1909), allowing subsequent clustering of occurrence of this species. Context (both of the data elements and of the purpose of the workflow) is critical, and having workflow agents that are able to be reasonably configured to handle different contexts of workflow purpose, and to seek out data context is necessary to handle the complexities of the domain.