IDs and LOD Discussion
Revision as of 14:34, 18 October 2017
In 2016 and 2017, the ISTC decided that improving LOD capabilities of CETAF Stable Identifiers for collection objects should become a priority. This involves primarily
- activities for improving links from collection metadata to external resources and concepts and
- implementation of a working CETAF collection data index prototype as a basis for advanced inference mechanisms.
Ideas, discussions, and outcomes linked to these targets will be documented on this page. Please feel free to add your thoughts / comments / results below. More information about the CETAF identifier initiative is available on the main wikipage.
Improving links to external resources
Linking to people
[BGBM]: We started to discuss how to enrich our (RDF) metadata and concluded that we will start by looking more closely at collectors. Our first step will be to export collector names and collector IDs from our herbarium management system (JACQ) and sort them by frequency of use. We will then set up a spreadsheet with columns for ...
- collector name
- local collector ID BGBM
- link to example specimen(s)
- external resource: Wikidata
- external resource: HUH
- external resource: VIAF
- problem flag
... and ask a student assistant to search for collectors in Wikidata / HUH / VIAF and enter the (URI) identifiers.
By starting with the most frequent collectors we hope to achieve wide coverage with reasonable effort. It would be great if other herbaria could also start to contribute to this spreadsheet. In that case we would probably just have to add more fields for local collector IDs.
The same workflow could be implemented for geographic features.
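The spreadsheet preparation described above could be sketched roughly as follows. This is only a minimal illustration: the shape of the JACQ export (tuples of collector ID, collector name, and specimen URI) and the exact header strings are assumptions, not the actual export format.

```python
import csv
import io
from collections import Counter

# Header strings follow the column list above; the exact wording is an assumption.
COLUMNS = [
    "collector name",
    "local collector ID BGBM",
    "link to example specimen(s)",
    "external resource: Wikidata",
    "external resource: HUH",
    "external resource: VIAF",
    "problem flag",
]


def build_collector_sheet(specimens):
    """Return one row per collector, most frequently used collectors first.

    `specimens` is an iterable of (collector_id, collector_name, specimen_uri)
    tuples, a hypothetical shape for the export from JACQ.
    """
    counts = Counter()
    names = {}
    examples = {}
    for collector_id, name, specimen_uri in specimens:
        counts[collector_id] += 1
        names.setdefault(collector_id, name)
        # Keep the first specimen seen as the example link.
        examples.setdefault(collector_id, specimen_uri)

    rows = []
    for collector_id, _count in counts.most_common():
        row = dict.fromkeys(COLUMNS, "")  # external IDs are filled in by hand later
        row["collector name"] = names[collector_id]
        row["local collector ID BGBM"] = collector_id
        row["link to example specimen(s)"] = examples[collector_id]
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text that can be imported into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The external resource columns are deliberately left empty, since the plan is to have them filled in manually.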
Linking to institutions
The stable identifiers include a domain name that belongs to the institution responsible for the specimens, so the first identifier workshops expressed no need to link to the institution. However, this kind of "implicit link" cannot be used by machines on the Web of Data, so an explicit link could be worth adding.
The current version of CSPP links a specimen only to a web page, which may be an institutional website; adding an http://rs.tdwg.org/dwc/terms/#institutionID attribute could be useful.
For the values of this attribute we have two candidates:
1) http://grbio.org/
- http://biocol.org/urn:lsid:biocol.org:col:34988 for MNHN Paris
- http://biocol.org/urn:lsid:biocol.org:col:15605 for Botanic Garden Meise
Pro: community managed; related to CETAF and TDWG
Con: the URIs do not resolve and there seems to be no data attached
2) https://www.wikidata.org/
- https://www.wikidata.org/wiki/Q838691 for MNHN Paris
- https://www.wikidata.org/wiki/Q3052500 for Botanic Garden Meise
Pro: a lot of related information; the URIs resolve (HTML only?)
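As a rough sketch of what an explicit institution link could look like, the snippet below builds an N-Triples statement that attaches an institution URI to a specimen URI. It assumes the Darwin Core term IRI http://rs.tdwg.org/dwc/terms/institutionID (without the "#" of the documentation anchor) and the Wikidata entity URI form http://www.wikidata.org/entity/Q…; the helper names and the specimen URI in the usage are hypothetical. Regarding "HTML only?": to our knowledge Wikidata also serves machine-readable data through Special:EntityData, which the second helper addresses.

```python
# Darwin Core term IRI (assumed form without the "#" documentation anchor).
DWC_INSTITUTION_ID = "http://rs.tdwg.org/dwc/terms/institutionID"


def wikidata_entity_uri(qid):
    """Concept URI of a Wikidata item, e.g. for Q838691 (MNHN Paris)."""
    return f"http://www.wikidata.org/entity/{qid}"


def wikidata_data_url(qid, fmt="ttl"):
    """URL of the machine-readable export of an item (ttl, json, rdf, ...)."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.{fmt}"


def institution_triple(specimen_uri, institution_uri):
    """One N-Triples line linking a specimen to its institution."""
    return f"<{specimen_uri}> <{DWC_INSTITUTION_ID}> <{institution_uri}> ."
```

For example, `institution_triple("http://example.org/specimen/123", wikidata_entity_uri("Q838691"))` would yield a single statement for a (made-up) specimen held at MNHN Paris.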
CETAF collection data index
[BGBM]: As a first step, we created a list of CETAF identifiers found in GBIF.
As of October 17th, 2017, the 13 institutions listed on http://cetaf.org/cetaf-stable-identifiers shared
- 33,177,510 occurrences with GBIF, of which
- 30,679,787 used a GUID (http://rs.tdwg.org/dwc/terms/occurrenceID), of which
- 22,040,872 are HTTP URIs starting with http://,
- 21,812,600 URIs conform with the base URLs listed on http://cetaf.org/cetaf-stable-identifiers.
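The counting steps above can be pictured as a simple funnel over occurrenceID values. The sketch below is not the script used to produce these numbers; the function name and the input shape (a plain iterable of occurrenceID strings, with empty values for records without a GUID) are assumptions for illustration.

```python
def identifier_funnel(occurrence_ids, base_urls):
    """Count how many occurrence IDs pass each successive check.

    `occurrence_ids`: iterable of occurrenceID values (None or "" if missing).
    `base_urls`: base URLs of the CETAF stable identifiers to check against.
    """
    prefixes = tuple(base_urls)
    counts = {"occurrences": 0, "with_guid": 0, "http_uris": 0, "conforming": 0}
    for occurrence_id in occurrence_ids:
        counts["occurrences"] += 1
        if not occurrence_id:
            continue  # record shared without a GUID
        counts["with_guid"] += 1
        if not occurrence_id.startswith("http://"):
            continue  # GUID present but not an HTTP URI
        counts["http_uris"] += 1
        if occurrence_id.startswith(prefixes):
            counts["conforming"] += 1  # matches one of the listed base URLs
    return counts
```

Each count is a subset of the previous one, mirroring the "of which" structure of the list.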
The situation at the BR herbarium, Meise
We have linked our top 900 collectors to the HUH. This was done manually to ensure that the biographical details in our database matched those in HUH. In the process we identified about 230 collectors that were not at HUH. We have since given details of these collectors to HUH so that they can improve their data and we can complete the links for these additional collectors. We are currently digitising a very large number of specimens (>1,000,000), so the number of collectors will increase and their frequencies will change. Therefore, we will conduct more linking once these data are available.
Our new specimen portal [1] has stable identifiers and a machine-readable RDF version of each specimen. This RDF contains the link to the HUH database.
...
<rdf:Description rdf:about="Glaziou A.">
  <owl:sameAs rdf:resource="http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/0832e613-7879-4f72-89f9-78e55c6ac1a9"/>
  <dwc:recordedBy>Glaziou A.</dwc:recordedBy>
</rdf:Description>
...
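A consumer could extract such collector links with a few lines of code. The sketch below wraps the fragment above in an rdf:RDF element with the usual rdf, owl, and dwc namespace declarations (the wrapper and the function name are assumptions, since the portal's full output is elided here) and uses only the Python standard library rather than a full RDF parser.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
}

# The fragment from the text, wrapped in an assumed rdf:RDF root element.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <rdf:Description rdf:about="Glaziou A.">
    <owl:sameAs rdf:resource="http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/0832e613-7879-4f72-89f9-78e55c6ac1a9"/>
    <dwc:recordedBy>Glaziou A.</dwc:recordedBy>
  </rdf:Description>
</rdf:RDF>"""


def collector_links(rdf_xml):
    """Map each dwc:recordedBy name to the owl:sameAs URIs in its description."""
    root = ET.fromstring(rdf_xml)
    resource_attr = f"{{{NS['rdf']}}}resource"
    links = {}
    for description in root.findall("rdf:Description", NS):
        name = description.findtext("dwc:recordedBy", default=None, namespaces=NS)
        if name is None:
            continue  # not a collector description
        for same_as in description.findall("owl:sameAs", NS):
            uri = same_as.get(resource_attr)
            if uri:
                links.setdefault(name, []).append(uri)
    return links
```

Calling `collector_links(SAMPLE)` returns a dictionary with the collector name as key and a list holding the HUH identifier as value.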
[Anton Güntsch (BGBM)]: Once we have completed our top (say) 500 collectors we would be very interested in organising a shared list of collectors with links to external resources. For example, our list will have the HUH ID and also IDs for VIAF and Wikidata. Meise could then easily retrieve VIAF and Wikidata IDs using the HUH IDs.
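The lookup via HUH IDs suggested here amounts to a simple join of two lists. A minimal sketch, assuming both lists are available as dictionaries with the hypothetical keys "huh", "viaf", and "wikidata":

```python
def merge_by_huh_id(local_rows, shared_rows):
    """Add VIAF and Wikidata IDs from a shared list to local collector rows.

    Rows are dictionaries; the key names "huh", "viaf" and "wikidata" are
    assumptions made for this sketch, not an agreed format.
    """
    # Index the shared list by HUH ID for constant-time lookups.
    shared_by_huh = {row["huh"]: row for row in shared_rows if row.get("huh")}
    merged = []
    for row in local_rows:
        match = shared_by_huh.get(row.get("huh"), {})
        merged.append({
            **row,
            "viaf": match.get("viaf"),          # None if no match was found
            "wikidata": match.get("wikidata"),  # None if no match was found
        })
    return merged
```

Collectors without a match keep their local data and simply get empty VIAF and Wikidata fields, so they can be flagged for manual follow-up.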