Difference between revisions of "User:Andreas Plank/Import issues with CETAF identifiers"

From CETAF Identifiers Wiki
m (id.zfmk.de (ZFMK): +stable URL https://id.zfmk.de/collection_ZFMK/page/CollectionSpecimen/1650 does not return any RDF)
m (purl.org/nhmuio (NHMUO): +No or mixed up RDF description of CETAF-ID)
Line 17: Line 17:
 
</blockquote>
 
</blockquote>
  
== No or mixed up RDF description ==
+
== No or mixed up RDF description of {{abbr|CETAF-ID}} ==
  
 
See perhaps the [[CETAF Specimen Preview Profile (CSPP)#example_CSPP-compliant_RDF|example of CETAF Specimen Preview Profile (CSPP)]] in general.
 
See perhaps the [[CETAF Specimen Preview Profile (CSPP)#example_CSPP-compliant_RDF|example of CETAF Specimen Preview Profile (CSPP)]] in general.
Line 23: Line 23:
 
=== id.luomus.fi ({{abbr|LUOMUS}}) ===
 
=== id.luomus.fi ({{abbr|LUOMUS}}) ===
  
({{Tobedone}}) The requested RDF does not describe the requested {{abbr|CETAF-ID}} <code><nowiki>http://id.luomus.fi/GL.749</nowiki></code> itself, the ID “hangs somewhat in the air” (from a description point of view):
+
({{Tobedone}}) The requested RDF does not describe the requested {{abbr|CETAF-ID}} <code><nowiki>http://id.luomus.fi/GL.749</nowiki></code> itself, the ID “hangs somewhat in the air” (from a descriptive point of view):
 
<blockquote>
 
<blockquote>
 
# http://id.luomus.fi/GL.749 gets redirected to http://id.luomus.fi/GL.749?format=RDFXML and
 
# http://id.luomus.fi/GL.749 gets redirected to http://id.luomus.fi/GL.749?format=RDFXML and
Line 33: Line 33:
 
=== id.zfmk.de ({{abbr|ZFMK}}) ===
 
=== id.zfmk.de ({{abbr|ZFMK}}) ===
  
({{Tobedone}}) The requested RDF does not describe the requested {{abbr|CETAF-ID}} <code><nowiki>http://id.zfmk.de/collection_ZFMK/1650/733377/90217</nowiki></code> itself, the ID “hangs somewhat in the air” (from a description point of view):
+
({{Tobedone}}) The requested RDF does not describe the requested {{abbr|CETAF-ID}} <code><nowiki>http://id.zfmk.de/collection_ZFMK/1650/733377/90217</nowiki></code> itself, the ID “hangs somewhat in the air” (from a descriptive point of view):
 
<blockquote>
 
<blockquote>
 
# http://id.zfmk.de/collection_ZFMK/1650/733377/90217 gets redirected to https://id.zfmk.de/collection_ZFMK/rdf/xml/CollectionSpecimen/1650/733377/90217/?shorturl=1 and
 
# http://id.zfmk.de/collection_ZFMK/1650/733377/90217 gets redirected to https://id.zfmk.de/collection_ZFMK/rdf/xml/CollectionSpecimen/1650/733377/90217/?shorturl=1 and
Line 41: Line 41:
 
--[[User:Andreas Plank|Andreas Plank]] ([[User talk:Andreas Plank|talk]]) 12:29, 20 February 2020 (CET)
 
--[[User:Andreas Plank|Andreas Plank]] ([[User talk:Andreas Plank|talk]]) 12:29, 20 February 2020 (CET)
 
</blockquote>
 
</blockquote>
 +
 +
=== purl.org/nhmuio ({{abbr|NHMUO}}) ===
 +
({{Tobedone}}) The requested RDF does not describe the requested {{abbr|CETAF-ID}} <code><nowiki>http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3</nowiki></code> itself, the ID “hangs somewhat in the air” (from a descriptive point of view):
 +
<blockquote>
 +
# http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 gets redirected to https://data.gbif.no/resolver/O:L:14 and
 +
# by analysing the RDF via Apache Jena’s <code>rdfparse</code> it reveals that it describes something other: <code><nowiki>http://purl.org/gbifnorway/id/O:L:14</nowiki></code>, but unrelated to the ID
 +
# <code><nowiki>http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3</nowiki></code> itself has no related description (<code>rdf:Description</code>) and “hangs somewhat in the air”
 +
--[[User:Andreas Plank|Andreas Plank]] ([[User talk:Andreas Plank|talk]]) 13:30, 20 February 2020 (CET)
 +
</blockquote>
 +
  
 
== No RDF but HTML ==
 
== No RDF but HTML ==
== col.smns-bw.org ({{abbr|SMNS}}) ==
+
=== col.smns-bw.org ({{abbr|SMNS}}) ===
  
 
({{Tobedone}}) Requested RDF is instead an HTML fragment but RDF.--[[User:Andreas Plank|Andreas Plank]] ([[User talk:Andreas Plank|talk]]) 14:38, 18 February 2020 (CET)
 
({{Tobedone}}) Requested RDF is instead an HTML fragment but RDF.--[[User:Andreas Plank|Andreas Plank]] ([[User talk:Andreas Plank|talk]]) 14:38, 18 February 2020 (CET)

Revision as of 13:31, 20 February 2020


Screenshot of the Firefox RESTED plugin (steps to retrieve an RDF data source)

Note: Unresolved or pending issues are listed first; issues that are done are moved to the end. To check for RDF in your browser you can either (1) use the CETAF Specimen URI Tester (http://herbal.rbge.info) or (2) use a browser plugin, e.g. RESTED Client, and add the header Accept: application/rdf+xml (see example aside).
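The same check can be made from a shell. The following is a minimal sketch, not an official CETAF tool: the helper name is_rdf_content_type is hypothetical, and it only inspects a Content-Type value that you have already obtained from the server's response.

```shell
# Sketch only: is_rdf_content_type is a hypothetical helper, not part of any CETAF tool.
# It takes a Content-Type header value and succeeds only for RDF/XML.
is_rdf_content_type() {
  # Lower-case the value, drop parameters such as "; charset=utf-8",
  # and remove any remaining whitespace before comparing.
  media_type=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | cut -d';' -f1 | tr -d '[:space:]')
  [ "$media_type" = "application/rdf+xml" ]
}
```

One possible way to obtain the value is curl -s -L -o /dev/null -H 'Accept: application/rdf+xml' -w '%{content_type}' followed by the CETAF-ID; the result can then be passed to the helper. Other RDF serialisations (e.g. Turtle) are deliberately not accepted in this sketch.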



data.nhm.ac.uk (NHM)

(Pending: minor issue, does not block) Requesting with “Content-Type: application/rdf+xml” results in 404 (not found) instead of returning RDF (see https://github.com/NaturalHistoryMuseum/ckanext-nhm/issues/458) --Andreas Plank (talk) 14:06, 18 February 2020 (CET)

  • Minor issue, not relevant: the header “Content-Type: application/rdf+xml” applies to the (returned) resource, not to the request --Andreas Plank (talk) 10:40, 20 February 2020 (CET)

No or mixed up RDF description of CETAF-ID

See perhaps the example of CETAF Specimen Preview Profile (CSPP) in general.

id.luomus.fi (LUOMUS)

(Pending) The requested RDF does not describe the requested CETAF-ID http://id.luomus.fi/GL.749 itself; the ID “hangs somewhat in the air” (from a descriptive point of view):

  1. http://id.luomus.fi/GL.749 gets redirected to http://id.luomus.fi/GL.749?format=RDFXML and
  2. analysing the RDF with Apache Jena’s rdfparse reveals the triple <http://id.luomus.fi/GL.749?format=RDFXML> <http://purl.org/dc/terms/subject> <http://id.luomus.fi/GL.749>, which merely relates the two URLs, but
  3. http://id.luomus.fi/GL.749 itself has no description (rdf:Description) of its own; instead there are two descriptions, http://tun.fi/MY.275076 and http://tun.fi/MY.881682, neither of which relates to http://id.luomus.fi/GL.749. So the CETAF-ID http://id.luomus.fi/GL.749 “hangs somewhat in the air” because it is not described.

--Andreas Plank (talk) 12:10, 20 February 2020 (CET)
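The “is the requested ID described at all?” check from the list above could be scripted roughly as follows. This is a sketch under assumptions: the helper name has_subject is hypothetical, and it expects the RDF to have been converted to N-Triples first (for example with one of Apache Jena’s command-line parsers; the exact invocation is not shown here).

```shell
# Sketch only: has_subject is a hypothetical helper.
# Reads N-Triples on standard input and succeeds if at least one triple
# has the given URI as its subject (first field of the line).
has_subject() {
  awk -v subject="<$1>" '$1 == subject { found = 1 } END { exit !found }'
}
```

For the LUOMUS case, the check would fail for http://id.luomus.fi/GL.749, because that URI only appears in the object position of the dc:subject triple and never as a subject.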

id.zfmk.de (ZFMK)

(Pending) The requested RDF does not describe the requested CETAF-ID http://id.zfmk.de/collection_ZFMK/1650/733377/90217 itself; the ID “hangs somewhat in the air” (from a descriptive point of view):

  1. http://id.zfmk.de/collection_ZFMK/1650/733377/90217 gets redirected to https://id.zfmk.de/collection_ZFMK/rdf/xml/CollectionSpecimen/1650/733377/90217/?shorturl=1 and
  2. analysing the RDF with Apache Jena’s rdfparse reveals that it describes a different resource, https://id.zfmk.de/collection_ZFMK/1650, which is not related to the requested ID
  3. http://id.zfmk.de/collection_ZFMK/1650/733377/90217 itself has no related description (rdf:Description) and “hangs somewhat in the air”
  4. the website states a stable URL, https://id.zfmk.de/collection_ZFMK/page/CollectionSpecimen/1650, but this very URL does not return any RDF

--Andreas Plank (talk) 12:29, 20 February 2020 (CET)

purl.org/nhmuio (NHMUO)

(Pending) The requested RDF does not describe the requested CETAF-ID http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 itself; the ID “hangs somewhat in the air” (from a descriptive point of view):

  1. http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 gets redirected to https://data.gbif.no/resolver/O:L:14 and
  2. analysing the RDF with Apache Jena’s rdfparse reveals that it describes a different resource, http://purl.org/gbifnorway/id/O:L:14, which is not related to the requested ID
  3. http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 itself has no related description (rdf:Description) and “hangs somewhat in the air”

--Andreas Plank (talk) 13:30, 20 February 2020 (CET)


No RDF but HTML

col.smns-bw.org (SMNS)

(Pending) The requested RDF is returned as an HTML fragment instead of RDF. --Andreas Plank (talk) 14:38, 18 February 2020 (CET)

For instance under Linux:

wget --header='Accept: application/rdf+xml'  --content-on-error --output-document="col.smns-bw.org⁄object⁄S10000227722006.rdf" "http://col.smns-bw.org/object/S10000227722006"
file col.smns-bw.org⁄object⁄S10000227722006.rdf
# col.smns-bw.org⁄object⁄S10000227722006.rdf: HTML document, ISO-8859 text, with very long lines, with CRLF line terminators

specimens.kew.org (RBGK)

(Pending) The requested RDF is returned as HTML instead of RDF. --Andreas Plank (talk) 14:32, 18 February 2020 (CET)

For instance under Linux:

wget --header='Accept: application/rdf+xml'  --content-on-error --output-document="specimens.kew.org⁄herbarium⁄K001116483.rdf" "http://specimens.kew.org/herbarium/K001116483"
file specimens.kew.org⁄herbarium⁄K001116483.rdf 
# specimens.kew.org⁄herbarium⁄K001116483.rdf: HTML document, ASCII text, with very long lines, with CRLF, LF line terminators
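As a complement to the file command used above, the downloaded payload can be inspected directly. The following is a rough sketch only: payload_kind is a hypothetical helper, it assumes the conventional rdf: prefix on the RDF/XML root element, and it looks only at the first 50 lines, so an HTML fragment without an html element or doctype is reported as unknown.

```shell
# Sketch only: payload_kind is a hypothetical helper.
# Prints "rdf", "html" or "unknown" based on the first 50 lines of a file.
payload_kind() {
  if head -n 50 "$1" | grep -q '<rdf:RDF'; then
    echo rdf
  elif head -n 50 "$1" | grep -Eqi '<html|<!doctype html'; then
    echo html
  else
    echo unknown
  fi
}
```

It would be called with the file written by wget, e.g. payload_kind followed by the name given to --output-document.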

Fixed Issues

herbarium.bgbm.org (BGBM)

(Done) Some RDF files contain invalid URI entries, i.e. there is a tab or space character inside the URI in owl:sameAs, which would break the whole data import. The error log of the triple store loader (tdbloader2) shows something like:

Bad URI: < http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/a86596ea-6f4d-4b97-bf6f-8d492c0fc8b2> Code: 0/ILLEGAL_CHARACTER in SCHEME: The character violates the grammar rules for URIs/IRIs. ERROR Bad character in IRI (space): <[space]...>

… see for instance line 63:

62 <rdf:Description rdf:about="http://www.wikidata.org/entity/Q6382619">
63                     <owl:sameAs rdf:resource="	http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/a86596ea-6f4d-4b97-bf6f-8d492c0fc8b2" />
64                 <owl:sameAs rdf:resource="http://viaf.org/viaf/233473288" />
65           </rdf:Description>
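Entries like the one in line 63 can be searched for before loading the data. This is a minimal sketch, assuming that each rdf:about or rdf:resource attribute sits on a single line; find_bad_uri_attributes is a hypothetical helper name and not part of tdbloader2 or any other Jena tool.

```shell
# Sketch only: find_bad_uri_attributes is a hypothetical helper.
# Prints "line-number:line" for every rdf:about / rdf:resource attribute whose
# value contains a space or tab; exits non-zero if nothing is found (grep semantics).
find_bad_uri_attributes() {
  grep -nE 'rdf:(about|resource)="[^"]*[[:space:]][^"]*"' "$1"
}
```

Running it over each downloaded RDF/XML file before the import lists the offending lines, so that they can be reported or corrected first.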

The following objects were detected: