Gordon Dunsire on property chains and shortcuts in RDA/LRM

This is the first of two posts on the panel discussion “What role can RDA/RDF play in the transition to linked library data?” which took place during the 5th annual meeting of the BIBFRAME Workshop in Europe [1], and featured comments from five distinguished panelists [2]. This post is a transcription of the response from Gordon Dunsire (former chair of the RDA Steering Committee and current member of the RSC Technical Working Group [3]) to a question posed at the BIBFRAME Workshop, followed by an annotated version (annotated by Theo Gerontakos) engaging some of the areas discussed.

The question:

RDA/LRM/RDF contains elements that are essentially reified property chains in RDA, known as shortcuts, while BIBFRAME features many property chains that are not reified as specific elements – what many people call “nested” data. What are the pros and cons of property chains and reification, and do they cause interoperability problems?

Response (transcription):

The shortcut elements in RDA are essentially pre-LRM elements that map to a chain of two or more post-LRM elements. This shortcut mechanism is a feature of the LRM and the associated CIDOC Conceptual Reference Model. The shortcut elements are retained in RDA to further a goal of the 3R Project to support continuity and minimize disruption for RDA communities. In future, new shortcut elements are likely to be useful for local extensions and applications.

A shortcut element is a simplified view of a path through a complex structure (it’s a path through a complex graph). Only the start and end of the path are identified; a classic example is the “work manifested” element that ignores the intermediary expression entity (it connects a work directly to the manifestation). Such simplification benefits the local application by cloaking unnecessary complexity and reducing the burden of identity management, but this is at the expense of external applications that wish to re-use the metadata. A service that focuses on expressions, such as translations, will not find ‘work manifested’ statements very useful (since they don’t mention expressions).

RDA Toolkit therefore displays a standard note for each shortcut element to warn that the instances of intermediary entities are not identified when the element is used: a typical warning is, “The element does not identify any expression that realizes the work”. This exposes the trade-off when using a shortcut: it’s convenient for local use, but less of a contribution to the wider community.

Shortcuts are also useful for mappings between ontologies. For example, the map between RDA and LRM elements includes many mappings between a fine-grained RDA relationship element and the broadest LRM element “thing is associated with thing”. The precision of the mapping can be increased by substituting an implicit LRM shortcut as the target. Such a shortcut is not reified in the LRM itself, so the simple RDFS mapping properties such as subPropertyOf cannot be used in a map between RDA and LRM that uses shortcuts. A more complex property, such as OWL propertyChainAxiom, is required, resulting in increased overheads in using the map for the interoperability of instance data.  This also has a negative impact on reuse of data by applications that do not use Web Ontology Language properties (many applications prefer to remain with an RDFS syntax for example).

There are policy and management issues associated with property chains that are not reified. If RDA uses property chains in a map to another ontology it is effectively reifying the chain in place of the reification that has not already been made by the maintainer of the target ontology. This is a question of authority and branding: is the reification part of RDA or the other ontology? (It will certainly use an RDA namespace, for example, and be branded as such.) What happens if the other ontology changes the component properties of the chain? The RDA Steering Committee has established communication protocols with the maintainers of other relevant ontologies to alleviate the problems that can arise (not just the shortcuts of course). Such protocols can also inform the development of the target ontology itself: an externally-reified shortcut that is heavily used is an indicator of a candidate for internal reification. For example, a local RDA community may identify a useful RDA property chain, reify it as a Community Resource element, and propose it as a new RDA shortcut element with appropriate evidence of its general utility.

More generally, these are issues of identity, authority, and transparency. A community can reduce its own costs by not identifying everything in its own linked data, but that increases the costs for other communities who wish to reuse that data. Local identification is often regarded as authoritative identification, so a lack of reification may be interpreted as a quality and trust issue. Finally, that which is not identified is invisible in the Semantic Web, a form of ‘dark’ linked data.

Response with annotations (Dunsire’s text underlined):

The shortcut elements in RDA…

The RDA Toolkit [4] provides the following description of RDA shortcuts:

“A shortcut is a relationship element that directly relates two RDA entities that are indirectly related through one or more intermediary entities.
“This allows the two entities to be associated without recording any of the intermediary entities or relationships.
“Information about an intermediary entity cannot be inferred from the value of a shortcut element.
“For example, Manifestation: work manifested [http://rdaregistry.info/Elements/m/P30135] relates a manifestation and a work. It is a shortcut for:
    “1. Manifestation: expression manifested [http://rdaregistry.info/Elements/m/P30139]
    “2. Expression: work expressed [http://rdaregistry.info/Elements/e/P20231]
“There is one intermediary entity, an expression that is embodied by the manifestation and is a realization of the work.
“A value of this shortcut contains no information about the intermediary expression.”
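
The trade-off in this description can be sketched in a few lines of Python. This is an illustrative sketch, not anything prescribed by RDA: triples are plain tuples, the ex: identifiers and the two helper functions are invented for the example, and only the three property IRIs come from the Toolkit text above. It describes the same resource twice, once with the full path and once with the shortcut, and asks each description for works and for expressions.

```python
# Property IRIs quoted in the Toolkit description above.
WORK_MANIFESTED = "http://rdaregistry.info/Elements/m/P30135"
EXPRESSION_MANIFESTED = "http://rdaregistry.info/Elements/m/P30139"
WORK_EXPRESSED = "http://rdaregistry.info/Elements/e/P20231"

# Hypothetical description that records the full path.
full_path = {
    ("ex:manifestation1", EXPRESSION_MANIFESTED, "ex:expression1"),
    ("ex:expression1", WORK_EXPRESSED, "ex:work1"),
}

# The same resource described with the shortcut only.
shortcut_only = {("ex:manifestation1", WORK_MANIFESTED, "ex:work1")}


def expressions_of(manifestation, triples):
    """Expressions explicitly recorded for a manifestation."""
    return {
        o for s, p, o in triples
        if s == manifestation and p == EXPRESSION_MANIFESTED
    }


def works_of(manifestation, triples):
    """Works reachable from a manifestation, by shortcut or by full path."""
    direct = {
        o for s, p, o in triples
        if s == manifestation and p == WORK_MANIFESTED
    }
    expressions = expressions_of(manifestation, triples)
    via_path = {
        o for s, p, o in triples
        if s in expressions and p == WORK_EXPRESSED
    }
    return direct | via_path


# Both descriptions lead to the same work...
print(works_of("ex:manifestation1", full_path)
      == works_of("ex:manifestation1", shortcut_only))      # True
# ...but only the full path identifies the expression.
print(expressions_of("ex:manifestation1", full_path))       # {'ex:expression1'}
print(expressions_of("ex:manifestation1", shortcut_only))   # set()
```

The empty result in the last line is the situation the Toolkit note warns about: nothing about the intermediary expression can be recovered from the shortcut statement.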

Other shortcut examples include:

Manifestation: name of producer, http://rdaregistry.info/Elements/m/P30174 .
Nomen: name of producer of, http://rdaregistry.info/Elements/n/P80117 .
Manifestation: contributor family of still image, http://rdaregistry.info/Elements/m/P30430 .
Manifestation: accessibility content, http://rdaregistry.info/Elements/m/P30452 .

…are essentially pre-LRM elements…

RDA (Resource Description and Access) was initially released in June 2010. It was based on most of FRBR [5], some FRAD [6], and all of FRSAD [7].

IFLA’s Library Reference Model [8], a high-level conceptual model, was released in 2017. It consolidated the FR family of models: FRBR (1998), FRAD (2009), FRSAD (2010) and the Working group on Aggregates report (2011) [9].

RDA content was edited to comply with the LRM. The RDA/LRM beta release was in 2018.

“Pre-LRM” elements are defined in the earlier versions of RDA; at UW Libraries, we follow Gordon Dunsire in calling these earlier versions original RDA.

At UW Libraries, we follow Gordon Dunsire in calling post-LRM RDA (i.e., LRM-compliant RDA) RDA/LRM. We call its linked data version RDA/LRM/RDF.

…that map to a chain of two or more post-LRM elements. This shortcut mechanism is a feature of the LRM and the associated CIDOC Conceptual Reference Model. 

Shortcuts are introduced in LRM 4.3. “Relationships,” as follows:

“When a particular path is frequently required in a particular application, it can be implemented as a single relationship which serves as a shortcut for the more developed path. The intermediate node(s) or entities become implicit. One shortcut is sufficiently important that it is declared in the model:
    “(LRM-R15) NOMEN ‘is equivalent to’ NOMEN is the same as the following pair of relationships:
    “(LRM-R13i) NOMEN1 ‘is appellation of’ RES +
    “(LRM-R13) RES ‘has appellation’ NOMEN2
“The entity subclass/superclass structure (the ‘isA’ hierarchy) [10] can also be used in a path to restrict the domain or range entities in a relationship. The pair of statements:
    “(isA) PERSON isA AGENT +
    “(LRM-R5i) AGENT ‘created’ WORK
“imply the shortcut relationship:
    “PERSON ‘created’ WORK
“This latter specific relationship can be implemented directly if it is considered desirable.”

The CIDOC Conceptual Reference Model (CIDOC-CRM) [11] is an ontology for museum and cultural heritage documentation. It became an international standard in 2014 (ISO 21127:2014). In 2006, its editors began harmonizing CIDOC-CRM with FRBR, resulting in FRBRoo (an object-oriented definition of FRBR). In the LRM, section 2.4, “Relationships to Other Models,” we find the following:

“In the same time-period as the IFLA Library Reference Model was being developed, a parallel process was taking place in the object-oriented definition of FRBR. FRBROO version 1.0 (first published in 2009) expressed the original FRBR model as an extension of the CIDOC Conceptual Reference Model (CIDOC CRM) for museum information. It was expanded to include the entities, attributes and relationships declared in FRAD and FRSAD, resulting in FRBROO version 2.4 (approved in 2016). The modelling exercise behind that expansion informed the work of consolidation being undertaken in the entity-relationship formalism of the model, but did not predetermine any of the decisions taken in the definition of the IFLA LRM model. IFLA LRM aims to be a very general high-level model; it includes less detail compared to FRBROO, which seeks to be comparable in terms of generality with CIDOC CRM.”

In the “Terminology” section of the CIDOC-CRM Introduction, we find the following description of a shortcut:

“A shortcut is a formally defined single property that represents a deduction or join of a data path in the CIDOC CRM. The scope notes of all properties characterized as shortcuts describe in words the equivalent deduction. Shortcuts are introduced for the cases where common documentation practice refers only to the deduction rather than to the fully developed path. For example, museums often only record the dimension of an object without documenting the Measurement that observed it. The CIDOC CRM declares shortcuts explicitly as single properties in order to allow the user to describe cases in which he has less detailed knowledge than the full data path would need to be described. For each shortcut, the CIDOC CRM contains in its schema the properties of the full data path explaining the shortcut.”

The shortcut elements are retained in RDA to further a goal of the 3R Project…

The 3R Project — the RDA Toolkit Restructure and Redesign Project — was a project to develop a responsive design for the RDA Toolkit interface, and develop a new infrastructure for maintaining and publishing Toolkit releases. The LRM component was added to the initial proposal and adopted as part of the 3R Project. The new RDA Toolkit was a rebuild and included the new RDA/LRM entities and properties. It was completed in 2019, and replaced the previous RDA Toolkit at the end of 2020. 

…to support continuity and minimize disruption for RDA communities. In future, new shortcut elements are likely to be useful for local extensions and applications.

The RDA community is anticipating a proliferation of extensions and application profiles from those who implement RDA/LRM. “New shortcut elements” are also anticipated; with some exceptions, they are expected to be community-driven, presented to RSC and supported by evidence of their utility. [12]

A shortcut element is a simplified view of a path through a complex structure (it’s a path through a complex graph). Only the start and end of the path are identified; a classic example is the “work manifested” element that ignores the intermediary expression entity (it connects a work directly to the manifestation). Such simplification benefits the local application by cloaking unnecessary complexity and reducing the burden of identity management, but this is at the expense of external applications that wish to re-use the metadata. A service that focuses on expressions, such as translations, will not find ‘work manifested’ statements very useful (since they don’t mention expressions).

RDA Toolkit therefore displays a standard note for each shortcut element to warn that the instances of intermediary entities are not identified when the element is used: a typical warning is, “The element does not identify any expression that realizes the work” …

Here is a screenshot of the RDA Toolkit entry for http://rdaregistry.info/Elements/m/P30135 (work manifested), captured 2021-12-08. This is only a fragment of the element description; it is the very beginning of that description:

[Screenshot: beginning of the RDA Toolkit entry for “work manifested”]

…This exposes the trade-off when using a shortcut: it’s convenient for local use, but less of a contribution to the wider community.

Shortcuts are also useful for mappings between ontologies. For example, the map between RDA and LRM elements…

This map actually exists and is available at the RDA Registry; see http://www.rdaregistry.info/Maps/.

The RDA Development Team offers (for example, at the RDA Registry) both maps and alignments between RDA and other vocabularies.

We find the following descriptions in the RDA Registry:

“A map is a set of RDF triples representing the semantic relationship between two element sets or value vocabularies.”
“We use the term ‘map’ to refer to a set of mappings. We use the term ‘mapping’ to refer to a single relationship between two classes, properties, or concepts taken from different element sets and value vocabularies.”
“Alignments are the bases of maps. A map is a set of RDF triples representing the semantic relationship between two element sets or value vocabularies.”
“Alignments are given in the form of tables, usually in a comma separated variable file, containing general relationships that may ignore the precise semantics embedded in the vocabularies.”

…includes many mappings between a fine-grained RDA relationship element and the broadest LRM element “thing is associated with thing.”

For example, the mapping for the RDA manifestation element is-reference-source-of http://rdaregistry.info/Elements/m/P30457:   

        rdam:P30457    rdfs:subPropertyOf    lrmer:R1 .

That is, in the mapping, RDA:Manifestation:isReferenceSourceOf is declared a sub-property of LRM’s property “Res is associated with Res.”
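
What a consumer gains from such a mapping can be sketched as follows. The sketch assumes triples held as plain Python tuples; the ex: identifiers and the function name are invented for illustration, and only rdam:P30457, rdfs:subPropertyOf, and lrmer:R1 come from the mapping above. It applies the standard RDFS rule that a statement made with a sub-property also holds for its super-property.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"

# The single mapping quoted above.
rda_to_lrm_map = {("rdam:P30457", SUBPROPERTY_OF, "lrmer:R1")}

# Hypothetical RDA instance data.
instance_data = {("ex:manifestation1", "rdam:P30457", "ex:resource2")}


def apply_subproperty_rule(data, mapping):
    """Add (s, super, o) for each (s, sub, o) where sub rdfs:subPropertyOf super.

    One pass only; a full RDFS reasoner would also follow chains of
    sub-property declarations.
    """
    supers = {}
    for sub, predicate, sup in mapping:
        if predicate == SUBPROPERTY_OF:
            supers.setdefault(sub, set()).add(sup)
    inferred = set(data)
    for s, p, o in data:
        for sup in supers.get(p, ()):
            inferred.add((s, sup, o))
    return inferred


lrm_view = apply_subproperty_rule(instance_data, rda_to_lrm_map)
for triple in sorted(lrm_view):
    print(triple)
```

The inferred lrmer:R1 statement is valid but says only that the two resources are associated, which is the loss of precision discussed in the text.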

The precision of the mapping can be increased by substituting an implicit LRM shortcut as the target. Such a shortcut is not reified in the LRM itself, so the simple RDFS mapping properties such as subPropertyOf cannot be used in a map between RDA and LRM that uses shortcuts.

A different but more precise mapping than

    RDA:Manifestation:hasWorkManifested –> LRM:Res:isAssociatedWithRes

…would be the following. (The mapping above targets the most general LRM property; the one below targets a specific property and does not ignore the intermediary entity.)

    RDA:Manifestation:hasExpressionManifested –> LRM:Manifestation:embodies (https://www.iflastandards.info/lrm/lrmer#R3i)

But we cannot map as follows:

    RDA:Manifestation:hasWorkManifested –> LRM:Manifestation:hasWorkManifested

…because the latter property, LRM:Manifestation:hasWorkManifested, is not enumerated, or reified, in the LRM vocabulary.

 A more complex property, such as OWL propertyChainAxiom…

    https://www.w3.org/2002/07/owl#propertyChainAxiom 

which dereferences to:

    owl:propertyChainAxiom a rdf:Property ;
         rdfs:label "propertyChainAxiom" ;
         rdfs:comment "The property that determines the n-tuple of properties that build a sub property chain of a given property." ;
         rdfs:domain owl:ObjectProperty ;
         rdfs:isDefinedBy <http://www.w3.org/2002/07/owl#> ;
         rdfs:range rdf:List .

The mapping, using this property, could be represented as follows (using English-language labels in the IRIs rather than the expected opaque identifiers, e.g. rdam:P30135):

       rdam:hasWorkManifested  owl:propertyChainAxiom
           ( lrmer:embodies  lrmer:realizes ) .

…is required, resulting in increased overheads in using the map for the interoperability of instance data.

For example, a processor that consumes the map would have to understand OWL as well as RDFS; simple RDFS processing is not enough to interpret owl:propertyChainAxiom.
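
The extra processing can be illustrated with a small sketch. It assumes, as in the labeled example above, a chain axiom stating that lrmer:embodies followed by lrmer:realizes implies rdam:hasWorkManifested; these label-based names, the ex: identifiers, and the function names are illustrative, not published IRIs or an existing API. A processor that knows only the rdfs:subPropertyOf rule has nothing to apply here, while a chain-aware processor must join statements across an intermediate node.

```python
# Chain axiom from the labeled example: implied property -> ordered chain.
chain_axioms = {
    "rdam:hasWorkManifested": ["lrmer:embodies", "lrmer:realizes"],
}

# Hypothetical LRM instance data.
lrm_data = {
    ("ex:manifestation1", "lrmer:embodies", "ex:expression1"),
    ("ex:expression1", "lrmer:realizes", "ex:work1"),
}


def follow_chain(data, start, chain):
    """Return every node reachable from start by following chain in order."""
    frontier = {start}
    for prop in chain:
        frontier = {o for s, p, o in data if p == prop and s in frontier}
    return frontier


def apply_chain_axioms(data, axioms):
    """Infer (s, implied, o) whenever s reaches o along a declared chain."""
    inferred = set()
    subjects = {s for s, _, _ in data}
    for implied_property, chain in axioms.items():
        for s in subjects:
            for o in follow_chain(data, s, chain):
                inferred.add((s, implied_property, o))
    return inferred


print(apply_chain_axioms(lrm_data, chain_axioms))
```

Even in this toy form, the chain rule needs a join over the whole dataset for each axiom, whereas the sub-property rule rewrites one statement at a time.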

This also has a negative impact on reuse of data by applications that do not use Web Ontology Language properties (many applications prefer to remain with an RDFS syntax for example).

There are policy and management issues associated with property chains that are not reified.

In case it’s not clear, “target ontology” is the ontology to which we’re mapping RDA.

The policy and management issues would include decisions regarding the processes of creation of the RDA shortcut, maintaining the map, and representing the shortcut in the target ontology either as a full chain or as an overly broad element. 

If RDA uses property chains in a map to another ontology it is effectively reifying the chain in place of the reification that has not already been made by the maintainer of the target ontology. 

This is a question of authority and branding: is the reification part of RDA or the other ontology? (It will certainly use an RDA namespace, for example, and be branded as such.)

This is a complex pair of sentences, which seems necessary as the problem is complex. What is the basis, or authority, for the reification? To begin with, in the ontology to-be-mapped (RDA), if the shortcut doesn’t exist, how will it come into being? Who decides? And what property chain does it represent? Then, in the target ontology, the reification does not exist, neither as a shortcut (which is probably not even possible) nor as a property chain; yet the reification needs to be represented in the target ontology. Keep in mind that the relation of RDA and LRM is exceptional: the RDA→LRM map was created to situate RDA within an LRM context. RDA properties are contextualized as subproperties of LRM properties. Additionally, in this case, the reification is authorized by LRM and the chain is an LRM chain. RDA (through the RDA Steering Committee) reifies and maintains the shortcut, the property chain, and the mapping, but the chain is an LRM chain.

What happens if the other ontology changes the component properties of the chain? The RDA Steering Committee has established communication protocols with the maintainers of other relevant ontologies to alleviate the problems that can arise (not just the shortcuts of course). 

Surely mapping shortcuts to property chains adds complexity to maintaining a map; for example, if a property in a property path is not explicit, the concealed property may be difficult to maintain. In any case, changes in the target ontology have serious implications for any mapping. The time for this presentation was too brief to allow elaboration on the communication protocols, but it would be interesting to know more. Many of us have discussed standards for representing mappings, application profiles, etc., but communication standards, or even best practices, for maintaining maps across ontologies would be useful to metadata professionals.
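
One maintenance task implied here can be sketched in code: checking that every property named in a mapped chain is still declared by the target ontology. The function name, the label-based property names, and the sample vocabulary snapshots below are all hypothetical; this is not a description of any RSC protocol or tool.

```python
def find_broken_chains(chain_axioms, target_properties):
    """Map each shortcut to the chain properties missing from the target vocabulary."""
    broken = {}
    for shortcut, chain in chain_axioms.items():
        missing = [prop for prop in chain if prop not in target_properties]
        if missing:
            broken[shortcut] = missing
    return broken


# Hypothetical map entry, reusing the labeled names from the earlier example.
chain_axioms = {"rdam:hasWorkManifested": ["lrmer:embodies", "lrmer:realizes"]}

# Hypothetical snapshots of the target vocabulary before and after a change.
before = {"lrmer:embodies", "lrmer:realizes"}
after = {"lrmer:embodies"}  # suppose 'realizes' was renamed or withdrawn

print(find_broken_chains(chain_axioms, before))  # {}
print(find_broken_chains(chain_axioms, after))   # {'rdam:hasWorkManifested': ['lrmer:realizes']}
```

A check like this only detects that a component has disappeared; deciding how the chain should be repaired still depends on communication between the maintainers.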

Such protocols can also inform the development of the target ontology itself: an externally-reified shortcut that is heavily used is an indicator of a candidate for internal reification.

This is just one of many reasons an ontology (or a map, or an application profile, etc.) cannot be produced, published and forgotten. It needs to be maintained for actual use.

For example, a local RDA community may identify a useful RDA property chain, reify it as a Community Resource element, and propose it as a new RDA shortcut element with appropriate evidence of its general utility.

This demonstrates the possibility of an RDA community-driven “post-legacy” shortcut. Recall in the note above [note 12] that the current shortcuts were created during the 3R project.

More generally, these are issues of identity, authority, and transparency. A community can reduce its own costs by not identifying everything in its own linked data, but that increases the costs for other communities who wish to reuse that data. Local identification is often regarded as authoritative identification, so a lack of reification may be interpreted as a quality and trust issue. Finally, that which is not identified is invisible in the Semantic Web, a form of ‘dark’ linked data.

This makes the case against the type of reification described in the presentation. Perhaps it is best to identify everything possible, including property chains, and to do so with maximum transparency, all in the interest of optimizing graph sharing.

NOTES

  1. See also RDA, BIBFRAME, and Modeling Bibliographic Relationships; for more information about the 2021 BIBFRAME Workshop in Europe see https://www.casalini.it/bfwe2021/
  2. Slides featuring brief panelist bios are available at https://bit.ly/bfwe2021rdardf
  3. http://rda-rsc.org/node/615
  4. https://www.rdatoolkit.org/ 
  5. https://repository.ifla.org/handle/123456789/830 
  6. https://repository.ifla.org/handle/123456789/757 
  7. https://repository.ifla.org/handle/123456789/835 
  8. https://repository.ifla.org/handle/123456789/40 
  9. https://www.ifla.org/wp-content/uploads/2019/05/assets/cataloguing/frbrrg/AggregatesFinalReport.pdf 
  10. Note from Gordon Dunsire, 2021-12-17: “Note that 3R decided that it was better to avoid this type of shortcut, because of the dumb-down and processing issues. Instead, the Agent relationship elements were explicitly subtyped by Agent entity subtype and ‘broken out’ into ‘creator person’, ‘creator corporate body’, ‘creator agent’, etc. I briefly discuss this in a forthcoming article on ‘Ontology in practice ….’ I will post the published article … in early January 2022.”
  11. https://cidoc-crm.org/ 
  12. This is based on the following note from Gordon Dunsire, 2021-12-17: “To clarify: Including a shortcut in RDA is an effective dumb-down for the utility of RDA metadata, and RSC would normally expect solid evidence that this benefits RDA as a whole. The legacy evidence was evaluated during 3R; future evidence should be community-driven. There are some residual 3R legacies that RSC scheduled for post-3R development, such as collection-level description, that identify ‘new’ legacy shortcuts.”
