Using RML to transform RDA to BIBFRAME

We have been developing a mapping document to convert RDA-in-RDF, created during testing of the Sinopia Linked Data Editor at the University of Washington Libraries, to BIBFRAME. To implement this mapping, we have been experimenting with the RDF Mapping Language (RML) produced by Ghent University’s IDLab. Using RML, we can transform RDA data serialized as RDF/XML into BIBFRAME data in either Turtle or N-Quads.

Our progress so far is viewable in our GitHub repository, including sample data for transformation, working mapping documents, and examples of the output we have gotten using RMLMapper. Our most extensive mapping document (workMonographMap.xml.ttl) is designed to transform data created using our monograph application profile for entities classed as an RDA Work.

We hope that our RML mapping documents can serve as examples for others who are trying to use RML, as the current specifications for RML are an unofficial draft (last updated July 2020). The examples in the specifications represent relatively simple transformations, leaving RML users to craft their own solutions for any data or transformation that is even slightly more complicated.

For example, the RML specs contain no clear example of how to generate a blank node, something our RDA-to-BIBFRAME mapping relies on heavily. Our construction for mapping blank nodes in RML is based on an example provided in a KnowledgeLinks GitHub repository. Here is an example of how we are constructing blank nodes in our RML mapping:

Data

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10223 xml:lang="en">Dostoevsky and Kant</rdaw:P10223>
  </rdf:Description>
</rdf:RDF>

RML Map

@prefix bf: <http://id.loc.gov/ontologies/bibframe/>.
@prefix ex: <http://example.org/rules/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.

ex:ExampleTriplesMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description"
  ].

  ex:ExampleTriplesMap rr:subjectMap [
    rml:reference "@about";
    rr:class bf:Work
  ].

  ex:ExampleTriplesMap rr:predicateObjectMap [
    rr:predicate bf:title;
    rr:objectMap [
      rr:parentTriplesMap ex:TitleMap
    ]
  ].

ex:TitleMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description"
  ].

  ex:TitleMap rr:subjectMap [
    rr:termType rr:BlankNode;
    rr:class bf:Title
  ].

  ex:TitleMap rr:predicateObjectMap [
    rr:predicate bf:mainTitle;
    rr:objectMap [
      rml:reference "P10223";
      rr:termType rr:Literal
    ]
  ].

Output

@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .

<http://example.org/ExampleRecord> a bf:Work;
  bf:title _:0 .

_:0 a bf:Title;
  bf:mainTitle "Dostoevsky and Kant" .
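
To make the evaluation order concrete, here is a minimal Python sketch (not RML, and not how RMLMapper is implemented) that mimics what the two triples maps above do. The iterator selects each rdf:Description, the subject reference reads rdf:about, and the parent triples map contributes one blank node per iteration. The function name work_title_triples and the blank node labelling scheme are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDAW_NS = "http://rdaregistry.info/Elements/w/"

# Same record as the Data listing above (XML declaration omitted).
DATA = """<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10223 xml:lang="en">Dostoevsky and Kant</rdaw:P10223>
  </rdf:Description>
</rdf:RDF>
"""


def work_title_triples(xml_text):
    """Return (subject, predicate, object) tuples, one blank node per record."""
    root = ET.fromstring(xml_text)
    triples = []
    # Plays the role of rml:iterator "/RDF/Description".
    for n, desc in enumerate(root.findall(f"{{{RDF_NS}}}Description")):
        # Plays the role of rml:reference "@about" in the subject map.
        work = "<" + desc.get(f"{{{RDF_NS}}}about") + ">"
        # Hypothetical labelling: one blank node per iteration (_:0, _:1, ...).
        node = f"_:{n}"
        triples.append((work, "a", "bf:Work"))
        triples.append((work, "bf:title", node))
        triples.append((node, "a", "bf:Title"))
        # Plays the role of rml:reference "P10223" in the object map.
        for title in desc.findall(f"{{{RDAW_NS}}}P10223"):
            triples.append((node, "bf:mainTitle", f'"{title.text}"'))
    return triples


for s, p, o in work_title_triples(DATA):
    print(s, p, o, ".")
```

The sketch yields the same four triples as the Output listing above, which can be a handy cross-check when a mapping produces more or fewer blank nodes than expected.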

It would also be useful if the RML team added examples that demonstrate the use of XPath functions either in the RML specs, or in their tutorial for writing RML for an XML file. Using RML has required us to become more familiar with XPath in order to properly utilize the rml:reference and rml:iterator properties.

For example, in our data, many RDA properties are used with both IRI and literal values. When mapping from RDA to BF, we need to be able to handle these values differently depending on whether the value is an IRI or literal. Here is an example:

Data

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:bf="http://id.loc.gov/ontologies/bibframe/"
  xmlns:bflc="http://id.loc.gov/ontologies/bflc/"
  xmlns:uwx="https://doi.org/10.6069/uwlib.55.d.4#"
  xmlns:ns5="http://sinopia.io/vocabulary/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10256>Dostoyevsky, Fyodor, 1821-1881--Ethics</rdaw:P10256>
    <rdaw:P10256 rdf:resource="http://id.loc.gov/authorities/subjects/sh85077524"/>
  </rdf:Description>
</rdf:RDF>

RML Map

@prefix bf: <http://id.loc.gov/ontologies/bibframe/>.
@prefix ex: <http://example.org/rules/>.
@prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.

ex:ExampleTriplesMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description"
  ].

  ex:ExampleTriplesMap rr:subjectMap [
    rml:reference "@about";
    rr:class bf:Work
  ].

  # for IRIs
  ex:ExampleTriplesMap rr:predicateObjectMap [
    rr:predicate bf:subject;
    rr:objectMap [
      rml:reference "P10256/@resource";
      rr:termType rr:IRI
    ]
  ].

  # for literals
  ex:ExampleTriplesMap rr:predicateObjectMap [
    rr:predicate bf:subject;
    rr:objectMap [
      rr:parentTriplesMap ex:SubjectLiteralMap
    ]
  ].

ex:SubjectLiteralMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description[P10256[not(@resource)]]"
  ].

  ex:SubjectLiteralMap rr:subjectMap [
    rr:termType rr:BlankNode;
    rr:class madsrdf:Authority
  ].

  ex:SubjectLiteralMap rr:predicateObjectMap [
    rr:predicate madsrdf:authoritativeLabel;
    rr:objectMap [
      rml:reference "P10256[not(@resource)]";
      rr:termType rr:Literal
    ]
  ].

Output

@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#> .

<http://example.org/ExampleRecord> a bf:Work;
  bf:subject <http://id.loc.gov/authorities/subjects/sh85077524>, _:0 .

_:0 a madsrdf:Authority;
  madsrdf:authoritativeLabel "Dostoyevsky, Fyodor, 1821-1881--Ethics" .

If we do not include [not(@resource)] in the XPath expressions in ex:SubjectLiteralMap, RML may generate a blank node classed as a madsrdf:Authority even when there is no literal value to plug in, or it may output empty quotes where it expects a literal value:

Output

@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#> .

<http://example.org/ExampleRecord> a bf:Work;
  bf:subject <http://id.loc.gov/authorities/subjects/sh85077524>, _:0 .

_:0 a madsrdf:Authority;
  madsrdf:authoritativeLabel "", "Dostoyevsky, Fyodor, 1821-1881--Ethics" .
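
The effect of the [not(@resource)] predicate can be imitated outside RML. The following is a minimal Python sketch using only the standard library; it is an illustration of the selection logic, not a description of RMLMapper's internals. The function name split_subjects and the filter_literals flag are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDAW_NS = "http://rdaregistry.info/Elements/w/"
RESOURCE = f"{{{RDF_NS}}}resource"

# Trimmed copy of the Data listing above (unused namespaces omitted).
DATA = """<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10256>Dostoyevsky, Fyodor, 1821-1881--Ethics</rdaw:P10256>
    <rdaw:P10256 rdf:resource="http://id.loc.gov/authorities/subjects/sh85077524"/>
  </rdf:Description>
</rdf:RDF>
"""


def split_subjects(xml_text, filter_literals=True):
    """Return (iris, literals) for every rdaw:P10256 element in the data."""
    root = ET.fromstring(xml_text)
    iris, literals = [], []
    for elem in root.iter(f"{{{RDAW_NS}}}P10256"):
        resource = elem.get(RESOURCE)
        if resource is not None:
            # Comparable to the reference "P10256/@resource".
            iris.append(resource)
        if resource is None or not filter_literals:
            # With the filter: comparable to "P10256[not(@resource)]".
            # Without it: comparable to plain "P10256", which also picks up
            # the empty text of the element that only carries rdf:resource.
            literals.append(elem.text or "")
    return iris, literals


print(split_subjects(DATA))
print(split_subjects(DATA, filter_literals=False))
```

With filter_literals=False the literal list gains an empty string, which corresponds to the stray "" in the second Output listing above.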

In looking closely at the RML specs, we also found that not all classes and properties available for use in RML are documented.

In creating our mapping documents, we stumbled upon the property rml:languageMap. The RML specs give instructions for the property rr:language, which is used to apply a language tag to a literal value. It was important to us to preserve our language tags in our transformation from RDA to BF, but rr:language was too limiting for our purposes. Its value must be a fixed language tag (e.g. “en-us”) that is applied to every literal generated by that object map. However, our RDA data contains literals tagged with many different languages, and we could not use this property because there was no one language we could accurately tag all literals with.

We were about to raise this issue in the RMLMapper GitHub repository to see if the team was working on a solution, but first we tried creating a triples map with a property that is not in the RML specs but seemed like it should have been (in other words, we guessed): rml:languageMap. Surprisingly, it worked! Here is an example of it in action:

Data

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10331 xml:lang="en">Dostoyevsky, Fyodor, 1821-1881. Idiot</rdaw:P10331>
    <rdaw:P10332 xml:lang="ru">Достоевский, Федор, 1821-1881. Идиот</rdaw:P10332>
  </rdf:Description>
</rdf:RDF>

RML Map

@prefix bf: <http://id.loc.gov/ontologies/bibframe/>.
@prefix ex: <http://example.org/rules/>.
@prefix rdaw: <http://rdaregistry.info/Elements/w/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.

ex:WorkMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/example.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description"
  ].

  ex:WorkMap rr:subjectMap [
    rml:reference "@about";
    rr:class bf:Work
  ].

  ex:WorkMap rr:predicateObjectMap [
    rr:predicate bf:identifiedBy;
    rr:objectMap [
      rr:parentTriplesMap ex:AuthorizedAccessPointMap
    ]
  ].

  ex:WorkMap rr:predicateObjectMap [
    rr:predicate bf:identifiedBy;
    rr:objectMap [
      rr:parentTriplesMap ex:VariantAccessPointMap
    ]
  ].

ex:AuthorizedAccessPointMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/example.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description[P10331]"
  ].

  ex:AuthorizedAccessPointMap rr:subjectMap [
    rr:termType rr:BlankNode;
    rr:class bf:Identifier
  ].

  # for literals with a language tag 
  ex:AuthorizedAccessPointMap rr:predicateObjectMap [
    rr:predicate rdf:value;
    rr:objectMap [
      rml:reference "P10331[@lang]";
      rr:termType rr:Literal;
      rml:languageMap [
        rml:reference "P10331/@lang"
      ]
    ]
  ].

  # for literals without a language tag 
  ex:AuthorizedAccessPointMap rr:predicateObjectMap [
    rr:predicate rdf:value;
    rr:objectMap [
      rml:reference "P10331[not(@lang)]";
      rr:termType rr:Literal
    ]
  ].

ex:VariantAccessPointMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "/example.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description[P10332]"
  ].

  ex:VariantAccessPointMap rr:subjectMap [
    rr:termType rr:BlankNode;
    rr:class bf:Identifier
  ].

  # for literals with a language tag
  ex:VariantAccessPointMap rr:predicateObjectMap [
    rr:predicate rdf:value;
    rr:objectMap [
      rml:reference "P10332[@lang]";
      rr:termType rr:Literal;
      rml:languageMap [
        rml:reference "P10332/@lang"
      ]
    ]
  ].

  # for literals without a language tag
  ex:VariantAccessPointMap rr:predicateObjectMap [
    rr:predicate rdf:value;
    rr:objectMap [
      rml:reference "P10332[not(@lang)]";
      rr:termType rr:Literal
    ]
  ].

Output

@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://example.org/ExampleRecord> a bf:Work;
  bf:identifiedBy _:0 .

_:0 a bf:Identifier;
  rdf:value "Dostoyevsky, Fyodor, 1821-1881. Idiot"@en .

<http://example.org/ExampleRecord> bf:identifiedBy _:1 .

_:1 a bf:Identifier;
  rdf:value "Достоевский, Федор, 1821-1881. Идиот"@ru .

Here, RML is able to correctly label these two identifiers using the language tags present in the original RDF/XML.

However, we haven’t perfected our mappings using rml:languageMap yet, particularly where one property has multiple values, each in a different language. Here is an example of that:

Data

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10086 xml:lang="pt">Lórax (Beber)</rdaw:P10086>
    <rdaw:P10086 xml:lang="af">Loraks</rdaw:P10086>
    <rdaw:P10086 xml:lang="ru">Driad</rdaw:P10086>
    <rdaw:P10086 xml:lang="es">Lórax</rdaw:P10086>
  </rdf:Description>
</rdf:RDF>

RML Map

@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix ex: <http://example.org/rules/>.
@prefix bf: <http://id.loc.gov/ontologies/bibframe/>.
@prefix rdaw: <http://rdaregistry.info/Elements/w/>.

ex:WorkMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description"
  ].

  ex:WorkMap rr:subjectMap [
    rml:reference "@about";
    rr:class bf:Work
  ].

  ex:WorkMap rr:predicateObjectMap [
    rr:predicate bf:title;
    rr:objectMap [
      rr:parentTriplesMap ex:VariantTitleMap
    ]
  ].

ex:VariantTitleMap a rr:TriplesMap;
  rml:logicalSource [
    rml:source "exampleData.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/RDF/Description[P10086]"
  ].

  ex:VariantTitleMap rr:subjectMap [
    rr:termType rr:BlankNode;
    rr:class bf:VariantTitle
  ].

  ex:VariantTitleMap rr:predicateObjectMap [
    rr:predicate bf:mainTitle;
    rr:objectMap [
      rml:reference "P10086";
      rr:termType rr:Literal;
      rml:languageMap [
        rml:reference "P10086/@lang"
      ]
    ]
  ].

Output

<http://example.org/ExampleRecord>
  bf:title _:0 .

_:0 a bf:VariantTitle;
  bf:mainTitle "Driad"@pt, "Loraks"@pt, "Lórax"@pt, "Lórax (Beber)"@pt .

Here we see rml:languageMap both working and failing. It does go into the data and pull out a value of xml:lang for the property rdaw:P10086 (has variant title of work). However, it seems to use only the lang attribute of the first P10086 element, rather than looking up the P10086/@lang value for each repeated instance of the property, so all four titles end up tagged "pt". This issue has been raised with the developers, and we are hoping to come up with a solution.
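
The difference between the behaviour we observed and the behaviour we want can be sketched in a few lines of Python. This is only our guess at what is happening, expressed with the standard library; it is not RMLMapper code, and the function names titles_first_lang and titles_own_lang are hypothetical.

```python
import xml.etree.ElementTree as ET

RDAW_NS = "http://rdaregistry.info/Elements/w/"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Same record as the Data listing above (XML declaration omitted).
DATA = """<rdf:RDF
  xmlns:rdaw="http://rdaregistry.info/Elements/w/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/ExampleRecord">
    <rdaw:P10086 xml:lang="pt">Lórax (Beber)</rdaw:P10086>
    <rdaw:P10086 xml:lang="af">Loraks</rdaw:P10086>
    <rdaw:P10086 xml:lang="ru">Driad</rdaw:P10086>
    <rdaw:P10086 xml:lang="es">Lórax</rdaw:P10086>
  </rdf:Description>
</rdf:RDF>
"""


def _titles(xml_text):
    """All rdaw:P10086 elements in document order."""
    return ET.fromstring(xml_text).findall(f".//{{{RDAW_NS}}}P10086")


def titles_first_lang(xml_text):
    """Observed behaviour: every title gets the first language found."""
    titles = _titles(xml_text)
    first = titles[0].get(XML_LANG) if titles else None
    return [(t.text, first) for t in titles]


def titles_own_lang(xml_text):
    """Desired behaviour: each title keeps its own xml:lang value."""
    return [(t.text, t.get(XML_LANG)) for t in _titles(xml_text)]


print(titles_first_lang(DATA))
print(titles_own_lang(DATA))
```

The second function pairs each value with the attribute of the same element, which is the pairing we would like the language map to make for repeated properties.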

Still, the repeated-property case shows why it would be useful to have not only rml:languageMap documented in the RML specs, but also examples that demonstrate how it can be used. It also leads us to wonder whether there is yet more functionality within RML that we aren’t aware of, for example other RML properties not yet listed in the index within the RML specs.

One thought on “Using RML to transform RDA to BIBFRAME”

  1. Thank you very much for the helpful examples of mapping. I just wanted to mention that to get them to work I need to add prefixes. Unfortunately, uw.edu does not let me add a fixed source code.