ProvToolbox Tutorial 4: Templates for Provenance (part 2)

1. Introduction

This blog post is the second part of the introduction to provenance templates.

The tutorial is standalone and a zip archive can be downloaded from the following URL: http://search.maven.org/remotecontent?filepath=org/openprovenance/prov/ProvToolbox-Tutorial4/0.7.0/ProvToolbox-Tutorial4-0.7.0-src.zip. The tutorial can also be found on the ProvToolbox project on GitHub.

The tutorial assumes that provconvert has been installed and is available in the execution path. (See http://lucmoreau.github.io/ProvToolbox/ for installation instructions.) The tutorial relies on a Makefile and can simply be run by calling:

make do.all

2. Further Examples of Templates

We continue our introduction to PROV-Template with some further examples.

2.1 Another Template: Quotes, Authors and Their Organisations

Our initial template could deal with the attribution of quotes to authors. It is also useful to be able to talk about authors’ organizations. PROV offers the notion of delegation, by which we can relate an author agent to the organization agent on behalf of which they acted. This is represented with the following template, where the link marked “del” represents the delegation association.

Template for Quote Attribution and Representation of an Organization

The only differences in the template occur in line 11, where a new variable var:institution is introduced for the organization, and in line 12, where a delegation association is expressed with the term actedOnBehalfOf.

document
  prefix var <http://openprovenance.org/var#>
  prefix vargen <http://openprovenance.org/vargen#>
  prefix tmpl <http://openprovenance.org/tmpl#>
  prefix foaf <http://xmlns.com/foaf/0.1/>
  
  bundle vargen:bundleId
    entity(var:quote, [prov:value='var:value'])
    entity(var:author, [prov:type='prov:Person', foaf:name='var:name'])
    wasAttributedTo(var:quote,var:author)
    entity(var:institution)
    actedOnBehalfOf(var:author,var:institution,-)
  endBundle

endDocument

2.2 Template Instantiation: Cartesian Products

We are now ready to instantiate the template. To the previous bindings, we want to add two possible values for the variable var:institution. For the purpose of this example, we also consider two quotes ex:quote1 and ex:quote2 authored by Paul and Luc.

Variable         Values
var:quote        ex:quote1
                 ex:quote2
var:author       http://orcid.org/0000-0002-3494-120X
                 http://orcid.org/0000-0003-0183-6910
var:name         “Luc Moreau”
                 “Paul Groth”
var:institution  http://www.soton.ac.uk/
                 http://labs.elsevier.com/

The resulting expansion is displayed in the following figure. We see that for the expression actedOnBehalfOf(var:author,var:institution,-), the template expansion algorithm considers all possible values of var:author and all possible values of var:institution, and creates an instantiated association for each possible author/institution pair. In other words, by default, the template expansion considers the Cartesian product of the sets of values for the different variables in a relation.

Template Instantiation with Organisations (Cartesian Product)
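
The default behaviour described above can be illustrated with a minimal Python sketch. It is not ProvToolbox's implementation: the helper expand_cartesian and the plain string substitution are assumptions made for illustration only, and the printed statements are not meant to be valid PROV-N. The sketch simply enumerates every author/institution pair for the delegation relation.

```python
from itertools import product

# Bindings taken from the tutorial: each variable maps to a list of values.
bindings = {
    "var:author": [
        "http://orcid.org/0000-0002-3494-120X",
        "http://orcid.org/0000-0003-0183-6910",
    ],
    "var:institution": [
        "http://www.soton.ac.uk/",
        "http://labs.elsevier.com/",
    ],
}


def expand_cartesian(statement, variables, bindings):
    """Hypothetical helper: one instance per combination of variable values."""
    instances = []
    for combo in product(*(bindings[v] for v in variables)):
        instance = statement
        # Naive textual substitution, adequate for this small illustration.
        for variable, value in zip(variables, combo):
            instance = instance.replace(variable, value)
        instances.append(instance)
    return instances


delegations = expand_cartesian(
    "actedOnBehalfOf(var:author,var:institution,-)",
    ["var:author", "var:institution"],
    bindings,
)
for delegation in delegations:
    print(delegation)
```

With two authors and two institutions, the sketch yields four delegation statements, one for each author/institution pair.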

2.3 Template With Variable Synchronization

The expansion is not quite right. While we are fine with each quote being attributed to both Paul and Luc, the former is affiliated with Elsevier, whereas the latter is affiliated with Southampton. Thus, we wish to constrain the instantiation of actedOnBehalfOf(var:author,var:institution,-), so that we don’t form the Cartesian product of all possibilities. Instead, we want the values of var:author and var:institution to be enumerated in lockstep; in other words, a change of value for one should be synchronized with a change of value for the other.

This requires a simple modification to the template. We see here that an attribute is added to the var:institution entity.

Template With Variable Synchronization

The change is made explicit in line 12, where the attribute [tmpl:linked='var:author'] synchronizes the value of var:institution with that of var:author. Note that this synchronization applies to the whole template, not just to the relation that occurs in line 13.

document

  prefix var <http://openprovenance.org/var#>
  prefix vargen <http://openprovenance.org/vargen#>
  prefix tmpl <http://openprovenance.org/tmpl#>
  prefix foaf <http://xmlns.com/foaf/0.1/>
  
  bundle vargen:bundleId
    entity(var:quote, [prov:value='var:value'])
    entity(var:author, [prov:type='prov:Person', foaf:name='var:name'])
    wasAttributedTo(var:quote,var:author)
    entity(var:institution, [tmpl:linked='var:author'])
    actedOnBehalfOf(var:author,var:institution,-)
  endBundle

endDocument

2.4 Template Instantiation: Single Institution per Author

We are now ready to instantiate the template again. The resulting expanded document is shown in the figure below. We see that each author’s affiliation is now properly expressed.

Template Instantiation: Single institution per Author
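
The effect of linking var:institution to var:author can be sketched in Python as follows. Again, this is only an illustration under stated assumptions, not ProvToolbox's implementation: the helper expand_linked is hypothetical, and it assumes that linked variables carry the same number of values, with values at the same position belonging together.

```python
# Bindings taken from the tutorial; values at the same index belong together.
bindings = {
    "var:author": [
        "http://orcid.org/0000-0002-3494-120X",
        "http://orcid.org/0000-0003-0183-6910",
    ],
    "var:institution": [
        "http://www.soton.ac.uk/",
        "http://labs.elsevier.com/",
    ],
}


def expand_linked(statement, variables, bindings):
    """Hypothetical helper: enumerate linked variables in lockstep."""
    columns = [bindings[v] for v in variables]
    # Assumption of this sketch: linked variables have equally many values.
    if len({len(column) for column in columns}) != 1:
        raise ValueError("linked variables need the same number of values")
    instances = []
    for combo in zip(*columns):
        instance = statement
        # Naive textual substitution, adequate for this small illustration.
        for variable, value in zip(variables, combo):
            instance = instance.replace(variable, value)
        instances.append(instance)
    return instances


linked_delegations = expand_linked(
    "actedOnBehalfOf(var:author,var:institution,-)",
    ["var:author", "var:institution"],
    bindings,
)
for delegation in linked_delegations:
    print(delegation)
```

Using zip rather than a Cartesian product means that the first author is paired only with the first institution and the second author only with the second, so the sketch yields two delegation statements instead of four.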

3. Conclusions

As we said in Part 1, PROV-Template is easy to work with: it just requires provconvert to be installed. This notion of template supports the idea of a provenance template management system, in which all the provenance is generated by means of templates, which can themselves evolve over time. By decoupling the generation of provenance from the logging of values, we observed a number of benefits:

  • It allows us to fine-tune the provenance, independently of the application.
  • It permits us to keep the code to generate the provenance separate from the application itself: this is very convenient when the application programmer is not familiar with provenance, or there is no toolkit available to generate provenance from the application.
  • It allows us to adopt a more conceptual approach to provenance, thinking of “provenance schemas” rather than instances.

For feedback or comments on this tutorial, please raise an issue on the ProvToolbox issue tracker at https://github.com/lucmoreau/ProvToolbox/issues/.

Thanks to co-authors Dong and Danius. Heather has been using it in Smart Society’s SmartShare application.
