This blog post is the second part of the introduction to provenance templates.
The tutorial is standalone and a zip archive can be downloaded from the following URL: http://search.maven.org/remotecontent?filepath=org/openprovenance/prov/ProvToolbox-Tutorial4/0.7.0/ProvToolbox-Tutorial4-0.7.0-src.zip. The tutorial can also be found on the ProvToolbox project on GitHub.
The tutorial assumes that provconvert has been installed and is available in the execution path. (See http://lucmoreau.github.io/ProvToolbox/ for installation instructions.) The tutorial relies on a Makefile and can simply be run by calling:
2. Further Examples of Templates
We continue our introduction to PROV-Template with some further examples.
2.1 Another Template: Quotes, Authors and Their Organisations
Our initial template could deal with the attribution of quotes to authors. It is also useful to be able to talk about authors’ organizations. PROV offers the notion of delegation, by which we can relate an author agent to the organization agent on behalf of which they acted. This is represented with the following template, where the link marked “del” represents the delegation association.
The only differences in the template are the declaration of a new variable var:institution, introduced for the organization, and the statement that follows it, where a delegation association is expressed with the term actedOnBehalfOf.
document
  prefix var <http://openprovenance.org/var#>
  prefix vargen <http://openprovenance.org/vargen#>
  prefix tmpl <http://openprovenance.org/tmpl#>
  prefix foaf <http://xmlns.com/foaf/0.1/>

  bundle vargen:bundleId
    entity(var:quote, [prov:value='var:value'])
    entity(var:author, [prov:type='prov:Person', foaf:name='var:name'])
    wasAttributedTo(var:quote,var:author)
    entity(var:institution)
    actedOnBehalfOf(var:author,var:institution,-)
  endBundle
endDocument
2.2 Template Instantiation: Cartesian Products
We are now ready to instantiate the template. To the previous bindings, we want to add two possible values for the variable var:institution. For the purpose of this example, we also consider two quotes, one of them ex:quote2, authored by Paul and Luc.
The resulting expansion is displayed in the following figure. We see that for the expression actedOnBehalfOf(var:author,var:institution,-), the template expansion algorithm considers all possible values of var:author and all possible values of var:institution, and creates an instantiated association for each possible author/institution pair. In other words, by default, the template expansion considers the cartesian product of the sets of values for the different variables in a relation.
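To make the default behaviour concrete, here is a minimal Python sketch of a cartesian-product expansion for this single relation. It only illustrates the idea and is not how provconvert is implemented; the qualified names (ex:Paul, ex:Luc, ex:Elsevier, ex:Southampton) and the helper expand_cartesian are assumptions made up for this example.

```python
from itertools import product

# Hypothetical binding values; these qualified names are illustrative only.
authors = ["ex:Paul", "ex:Luc"]
institutions = ["ex:Elsevier", "ex:Southampton"]


def expand_cartesian(authors, institutions):
    """Return one actedOnBehalfOf statement per (author, institution) pair."""
    return [
        f"actedOnBehalfOf({author},{institution},-)"
        for author, institution in product(authors, institutions)
    ]


statements = expand_cartesian(authors, institutions)
for statement in statements:
    print(statement)
```

With two authors and two institutions, the sketch produces four delegation statements, including pairings such as Paul with Southampton that we may not want.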
2.3 Template With Variable Synchronization
The expansion is not quite right. While we are fine with each quote being attributed to both Paul and Luc, the former is affiliated with Elsevier, whereas the latter is affiliated with Southampton. Thus, we wish to constrain the instantiation of actedOnBehalfOf(var:author,var:institution,-), so that we do not form the cartesian product of all possibilities. Instead, we want the values of var:author and var:institution to be enumerated in lockstep; in other words, a change of value for one should be synchronized with a change of value for the other.
This requires a simple modification to the template: an attribute is added to the declaration of var:institution. The attribute [tmpl:linked='var:author'] explicitly synchronizes the value of var:institution with that of var:author. Note that this synchronization holds for the whole template, not just for the actedOnBehalfOf relation.
document
  prefix var <http://openprovenance.org/var#>
  prefix vargen <http://openprovenance.org/vargen#>
  prefix tmpl <http://openprovenance.org/tmpl#>
  prefix foaf <http://xmlns.com/foaf/0.1/>

  bundle vargen:bundleId
    entity(var:quote, [prov:value='var:value'])
    entity(var:author, [prov:type='prov:Person', foaf:name='var:name'])
    wasAttributedTo(var:quote,var:author)
    entity(var:institution, [tmpl:linked='var:author'])
    actedOnBehalfOf(var:author,var:institution,-)
  endBundle
endDocument
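The effect of linking two variables can be pictured as pairing their values position by position instead of combining every value with every other. The Python sketch below illustrates that reading; it is not the ProvToolbox implementation, and the qualified names and the helper expand_linked are assumptions introduced for this example.

```python
# Hypothetical binding values; these qualified names are illustrative only.
authors = ["ex:Paul", "ex:Luc"]
institutions = ["ex:Elsevier", "ex:Southampton"]


def expand_linked(authors, institutions):
    """Pair the i-th author with the i-th institution (lockstep enumeration)."""
    if len(authors) != len(institutions):
        # This sketch assumes linked variables carry the same number of values.
        raise ValueError("linked variables must have the same number of values")
    return [
        f"actedOnBehalfOf({author},{institution},-)"
        for author, institution in zip(authors, institutions)
    ]


linked_statements = expand_linked(authors, institutions)
for statement in linked_statements:
    print(statement)
```

Only two delegation statements remain: Paul acting on behalf of Elsevier, and Luc on behalf of Southampton.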
2.4 Template Instantiation: Single Institution per Author
We are again ready to instantiate the template. The resulting expanded document is shown in the figure below. We see that affiliation is now properly expressed.
As we said in Part 1, PROV-Template is easy to work with: it just requires provconvert to be installed. This notion of template supports the idea of a provenance template management system, in which all the provenance is generated by means of templates, which can also evolve over time. By decoupling the generation of provenance from the logging of values, we observed a number of benefits:
- It allows us to fine tune the provenance, independently of the application.
- It permits us to keep the code to generate the provenance separate from the application itself: this is very convenient when the application programmer is not familiar with provenance, or there is no toolkit available to generate provenance from the application.
- It allows us to adopt a more conceptual approach to provenance, thinking of “provenance schemas” rather than instances.
For feedback or comments on this tutorial, please raise an issue on the ProvToolbox issue tracker at https://github.com/lucmoreau/ProvToolbox/issues/.
Thanks to co-authors Dong and Danius. Heather has been using it in Smart Society’s SmartShare application.